AI For Students Good Or Bad

AI For Students Good Or Bad — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Line detection

    Line detection

    In image processing, line detection is an algorithm that takes a collection of n edge points and finds all the lines on which these edge points lie. The most popular line detectors are the Hough transform and convolution-based techniques. == Hough transform == The Hough transform can be used to detect lines and the output is a parametric description of the lines in an image, for example ρ = r cos(θ) + c sin(θ). If there is a line in a row and column based image space, it can be defined ρ, the distance from the origin to the line along a perpendicular to the line, and θ, the angle of the perpendicular projection from the origin to the line measured in degrees clockwise from the positive row axis. Therefore, a line in the image corresponds to a point in the Hough space. The Hough space for lines has therefore these two dimensions θ and ρ, and a line is represented by a single point corresponding to a unique set of these parameters. The Hough transform can then be implemented by choosing a set of values of ρ and θ to use. For each pixel (r, c) in the image, compute r cos(θ) + c sin(θ) for each values of θ, and place the result in the appropriate position in the (ρ, θ) array. At the end, the values of (ρ, θ) with the highest values in the array will correspond to strongest lines in the image == Convolution-based technique == In a convolution-based technique, the line detector operator consists of a convolution masks tuned to detect the presence of lines of a particular width n and a θ orientation. Here are the four convolution masks to detect horizontal, vertical, oblique (+45 degrees), and oblique (−45 degrees) lines in an image. a) Horizontal mask(R1) (b) Vertical (R3) (C) Oblique (+45 degrees)(R2) (d) Oblique (−45 degrees)(R4) In practice, masks are run over the image and the responses are combined given by the following equation: R(x, y) = max(|R1 (x, y)|, |R2 (x, y)|, |R3 (x, y)|, |R4 (x, y)|) If R(x, y) > T, then discontinuity As can be seen below, if mask is overlay on the image (horizontal line), multiply the coincident values, and sum all these results, the output will be the (convolved image). For example, (−1)(0)+(−1)(0)+(−1)(0) + (2)(1) +(2)(1)+(2)(1) + (−1)(0)+(−1)(0)+(−1)(0) = 6 pixels on the second row, second column in the (convolved image) starting from the upper left corner of the horizontal lines. page 82 == Example == These masks above are tuned for light lines against a dark background, and would give a big negative response to dark lines against a light background. == Code example == The code was used to detect only the vertical lines in an image using Matlab and the result is below. The original image is the one on the top and the result is below it. As can be seen on the picture on the right, only the vertical lines were detected

    Read more →
  • Reservoir sampling

    Reservoir sampling

    Reservoir sampling is a family of randomized algorithms for choosing a simple random sample, without replacement, of k items from a population of unknown size n in a single pass over the items. The size of the population n is not known to the algorithm and is typically too large for all n items to fit into main memory. The population is revealed to the algorithm over time, and the algorithm cannot look back at previous items. At any point, the current state of the algorithm must permit extraction of a simple random sample without replacement of size k over the part of the population seen so far. == Motivation == Suppose we see a sequence of items, one at a time. We want to keep 10 items in memory, and we want them to be selected at random from the sequence. If we know the total number of items n and can access the items arbitrarily, then the solution is easy: select 10 distinct indices i between 1 and n with equal probability, and keep the i-th elements. The problem is that we do not always know the exact n in advance. == Simple: Algorithm R == A simple and popular but slow algorithm, Algorithm R, was created by Jeffrey Vitter. Initialize an array R {\displaystyle R} indexed from 1 {\displaystyle 1} to k {\displaystyle k} , containing the first k items of the input x 1 , . . . , x k {\displaystyle x_{1},...,x_{k}} . This is the reservoir. For each new input x i {\displaystyle x_{i}} , generate a random number j uniformly in { 1 , . . . , i } {\displaystyle \{1,...,i\}} . If j ∈ { 1 , . . . , k } {\displaystyle j\in \{1,...,k\}} , then set R [ j ] := x i . {\displaystyle R[j]:=x_{i}.} Otherwise, discard x i {\displaystyle x_{i}} . Return R {\displaystyle R} after all inputs are processed. This algorithm works by induction on i ≥ k {\displaystyle i\geq k} . While conceptually simple and easy to understand, this algorithm needs to generate a random number for each item of the input, including the items that are discarded. The algorithm's asymptotic running time is thus O ( n ) {\displaystyle O(n)} . Generating this amount of randomness and the linear run time causes the algorithm to be unnecessarily slow if the input population is large. This is Algorithm R, implemented as follows: == Optimal: Algorithm L == If we generate n {\displaystyle n} random numbers u 1 , . . . , u n ∼ U [ 0 , 1 ] {\displaystyle u_{1},...,u_{n}\sim U[0,1]} independently, then the indices of the smallest k {\displaystyle k} of them is a uniform sample of the k {\displaystyle k} -subsets of { 1 , . . . , n } {\displaystyle \{1,...,n\}} . The process can be done without knowing n {\displaystyle n} : Keep the smallest k {\displaystyle k} of u 1 , . . . , u i {\displaystyle u_{1},...,u_{i}} that has been seen so far, as well as w i {\displaystyle w_{i}} , the index of the largest among them. For each new u i + 1 {\displaystyle u_{i+1}} , compare it with u w i {\displaystyle u_{w_{i}}} . If u i + 1 < u w i {\displaystyle u_{i+1} Read more →

  • BRS/Search

    BRS/Search

    BRS/Search is a full-text database and information retrieval system. BRS/Search uses a fully inverted indexing system to store, locate, and retrieve unstructured data. It was the search engine that in 1977 powered Bibliographic Retrieval Services (BRS) commercial operations with 20 databases (including the first national commercial availability of MEDLINE); it has changed ownership several times during its development and is currently sold as Livelink ECM Discovery Server by Open Text Corporation. == Early development == Development on what was to become BRS began as Biomedical Communications Network (BCN) at the State University of New York at Albany (SUNY). BCN, which went online in 1968, provided on-line access to nine databases, including MEDLINE and BIOSIS Previews, to large universities and medical schools primarily in the Northeast of the USA. State funding for the project was withdrawn in 1975, and Bibliographic Retrieval Services (BRS) was formed as a non-profit concern the following year. It was incorporated in May 1976 as a for-profit corporation with Ron Quake as president, Jan Egeland as vice president in charge of marketing and training, and Lloyd Palmer as vice president of systems. == BRS commercial operations == In December 1976, the First BRS User Meeting was held in Syracuse, New York, and by January 1977 BRS started commercial operations with 20 databases (including the first national commercial availability of MEDLINE) and 9 million records, using modified IBM STAIRS (STorage And Information Retrieval System) software, Telenet for telecommunications, and timesharing mainframe computers of Carrier Corporation. In October 1980 BRS was sold by Egeland and Quake to Indian Head, Inc., a subsidiary of the Dutch company Thyssen-Bornemisza Group. == 1989–1993 == In 1989 Robert Maxwell acquired BRS and the BRS/Search software; he announced the planned incorporation of the ORBIT Search Service and BRS Information Technologies and renamed the whole group Maxwell Online, Inc. At that time BRS Information Technologies was serving the medical and academic library marketplace with over 150 databases. Maxwell later bought the publishing company Macmillan and put Maxwell Online under Macmillan. In the same year BRS/LINK (hypertext connection of databases; first application delivering full text) was announced. The initial BRS/LINK application "relates the citation in a bibliographic database to its full-text article in a second database," and "eliminates the need to re-execute a search strategy in the second database in order to find the corresponding full-text article." Initially BRS/LINK supported linking only selected bibliographic databases: MEDLINE, Health Planning and Administration, and MEDLINE References on AIDS to the full-text Comprehensive Core Medical Library. At the time of Robert Maxwell’s death in 1991, Macmillan brought in Andrew Gregory to represent the company during the 2 years that Maxwell’s affairs were being settled and to prepare Maxwell Online to be able to sell the components. Maxwell Online shortly thereafter underwent yet another name change, this time to InfoPro Technologies. == Dataware Technologies ownership of BRS/SEARCH == Early in 1994, InfoPro Technologies, a subsidiary of MHC Inc. (holding company for Macmillan Inc.), the former Maxwell Online service, sold off all its subsidiaries. ORBIT Search Services went to the French-owned Questel, the dial-up BRS Search Services to CD Plus Technologies (later to become OVID), and BRS Software Products (including BRS/SEARCH) to Dataware Technologies. Almost up to the end of InfoPro Technologies, BRS Software had been the fastest growing segment of the company. At the 14th BRS North American Users Group Conference in 1999, Dave Schubmehl of Dataware Technologies presented a paper in which he stated "The purpose of this presentation is to update BRS users on upcoming releases of BRS/Search, NetAnswer, and other Dataware products. BRS/Search 7.0 will include features specifically requested by customers, as well as other enhancements. Earlier this year, Dataware acquired Sovereign Hill Software, makers of InQuery. In light of that acquisition, and Dataware's other development projects, we'll look at Dataware's plans for all products, including BRS/Search and NetAnswer." == Open Text acquisition of BRS/Search == In 2001 BRS/Search was acquired by Open Text and became LiveLink ECM Discovery Server. It is now referred to as Open Text Discovery Server. Open Text still supports both BRS/Search and NetAnswer. The core BRS/Search technology in the Open Text portfolio was augmented with other capabilities through various acquisitions. For example, Dataware's acquisition of Sovereign-Hill brought InQuery, “a probabilistic information retrieval system using an inference network”, which was developed by the University of Massachusetts Amherst Center for Intelligent Information Retrieval] out of the UMass CIIR and into the marketplace. A product re-branding table shows the range of products, their old names and their new names. InQuery is a concept search engine that uses noun phrases, parts of speech and other co-occurrence relationships in overlapping passages of text rather than single term inverted indexes of single words in documents. Open Text's portfolio has grown to include Hummingbird Content Management, and has always included BASIS. == 2003 == BRS/Search North America User's Group (BRSNAUG) website with a June 8, 2003 date listed the following features for BRS/Search. The BRSNAUG also disincorporated in 2003. Cross-references to BRS/Search on the World Wide Web point to Open Text Livelink. Engine features include: Rapid query response time. Numerical data handling and elementary statistical processing (sum, avg, min, max) Search results weighting and relevancy ranking Left- and right-truncation and expansion of search terms Superior data compression – loaded databases typically use only about 1.5 times the input stream size in disk space Large capacity databases – up to 100 million documents, each with up to 65,000 paragraphs Fine control of indexing and searching – right down to the word, sentence, and paragraph level Fine control over data security. Document access can be controlled at the database, document, and paragraph level International language support for all 7/8 bit characters sets and customizable language tables Flexible and customizable stop word lists ANSI-compatible thesauri Hypertext links within and between documents and databases (R6.x) Support for natural language parsing of queries Automatic document summarization tools Client/Server development Programming interfaces for World-Wide Web (HTTP, HTML) access to databases

    Read more →
  • Maritime Informatics

    Maritime Informatics

    Maritime Informatics is a thematic topic within the broader discipline of informatics. It can be considered as both a field of study and domain of application. As an application domain, it is the outlet of innovations originating from data science and artificial intelligence; as a field of study, it is positioned between computer science and marine engineering. == Beginnings of maritime informatics == As a result of the increasing levels of digitalisation occurring in the maritime sector starting around 2010 and stimulated by the EU-endorsed MonaLisa project for sea traffic management (STM), a number of academics and shipping industry leaders recognised that the maritime transportation sector would benefit from a specific field of study and application to be known as Maritime Informatics - the use of information systems, data sharing and data analytics in the business and operations of maritime transportation. They considered that it would lead to improvements in efficiency, safety, resilience, and ecological sustainability - all of which are currently lacking for many aspects of sea transport. One of the first public airings of the concept of Maritime Informatics was a presentation delivered on 11 September 2014 in Gothenburg, Sweden. A proposal for an inaugural minitrack on Maritime Informatics was accepted for the 2015 Americas Conference on Information Systems in Puerto Rico where three papers were presented. Since then numerous publications has been brought forward captured at www.maritimeinformatics.org and in late 2020 the first reference book on Maritime Informatics was co-written by 81 expert contributors (47 practitioners and 34 researchers) from 20 countries. Most impactful authors and journals in the domain have been documented in a review paper. Dimitrios Zissis, Luca Cazzanti and Leonardo M. Millefiori are the top three authors; top journals and conferences include Ocean Engineering, Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems, Sensors, the international Conference On Engineering, Technology And Innovation, Expert Systems With Applications, IEEE Access, and Journal of Navigation. == Background == The shipping industry has several particular organisational aspects that are recognised and taken into account in maritime informatics: It is predominantly a self-organising ecosystem Many activities are undertaken as part of episodic tight coupling There is a so-called maritime stack There is increasing pressure to balance capital productivity and energy efficiency There is the potential virtuous interplay between different types of systems == Data sharing == Digital data sharing is key to the all-important, arguably fundamental, data analytics aspects of maritime informatics because it opens the way for better access to relevant and reliable data. As in land-based commerce, digital data sharing is a growing phenomenon in maritime operations - though there is a way to go. It is enabling greater transparency for all those involved in the transportation of goods and passengers, not least being the end-customer. This leads to better and more informed decision-making and planning by all those involved. The push for digitalisation and data sharing is being pursued both by governments and the commercial sector. For example, the Member States of the IMO agreed a mandatory requirement for their governments to introduce electronic information exchange between ships and ports as from 8 April 2019. Meanwhile, commercial operators, particularly in the container lines are putting systems in place for sharing data for mutual benefit in their operations. Data sharing is an important aspect of the Port Collaborative Decision Making (PortCDM) and Port Call Optimization initiatives, both of which seek to improve the coordination, synchronization and efficiency of the port call process by enabling a common and shared situational awareness among all those involved. == Standardisation == The availability and sharing of relevant digital data underpins maritime informatics and is key to more effective and efficient coordination and synchronisation in the predominantly self-organising ecosystem that is maritime transportation. For this to occur, a high priority underpinning maritime informatics is the encouragement of standardised digital data exchange and data sharing, leading, in turn, to improvements in shipping analytics. Improved availability of data will support better historical analysis, now-casting and forecasting. The International Maritime Organization (IMO) FAL Committee is taking the lead in ensuring that the common terms used in the various standards being developed or in use in the maritime sector are compatible and therefore interoperable as far as is practicable, by creating and maintaining The IMO Compendium on Facilitation and Electronic Business. The IMO Compendium consists of an IMO Data Set and IMO Reference Data Model agreed by the main organisations involved in the development of standards for the electronic exchange of information related to the FAL Convention: the World Customs Organization (WCO), the United Nations Economic Commission for Europe (UNECE) and the International Organization for Standardization (ISO). There are several other prominent international governmental and non-governmental organisations actively contributing to the ongoing standardisation and harmonisation process including the UN Electronic Data Interchange for Administration, Commerce and Transport (UN EDIFACT), the Digital Container Shipping Association (DCSA), the International Harbour Masters Association (IHMA) and BIMCO - the world's largest direct-membership organisation for shipowners, charterers, shipbrokers and agents.

    Read more →
  • Hedgeable

    Hedgeable

    Hedgeable, Inc. was a U.S. based financial services company and digital wealth management platform headquartered in New York City. Hedgeable was known for not following set allocations, and instead actively managing accounts in response to market movements. On August 9, 2018, Hedgeable closed its doors to new investors, with existing investors required to transfer out of the company. The company claimed that it was not shutting down but simply removing its SEC registration. == History == Hedgeable was founded in 2009 by twin brothers Michael and Matthew Kane, who previously worked at high-net worth investment managers such as Bridgewater Associates and Spruce Private Investors. Both Michael and Matthew graduated from Penn State University with degrees in finance. Hedgeable is a Registered Investment Advisor with the U.S. Securities and Exchange Commission. The company has received funding from SixThirty and Route 66 Ventures as well as various other angel investors. On August 9, 2018, Hedgeable closed its doors to new investors. == Investing Strategies == Hedgeable did not follow a buy-and-hold approach, but instead actively manages accounts in response to market movements focusing on downside protection in bear markets. Their strategy was different from other robo-advisors, which use Modern Portfolio Theory. Hedgeable offered investment options including Exchange Traded Funds (ETFs) to individual stocks, master limited partnerships, private equity and bitcoin. Mutual funds were not used in portfolios. Although the firm's focus was to provide a direct-to-consumer service, Hedgeable's investment strategies were available to financial advisors and institutions as well through a variety of platforms. == Product Features == When it was open to external clients, Hedgeable aimed to gamify their personal finance experience. Clients could open a new account or transfer an existing account. Hedgeable accepted retirement accounts, taxable accounts, business accounts and various other account types. Hedgeable offered the following features: Downside protection Account aggregation Alternative investments Alpha rewards API Mobile app It was awarded 4/5 for client transparency by Paladin Research. Hedgeable was the winner of the Finovate Fall 2015 Best of Show Award and the GREAT 2015 Tech Award (FinTech Category). In 2016, Hedgeable launched its first iOS mobile app in order to expand their product offerings.

    Read more →
  • Ontology engineering

    Ontology engineering

    In computer science, information science and systems engineering, ontology engineering is a field which studies the methods and methodologies for building ontologies, which encompasses a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities of a given domain of interest. In a broader sense, this field also includes a knowledge construction of the domain using formal ontology representations such as OWL/RDF. A large-scale representation of abstract concepts such as actions, time, physical objects and beliefs would be an example of ontological engineering. Ontology engineering is one of the areas of applied ontology, and can be seen as an application of philosophical ontology. Core ideas and objectives of ontology engineering are also central in conceptual modeling. Ontology engineering aims at making explicit the knowledge contained within software applications, and within enterprises and business procedures for a particular domain. Ontology engineering offers a direction towards solving the inter-operability problems brought about by semantic obstacles, i.e. the obstacles related to the definitions of business terms and software classes. Ontology engineering is a set of tasks related to the development of ontologies for a particular domain. Automated processing of information not interpretable by software agents can be improved by adding rich semantics to the corresponding resources, such as video files. One of the approaches for the formal conceptualization of represented knowledge domains is the use of machine-interpretable ontologies, which provide structured data in, or based on, RDF, RDFS, and OWL. Ontology engineering is the design and creation of such ontologies, which can contain more than just the list of terms (controlled vocabulary); they contain terminological, assertional, and relational axioms to define concepts (classes), individuals, and roles (properties) (TBox, ABox, and RBox, respectively). Ontology engineering is a relatively new field of study concerning the ontology development process, the ontology life cycle, the methods and methodologies for building ontologies, and the tool suites and languages that support them. A common way to provide the logical underpinning of ontologies is to formalize the axioms with description logics, which can then be translated to any serialization of RDF, such as RDF/XML or Turtle. Beyond the description logic axioms, ontologies might also contain SWRL rules. The concept definitions can be mapped to any kind of resource or resource segment in RDF, such as images, videos, and regions of interest, to annotate objects, persons, etc., and interlink them with related resources across knowledge bases, ontologies, and LOD datasets. This information, based on human experience and knowledge, is valuable for reasoners for the automated interpretation of sophisticated and ambiguous contents, such as the visual content of multimedia resources. Application areas of ontology-based reasoning include, but are not limited to, information retrieval, automated scene interpretation, and knowledge discovery. == Languages == An ontology language is a formal language used to encode the ontology. There are a number of such languages for ontologies, both proprietary and standards-based: Common logic is ISO standard 24707, a specification for a family of ontology languages that can be accurately translated into each other. The Cyc project has its own ontology language called CycL, based on first-order predicate calculus with some higher-order extensions. The Gellish language includes rules for its own extension and thus integrates an ontology with an ontology language. IDEF5 is a software engineering method to develop and maintain usable, accurate, domain ontologies. KIF is a syntax for first-order logic that is based on S-expressions. Rule Interchange Format (RIF), F-Logic and its successor ObjectLogic combine ontologies and rules. OWL is a language for making ontological statements, developed as a follow-on from RDF and RDFS, as well as earlier ontology language projects including OIL, DAML and DAML+OIL. OWL is intended to be used over the World Wide Web, and all its elements (classes, properties and individuals) are defined as RDF resources, and identified by URIs. OntoUML is a well-founded language for specifying reference ontologies. SHACL (RDF SHapes Constraints Language) is a language for describing structure of RDF data. It can be used together with RDFS and OWL or it can be used independently from them. XBRL (Extensible Business Reporting Language) is a syntax for expressing business semantics. == Methodologies and tools == DOGMA KAON OntoClean HOZO Protégé (software) Large language models == In life sciences == Life sciences is flourishing with ontologies that biologists use to make sense of their experiments. For inferring correct conclusions from experiments, ontologies have to be structured optimally against the knowledge base they represent. The structure of an ontology needs to be changed continuously so that it is an accurate representation of the underlying domain. Recently, an automated method was introduced for engineering ontologies in life sciences such as Gene Ontology (GO), one of the most successful and widely used biomedical ontology. Based on information theory, it restructures ontologies so that the levels represent the desired specificity of the concepts. Similar information theoretic approaches have also been used for optimal partition of Gene Ontology. Given the mathematical nature of such engineering algorithms, these optimizations can be automated to produce a principled and scalable architecture to restructure ontologies such as GO. Open Biomedical Ontologies (OBO), a 2006 initiative of the U.S. National Center for Biomedical Ontology, provides a common 'foundry' for various ontology initiatives, amongst which are: The Generic Model Organism Project (GMOD) Gene Ontology Consortium Sequence Ontology Ontology Lookup Service The Plant Ontology Consortium Standards and Ontologies for Functional Genomics and more

    Read more →
  • Timeline of algorithms

    Timeline of algorithms

    The following timeline of algorithms outlines the development of algorithms (mainly "mathematical recipes") since their inception. == Antiquity == Before – writing about "recipes" (on cooking, rituals, agriculture and other themes) c. 1700–2000 BC – Egyptians develop earliest known algorithms for multiplying two numbers c. 1600 BC – Babylonians develop earliest known algorithms for factorization and finding square roots c. 300 BC – Euclid's algorithm c. 200 BC – the Sieve of Eratosthenes 263 AD – Gaussian elimination described by Liu Hui == Medieval Period == 628 – Chakravala method described by Brahmagupta c. 820 – Al-Khawarizmi described algorithms for solving linear equations and quadratic equations in his Algebra; the word algorithm comes from his name 825 – Al-Khawarizmi described the algorism, algorithms for using the Hindu–Arabic numeral system, in his treatise On the Calculation with Hindu Numerals, which was translated into Latin as Algoritmi de numero Indorum, where "Algoritmi", the translator's rendition of the author's name gave rise to the word algorithm (Latin algorithmus) with a meaning "calculation method" c. 850 – cryptanalysis and frequency analysis algorithms developed by Al-Kindi (Alkindus) in A Manuscript on Deciphering Cryptographic Messages, which contains algorithms on breaking encryptions and ciphers c. 1025 – Ibn al-Haytham (Alhazen), was the first mathematician to derive the formula for the sum of the fourth powers, and in turn, he develops an algorithm for determining the general formula for the sum of any integral powers c. 1400 – Ahmad al-Qalqashandi gives a list of ciphers in his Subh al-a'sha which include both substitution and transposition, and for the first time, a cipher with multiple substitutions for each plaintext letter; he also gives an exposition on and worked example of cryptanalysis, including the use of tables of letter frequencies and sets of letters which can not occur together in one word == Before 1940 == 1540 – Lodovico Ferrari discovered a method to find the roots of a quartic polynomial 1545 – Gerolamo Cardano published Cardano's method for finding the roots of a cubic polynomial 1614 – John Napier develops method for performing calculations using logarithms 1671 – Newton–Raphson method developed by Isaac Newton 1690 – Newton–Raphson method independently developed by Joseph Raphson 1706 – John Machin develops a quickly converging inverse-tangent series for π and computes π to 100 decimal places 1768 – Leonhard Euler publishes his method for numerical integration of ordinary differential equations in problem 85 of Institutiones calculi integralis 1789 – Jurij Vega improves Machin's formula and computes π to 140 decimal places, 1805 – FFT-like algorithm known by Carl Friedrich Gauss 1842 – Ada Lovelace writes the first algorithm for a computing engine 1903 – A fast Fourier transform algorithm presented by Carle David Tolmé Runge 1918 - Soundex 1926 – Borůvka's algorithm 1926 – Primary decomposition algorithm presented by Grete Hermann 1927 – Hartree–Fock method developed for simulating a quantum many-body system in a stationary state. 1934 – Delaunay triangulation developed by Boris Delaunay 1936 – Turing machine, an abstract machine developed by Alan Turing, with others developed the modern notion of algorithm. == 1940s == 1942 – A fast Fourier transform algorithm developed by G.C. Danielson and Cornelius Lanczos 1945 – Merge sort developed by John von Neumann 1947 – Simplex algorithm developed by George Dantzig == 1950s == 1950 – Hamming codes developed by Richard Hamming 1952 – Huffman coding developed by David A. Huffman 1953 – Simulated annealing introduced by Nicholas Metropolis 1954 – Radix sort computer algorithm developed by Harold H. Seward 1964 – Box–Muller transform for fast generation of normally distributed numbers published by George Edward Pelham Box and Mervin Edgar Muller. Independently pre-discovered by Raymond E. A. C. Paley and Norbert Wiener in 1934. 1956 – Kruskal's algorithm developed by Joseph Kruskal 1956 – Ford–Fulkerson algorithm developed and published by R. Ford Jr. and D. R. Fulkerson 1957 – Prim's algorithm developed by Robert Prim 1957 – Bellman–Ford algorithm developed by Richard E. Bellman and L. R. Ford, Jr. 1959 – Dijkstra's algorithm developed by Edsger Dijkstra 1959 – Shell sort developed by Donald L. Shell 1959 – De Casteljau's algorithm developed by Paul de Casteljau 1959 – QR factorization algorithm developed independently by John G.F. Francis and Vera Kublanovskaya 1959 – Rabin–Scott powerset construction for converting NFA into DFA published by Michael O. Rabin and Dana Scott == 1960s == 1960 – Karatsuba multiplication 1961 – CRC (Cyclic redundancy check) invented by W. Wesley Peterson 1962 – AVL trees 1962 – Quicksort developed by C. A. R. Hoare 1962 – Bresenham's line algorithm developed by Jack E. Bresenham 1962 – Gale–Shapley 'stable-marriage' algorithm developed by David Gale and Lloyd Shapley 1964 – Heapsort developed by J. W. J. Williams 1964 – multigrid methods first proposed by R. P. Fedorenko 1965 – Cooley–Tukey algorithm rediscovered by James Cooley and John Tukey 1965 – Levenshtein distance developed by Vladimir Levenshtein 1965 – Cocke–Younger–Kasami (CYK) algorithm independently developed by Tadao Kasami 1965 – Buchberger's algorithm for computing Gröbner bases developed by Bruno Buchberger 1965 – LR parsers invented by Donald Knuth 1966 – Dantzig algorithm for shortest path in a graph with negative edges 1967 – Viterbi algorithm proposed by Andrew Viterbi 1967 – Cocke–Younger–Kasami (CYK) algorithm independently developed by Daniel H. Younger 1968 – A graph search algorithm described by Peter Hart, Nils Nilsson, and Bertram Raphael 1968 – Risch algorithm for indefinite integration developed by Robert Henry Risch 1969 – Strassen algorithm for matrix multiplication developed by Volker Strassen == 1970s == 1970 – Dinic's algorithm for computing maximum flow in a flow network by Yefim (Chaim) A. Dinitz 1970 – Knuth–Bendix completion algorithm developed by Donald Knuth and Peter B. Bendix 1970 – BFGS method of the quasi-Newton class 1970 – Needleman–Wunsch algorithm published by Saul B. Needleman and Christian D. Wunsch 1972 – Edmonds–Karp algorithm published by Jack Edmonds and Richard Karp, essentially identical to Dinic's algorithm from 1970 1972 – Graham scan developed by Ronald Graham 1972 – Red–black trees and B-trees discovered 1973 – RSA encryption algorithm discovered by Clifford Cocks 1973 – Jarvis march algorithm developed by R. A. Jarvis 1973 – Hopcroft–Karp algorithm developed by John Hopcroft and Richard Karp 1974 – Pollard's p − 1 algorithm developed by John Pollard 1974 – Quadtree developed by Raphael Finkel and J.L. Bentley 1975 – Genetic algorithms popularized by John Holland 1975 – Pollard's rho algorithm developed by John Pollard 1975 – Aho–Corasick string matching algorithm developed by Alfred V. Aho and Margaret J. Corasick 1975 – Cylindrical algebraic decomposition developed by George E. Collins 1976 – Salamin–Brent algorithm independently discovered by Eugene Salamin and Richard Brent 1976 – Knuth–Morris–Pratt algorithm developed by Donald Knuth and Vaughan Pratt and independently by J. H. Morris 1977 – Boyer–Moore string-search algorithm for searching the occurrence of a string into another string. 1977 – RSA encryption algorithm rediscovered by Ron Rivest, Adi Shamir, and Len Adleman 1977 – LZ77 algorithm developed by Abraham Lempel and Jacob Ziv 1977 – multigrid methods developed independently by Achi Brandt and Wolfgang Hackbusch 1978 – LZ78 algorithm developed from LZ77 by Abraham Lempel and Jacob Ziv 1978 – Bruun's algorithm proposed for powers of two by Georg Bruun 1979 – Khachiyan's ellipsoid method developed by Leonid Khachiyan 1979 – ID3 decision tree algorithm developed by Ross Quinlan == 1980s == 1980 – Brent's Algorithm for cycle detection Richard P. Brendt 1981 – Quadratic sieve developed by Carl Pomerance 1981 – Smith–Waterman algorithm developed by Temple F. Smith and Michael S. Waterman 1983 – Simulated annealing developed by S. Kirkpatrick, C. D. Gelatt and M. P. Vecchi 1983 – Classification and regression tree (CART) algorithm developed by Leo Breiman, et al. 1984 – LZW algorithm developed from LZ78 by Terry Welch 1984 – Karmarkar's interior-point algorithm developed by Narendra Karmarkar 1984 – ACORN PRNG discovered by Roy Wikramaratna and used privately 1985 – Simulated annealing independently developed by V. Cerny 1985 – Car–Parrinello molecular dynamics developed by Roberto Car and Michele Parrinello 1985 – Splay trees discovered by Sleator and Tarjan 1986 – Blum Blum Shub proposed by L. Blum, M. Blum, and M. Shub 1986 – Push relabel maximum flow algorithm by Andrew Goldberg and Robert Tarjan 1986 – Barnes–Hut tree method developed by Josh Barnes and Piet Hut for fast approximate simulation of n-body problems 1987 – Fast multipole method developed by Leslie Greengard and Vladimir

    Read more →
  • AI-assisted reverse engineering

    AI-assisted reverse engineering

    AI-assisted reverse engineering (AIARE) is a branch of computer science that leverages artificial intelligence (AI), notably machine learning (ML) strategies, to augment and automate the process of reverse engineering. The latter involves breaking down a product, system, or process to comprehend its structure, design, and functionality. AIARE was primarily introduced in the early years of the 21st century, witnessing substantial advancements from the mid-2010s onwards. == Overview == Conventionally, reverse engineering is conducted by specialists who dismantle a system to grasp its working principles, often for the purposes of reproduction, modification, enhancement of compatibility, or forensic examination. This method, while efficient, can be laborious and time-intensive, particularly when dealing with intricate software or hardware systems. AIARE integrates machine learning algorithms to either partially automate or augment this process. It is capable of detecting patterns, relationships, structures, and potential vulnerabilities within the analyzed system, frequently surpassing human experts in speed and accuracy. This has rendered AIARE a critical tool in numerous fields, including cybersecurity, software development, and hardware design and analysis. == Techniques == AIARE encompasses several AI methodologies: === Supervised learning === Supervised learning employs tagged data to train models to recognize system components, their operations, and their interconnections. This method is particularly helpful in software analysis to discover vulnerabilities or enhance compatibility. === Unsupervised learning === Unsupervised learning is utilized to detect concealed patterns and structures in untagged data. It proves beneficial in comprehending complex systems where there's no evident labeling or mapping of components. === Reinforcement learning === Reinforcement learning is employed to build models that progressively refine their system understanding through a process of trial and error. This method is often implemented when deciphering a system's functionality under various circumstances or configurations. === Deep learning === Deep learning is employed for analysis of high-dimensional data. For instance, deep learning techniques can aid in examining the layout and connections of integrated circuits (ICs), substantially reducing the manual effort required for reverse engineering. == Benefits == === Usable Security === AIARE expands usable security as reverse engineering is traditionally slow and highly specialized as it produces dense, low-level information (usually in Assembly or C) when using tools like Ghidra. The use of multiple different methods to interface with models today (such as through chat bots like ChatGPT) greatly reduces the barrier to entry by providing a clear way to interact with the user and even providing meaningful decompiled source code. In addition, either done automatically or through prompt engineering, a model is capable of producing a high-level summary and explanation of its reverse engineering efforts in human-readable form that doesn't require much knowledge on code. === Speedup === AIARE is capable of processing data much faster than humans, providing a boost in speed when analyzing said data. In the context of computer security, this can greatly speed up incident management or response and malware detection as AIARE can be automated to drastically reduce the manual effort usually associated with reverse engineering. == Limitations == In an effort to improve readability for reverse engineering, AI-generated code may introduce erroneous bugs not present in the source. This compromises the correctness of the code if not carefully validated and will throw off reverse engineering efforts. Additionally, AIARE's weakness in zero-shot prompting makes gathering accurate data without reference data in the prompt more inconsistent, thus requiring a user to provide some quality data of their own that hurts its usability.

    Read more →
  • Dave's Redistricting

    Dave's Redistricting

    Dave's Redistricting App (DRA) is an online web app originally created by Dave Bradlee that allows anyone to simulate redistricting a U.S. state's congressional and legislative districts. == Purpose == According to Bradlee, the software was designed to "put power in people's hands," and so that they "can see how the process works, so it's a little less mysterious than it was 10 years ago." Bradlee has noticed that many citizens are taking this process seriously and using his app to create legitimate redistricting maps that could be put in place. Some websites have called Bradlee the pioneer and cause of the rise of do-it-yourself redistricting. States such as Montana in 2021 allowed the general population to use it to submit redistricting proposals following the 2020 United States Census. Dave's Redistricting has frequently been mentioned as a resource that can be used to combat gerrymandering, given that the public has free access to it. Political science firms such as FiveThirtyEight have used the website to draw examples of gerrymandered districts, including on their famous Atlas of Redistricting. Dave Bradlee built the first generation of DRA. DRA 2020 is built by a small team of volunteers—Dave Bradlee, Terry Crowley, Alec Ramsay, and David Rinn—all with a shared passion for technology & democracy and all Microsoft veterans. Their mission is to empower civic organizations and citizen activists to advocate for fair congressional and legislative districts and increased transparency in the redistricting process. == Functions == Users can redraw the congressional and state legislative districts for all 50 states, the District of Columbia, and Puerto Rico using a variety of census and election datasets including Cook PVI. Maps can be optimized for different criteria. DRA 2020 added several major features to the first generation app: Sharing & collaborative editing of maps, like Google Docs Multiple statewide elections for all 50 states including the ability to import your own data Comprehensive analytics for evaluating and comparing maps Custom overlays, and Block-level editing DRA remains free to use. == Versions == 2.2: This uses Bing Maps, an outdated software that projects the districts of a single state onto a map of the United States. 2.5: After Bing Maps announced that it would no longer be updating for the foreseen future, the U.S. Map feature was removed. DRA 2020: At the end of 2018, a beta version of 2020 was released. This version that did not require Microsoft Silverlight and could be used in any web browser. DRA 2020 has been under continuous development since and is built using React (JavaScript library), Mapbox, OpenStreetMap, TypeScript, Node.js, Amazon Web Services, as well as many open source components, tools, and icons.

    Read more →
  • Pointer jumping

    Pointer jumping

    Pointer jumping or path doubling is a design technique for parallel algorithms that operate on pointer structures, such as linked lists and directed graphs. Pointer jumping allows an algorithm to follow paths with a time complexity that is logarithmic with respect to the length of the longest path. It does this by "jumping" to the end of the path computed by neighbors. The basic operation of pointer jumping is to replace each neighbor in a pointer structure with its neighbor's neighbor. In each step of the algorithm, this replacement is done for all nodes in the data structure, which can be done independently in parallel. In the next step when a neighbor's neighbor is followed, the neighbor's path already followed in the previous step is added to the node's followed path in a single step. Thus, each step effectively doubles the distance traversed by the explored paths. Pointer jumping is best understood by looking at simple examples such as list ranking and root finding. == List ranking == One of the simpler tasks that can be solved by a pointer jumping algorithm is the list ranking problem. This problem is defined as follows: given a linked list of N nodes, find the distance (measured in the number of nodes) of each node to the end of the list. The distance d(n) is defined as follows, for nodes n that point to their successor by a pointer called next: If n.next is nil, then d(n) = 0. For any other node, d(n) = d(n.next) + 1. This problem can easily be solved in linear time on a sequential machine, but a parallel algorithm can do better: given n processors, the problem can be solved in logarithmic time, O(log N), by the following pointer jumping algorithm: The pointer jumping occurs in the last line of the algorithm, where each node's next pointer is reset to skip the node's direct successor. It is assumed, as in common in the PRAM model of computation, that memory access are performed in lock-step, so that each n.next.next memory fetch is performed before each n.next memory store; otherwise, processors may clobber each other's data, producing inconsistencies. The following diagram follows how the parallel list ranking algorithm uses pointer jumping for a linked list with 11 elements. As the algorithm describes, the first iteration starts initialized with all ranks set to 1 except those with a null pointer for next. The first iteration looks at immediate neighbors. Each subsequent iteration jumps twice as far as the previous. Analyzing the algorithm yields a logarithmic running time. The initialization loop takes constant time, because each of the N processors performs a constant amount of work, all in parallel. The inner loop of the main loop also takes constant time, as does (by assumption) the termination check for the loop, so the running time is determined by how often this inner loop is executed. Since the pointer jumping in each iteration splits the list into two parts, one consisting of the "odd" elements and one of the "even" elements, the length of the list pointed to by each processor's n is halved in each iteration, which can be done at most O(log N) time before each list has a length of at most one. == Root finding == Following a path in a graph is an inherently serial operation, but pointer jumping reduces the total amount of work by following all paths simultaneously and sharing results among dependent operations. Pointer jumping iterates and finds a successor — a vertex closer to the tree root — each time. By following successors computed for other vertices, the traversal down each path can be doubled every iteration, which means that the tree roots can be found in logarithmic time. Pointer doubling operates on an array successor with an entry for every vertex in the graph. Each successor[i] is initialized with the parent index of vertex i if that vertex is not a root or to i itself if that vertex is a root. At each iteration, each successor is updated to its successor's successor. The root is found when the successor's successor points to itself. The following pseudocode demonstrates the algorithm. algorithm Input: An array parent representing a forest of trees. parent[i] is the parent of vertex i or itself for a root Output: An array containing the root ancestor for every vertex for i ← 1 to length(parent) do in parallel successor[i] ← parent[i] while true for i ← 1 to length(successor) do in parallel successor_next[i] ← successor[successor[i]] if successor_next = successor then break for i ← 1 to length(successor) do in parallel successor[i] ← successor_next[i] return successor The following image provides an example of using pointer jumping on a small forest. On each iteration the successor points to the vertex following one more successor. After two iterations, every vertex points to its root node. == History and examples == Although the name pointer jumping would come later, JáJá attributes the first uses of the technique in early parallel graph algorithms and list ranking. The technique has been described with other names such as shortcutting, but by the 1990s textbooks on parallel algorithms consistently used the term pointer jumping. Today, pointer jumping is considered a software design pattern for operating on recursive data types in parallel. As a technique for following linked paths, graph algorithms are a natural fit for pointer jumping. Consequently, several parallel graph algorithms utilizing pointer jumping have been designed. These include algorithms for finding the roots of a forest of rooted trees, connected components, minimum spanning trees, and biconnected components. However, pointer jumping has also shown to be useful in a variety of other problems including computer vision, image compression, and Bayesian inference.

    Read more →
  • Transaction data

    Transaction data

    Transaction data or transaction information is a category of data describing transactions. Transaction data/information gather variables generally referring to reference data or master data – e.g. dates, times, time zones, currencies. Typical transactions are: Financial transactions about orders, invoices, payments; Work transactions about plans, activity records; Logistic transactions about deliveries, storage records, travel records, etc.. == Management == Recording and storing transactions is called records management. The record of the transaction is stored in a place where the retention can be guaranteed and where data is archived or removed following a retention period. Formats of recorded transactions can be digital data in databases and spreadsheets, or handwritten texts in physical documents like former bankbooks. Transaction processing systems are application software that generate transactions and manage transaction data/information, e.g. SAP and Oracle Financials. == Data warehousing == Transaction data can be summarised in a data warehouse, which helps accessibility and analysis of the data.

    Read more →
  • Semantic translation

    Semantic translation

    Semantic translation is the process of using semantic information to aid in the translation of data in one representation or data model to another representation or data model. Semantic translation takes advantage of semantics that associate meaning with individual data elements in one dictionary to create an equivalent meaning in a second system. An example of semantic translation is the conversion of XML data from one data model to a second data model using formal ontologies for each system such as the Web Ontology Language (OWL). This is frequently required by intelligent agents that wish to perform searches on remote computer systems that use different data models to store their data elements. The process of allowing a single user to search multiple systems with a single search request is also known as federated search. Semantic translation should be differentiated from data mapping tools that do simple one-to-one translation of data from one system to another without actually associating meaning with each data element. Semantic translation requires that data elements in the source and destination systems have "semantic mappings" to a central registry or registries of data elements. The simplest mapping is of course where there is equivalence. There are three types of Semantic equivalence: Class Equivalence - indicating that class or "concepts" are equivalent. For example: "Person" is the same as "Individual" Property Equivalence - indicating that two properties are equivalent. For example: "PersonGivenName" is the same as "FirstName" Instance Equivalence - indicating that two individual instances of objects are equivalent. For example: "Dan Smith" is the same person as "Daniel Smith" Semantic translation is very difficult if the terms in a particular data model do not have direct one-to-one mappings to data elements in a foreign data model. In that situation, an alternative approach must be used to find mappings from the original data to the foreign data elements. This problem can be alleviated by centralized metadata registries that use the ISO-11179 standards such as the National Information Exchange Model (NIEM).

    Read more →
  • Protocol engineering

    Protocol engineering

    Protocol engineering is the application of systematic methods to the development of communication protocols. It uses many of the principles of software engineering, but it is specific to the development of distributed systems. == History == When the first experimental and commercial computer networks were developed in the 1970s, the concept of protocols was not yet well developed. These were the first distributed systems. In the context of the newly adopted layered protocol architecture (see OSI model), the definition of the protocol of a specific layer should be such that any entity implementing that specification in one computer would be compatible with any other computer containing an entity implementing the same specification, and their interactions should be such that the desired communication service would be obtained. On the other hand, the protocol specification should be abstract enough to allow different choices for the implementation on different computers. It was recognized that a precise specification of the expected service provided by the given layer was important. It is important for the verification of the protocol, which should demonstrate that the communication service is provided if both protocol entities implement the protocol specification correctly. This principle was later followed during the standardization of the OSI protocol stack, in particular for the transport layer. It was also recognized that some kind of formalized protocol specification would be useful for the verification of the protocol and for developing implementations, as well as test cases for checking the conformance of an implementation against the specification. While initially mainly finite-state machine were used as (simplified) models of a protocol entity, in the 1980s three formal specification languages were standardized, two by ISO and one by ITU. The latter, called SDL, was later used in industry and has been merged with UML state machines. == Principles == The following are the most important principles for the development of protocols: Layered architecture: A protocol layer at the level n consists of two (or more) entities that have a service interface through which the service of the layer is provided to the users of the protocol, and which uses the service provided by a local entity of level (n-1). The service specification of a layer describes, in an abstract and global view, the behavior of the layer as visible at the service interfaces of the layer. The protocol specification defines the requirements that should be satisfied by each entity implementation. Protocol verification consists of showing that two (or more) entities satisfying the protocol specification will provide at their service interfaces the specified service of that layer. The (verified) protocol specification is used mainly for the following two activities: The development of an entity implementation. Note that the abstract properties of the service interface are defined by the service specification (and also used by the protocol specification), but the detailed nature of the interface can be chosen during the implementation process, separately for each entity. Test suite development for conformance testing. Protocol conformance testing checks that a given entity implementation conforms to the protocol specification. The conformance test cases are developed based on the protocol specification and are applicable to all entity implementations. Therefore standard conformance test suites have been developed for certain protocol standards. == Methods and tools == Tools for the activities of protocol verification, entity implementation and test suite development can be developed when the protocol specification is written in a formalized language which can be understood by the tool. As mentioned, formal specification languages have been proposed for protocol specification, and the first methods and tools where based on finite-state machine models. Reachability analysis was proposed to understand all possible behaviors of a distributed system, which is essential for protocol verification. This was later complemented with model checking. However, finite-state descriptions are not powerful enough to describe constraints between message parameters and the local variables in the entities. Such constraints can be described by the standardized formal specification languages mentioned above, for which powerful tools have been developed. It is in the field of protocol engineering that model-based development was used very early. These methods and tools have later been used for software engineering as well as hardware design, especially for distributed and real-time systems. On the other hand, many methods and tools developed in the more general context of software engineering can also be used of the development of protocols, for instance model checking for protocol verification, and agile methods for entity implementations. == Constructive methods for protocol design == Most protocols are designed by human intuition and discussions during the standardization process. However, some methods have been proposed for using constructive methods possibly supported by tools to automatically derive protocols that satisfy certain properties. The following are a few examples: Semi-automatic protocol synthesis: The user defines all message sending actions of the entities, and the tool derives all necessary reception actions (even if several messages are in transit). Synchronizing protocol: The state transitions of one protocol entity are given by the user, and the method derives the behavior of the other entity such that it remains in states that correspond to the former entity. Protocol derived from service specification: The service specification is given by the user and the method derives a suitable protocol for all entities. Protocol for control applications: The specification of one entity (called the plant - which must be controlled) is given, and the method derives a specification of the other entity such that certain fail states of the plant are never reached and certain given properties of the plant's service interactions are satisfied. This is a case of supervisory control. == Books == Ming T. Liu, Protocol Engineering, Advances in Computers, Volume 29, 1989, Pages 79–195. G.J. Holzmann, Design and Validation of Computer Protocols, Prentice Hall, 1991. H. König, Protocol Engineering, Springer, 2012. M. Popovic, Communication Protocol Engineering, CRC Press, 2nd Ed. 2018. P. Venkataram, S.S. Manvi, B.S. Babu, Communication Protocol Engineering, 2014.

    Read more →
  • Vinelink.com

    Vinelink.com

    Vinelink.com (VINE) is a national website in the United States that allows victims of crime, and the general public, to track the movements of prisoners held by the various states and territories. The first four letters in the websites name, "vine", are an acronym for "Victim Information and Notification Everyday". Vinelink.com displays information, based on the information provided by the various states' departments of correction and other law enforcement agencies, on whether an inmate is in custody, has been released, has been granted parole or probation, or has escaped from custody. In some cases, the website will reveal whether a defendant has been granted parole or probation, but then subsequently violated conditions of their release and become a fugitive. Information provided on Vinelink.com represents metadata, in that the website lists a defendant's custody status; but does not list what the individual is charged with, their criminal history, or the amount of their bail, if applicable. Internet users accessing the Vinelink.com website choose from a map of states and provinces within the United States where they wish to perform a search for an inmate. The user may then search for an individual using the inmate's or parolee's name, or by entering the inmate's specific department of corrections inmate number, if known. When the inmate's custody status changes, users who have registered to be notified of such changes will be notified via email, phone or both. This information is currently released upon request, without the website requesting reasons for the users search or requiring payment, as public records available to the general public. Inmate information is available for most states, and for Puerto Rico, on the website. The states of Arizona, Georgia, Massachusetts, Montana, New Hampshire and West Virginia provide very limited information on the site. In March of 2025, The Maine Sheriff's Association entered into a contract to pilot the use of the VINE system in three counties in the state as well as a regional jail, therefore making South Dakota the only state that does not participate in the VINE system to any degree. The website does not provide data on prisoners detained by the Federal Bureau of Prisons which has its own inmate locator web site nor for inmates of the U.S. military prisons.

    Read more →
  • Knowledge spillover

    Knowledge spillover

    Knowledge spillover is an exchange of ideas among individuals. Knowledge spillover is usually replaced by terminations of technology spillover, R&D spillover and/or spillover (economics) when the concept is specific to technology management and innovation economics. In knowledge management economics, knowledge spillovers are non-rival knowledge market costs incurred by a party not agreeing to assume the costs that has a spillover effect of stimulating technological improvements in a neighbor through one's own innovation. Such innovations often come from specialization within an industry. There are two kinds of knowledge spillovers: internal and external. Internal knowledge spillover occurs if there is a positive impact of knowledge between individuals within an organization that produces goods and/or services. An external knowledge spillover occurs when the positive impact of knowledge is between individuals outside of a production organization. Marshall–Arrow–Romer (MAR) spillovers, Porter spillovers and Jacobs spillovers are three types of spillovers. == Conceptualizations == === Marshall–Arrow–Romer === Marshall–Arrow–Romer (MAR) spillover has its origins in 1890, where the English economist Alfred Marshall developed a theory of knowledge spillovers. Knowledge spillovers later were extended by economists Kenneth Arrow (1962) and Paul Romer (1986). In 1992, Edward Glaeser, Hedi Kallal, José Scheinkman, and Andrei Shleifer pulled together the Marshall–Arrow–Romer views on knowledge spillovers and accordingly named the view MAR spillover in 1992. Under the Marshall–Arrow–Romer (MAR) spillover view, the proximity of firms within a common industry often affects how well knowledge travels among firms to facilitate innovation and growth. The closer the firms are to one another, the greater the MAR spillover. The exchange of ideas is largely from employee to employee, in that employees from different firms in an industry exchange ideas about new products and new ways to produce goods. The opportunity to exchange ideas that lead to innovations key to new products and improved production methods. Research on the Cambridge IT Cluster (UK) suggests that technological knowledge spillovers might only happen rarely and are less important than other cluster benefits such as labour market pooling. === Porter === Porter (1990), like MAR, argues that knowledge spillovers in specialized, geographically concentrated industries stimulate growth. He insists, however, that local competition, as opposed to local monopoly, fosters the pursuit and rapid adoption of innovation. He gives examples of Italian ceramics and gold jewellery industries, in which hundreds of firms are located together and fiercely compete to innovate since the alternative to innovation is demise. Porter's externalities are maximized in cities with geographically specialized, competitive industries. === Jacobs === Under the Jacobs spillover view, the proximity of firms from different industries affect how well knowledge travels among firms to facilitate innovation and growth. This is in contrast to MAR spillovers, which focus on firms in a common industry. The diverse proximity of a Jacobs spillover brings together ideas among individuals with different perspectives to encourage an exchange of ideas and foster innovation in an industrially diverse environment. Developed in 1969 by urbanist Jane Jacobs and John Jackson the concept that Detroit’s shipbuilding industry from the 1830s was the critical antecedent leading to the 1890s development of the auto industry in Detroit since the gasoline engine firms easily transitioned from building gasoline engines for ships to building them for automobiles. == Incoming and outgoing spillovers == Knowledge spillover has asymmetric directions. The focal entity and receives or outflows know-how to others, creating incoming and outgoing spillovers. Cassiman and Veugelers (2002) use survey data and estimate incoming and outgoing spillover and study the economic impacts. Incoming spillover increases growth opportunity and productivity improvements of receivers, while outgoing spillover leads to free rider problem in the technology competition. Chen et al. (2013) use econometric method to gauge incoming spillover, a way that applies for all companies without survey. They find that incoming spillover explains R&D profits of industrial firms. == Policy implications == As information is largely non-rival in nature, certain measures must be taken to ensure that, for the originator, the information remains a private asset. As the market cannot do this efficiently, public regulations have been implemented to facilitate a more appropriate equilibrium. As a result, the concept of intellectual property rights have developed and ensure the ability of entrepreneurs to temporarily hold on to the profitability of their ideas through patents, copyrights, trade secrets, and other governmental safeguards. Conversely, such barriers to entry prevent the exploitation of informational developments by rival firms within an industry. For example, Wang (2023) indicates that technology spillovers are reduced by 27% to 51% when trade secrets laws are implemented by the Uniform Trade Secrets Act in the US. On the other hand, when the research and development of a private firm results in a social benefit, unaccounted for within the market price, often greater than the private return of the firm's research, then a subsidy to offset the underproduction of that benefit might be offered to the firm in return for its continued output of that benefit. Government subsidies are often controversial, and while they might often result in a more appropriate social equilibrium, they could also lead to undesirable political repercussions as such a subsidy must come from taxpayers, some of whom may not directly benefit from the researching firm's subsidized knowledge spillover. The concept of knowledge spillover is also used to justify subsidies to foreign direct investment, as foreign investors help diffuse technology among local firms. == Examples == Business parks are a good specific example of concentrated businesses that may benefit from MAR spillover. Many semiconductor firms intentionally located their research and development facilities in Silicon Valley to take advantage of MAR spillover. In addition, the film industry in Los Angeles, California, and elsewhere relies on a geographic concentration of specialists (directors, producers, scriptwriters, and set designers) to bring together narrow aspects of movie-making into a final product. A general example of a knowledge spillover could be the collective growth associated with the research and development of online social networking tools like Facebook, YouTube, and Twitter. Such tools have not only created a positive feedback loop, and a host of originally unintended benefits for their users, but have also created an explosion of new software, programming platforms, and conceptual breakthroughs that have perpetuated the development of the industry as a whole. The advent of online marketplaces, the utilization of user profiles, the widespread democratization of information, and the interconnectivity between tools within the industry have all been products of each tool's individual developments. These developments have since spread outside the industry into the mainstream media as news and entertainment firms have developed their own market feedback applications within the tools themselves, and their own versions of online networking tools (e.g. CNN’s iReport).

    Read more →