AI Chatbot Questionnaire

AI Chatbot Questionnaire — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • MeeMix

    MeeMix

    MeeMix Ltd is a company specializing in personalizing media-related content recommendations, discovery and advertising for the telecommunication industry, founded in 2006. On January 1, 2008, MeeMix launched meemix.com, a public personalized internet radio serving as an online testbed for the development of music taste-prediction technologies. Subsequently, MeeMix released in 2009 a line of Business-to-business commercial services intended to personalize media recommendations, discovery and advertising. MeeMix hybrid taste-prediction technology relies on integrating machine learning algorithms, digital signal processing, behavior analysis, metadata analysis and collaborative filtering, and is provided via API web service. In August 2009, MeeMix was announced as Innovator Nominee in the GSM Association’s Mobile Innovation Grand Prix worldwide contest. As of 2013, MeeMix no longer features internet radios on meemix.com. On Sep 28, 2014, meemix.com went offline.

    Read more →
  • Query language

    Query language

    A query language, also known as data query language or database query language (DQL), is a computer language used to make queries in databases and information systems. In database systems, query languages rely on strict theory to retrieve information. A well known example is the Structured Query Language (SQL). == Types == Broadly, query languages can be classified according to whether they are database query languages or information retrieval query languages. The difference is that a database query language attempts to give factual answers to factual questions, while an information retrieval query language attempts to find documents containing information that is relevant to an area of inquiry. Other types of query languages include: Full-text. The simplest query language is treating all terms as bag of words that are to be matched with the postings in the inverted index and where subsequently ranking models are applied to retrieve the most relevant documents. Only tokens are defined in the CFG. Web search engines often use this approach. Boolean. A query language that also supports the use of the Boolean operators AND, OR, NOT. Structured. A language that supports searching within (a combination of) fields when a document is structured and has been indexed using its document structure. Natural language. A query language that supports natural language by parsing the natural language query to a form that can be best used to retrieve relevant documents, for example with Question answering systems or conversational search. == Examples == Attempto Controlled English is a query language that is also a controlled natural language. AQL is a query language for the ArangoDB native multi-model database system. .QL is a proprietary object-oriented query language for querying relational databases; successor of Datalog. CodeQL is the analysis engine used by developers to automate security checks, and by security researchers to perform variant analysis on GitHub. Contextual Query Language (CQL) a formal language for representing queries to information retrieval systems such as web indexes or bibliographic catalogues. Cypher is a query language for the Neo4j graph database. DMX is a query language for data mining models. Datalog is a query language for deductive databases. F-logic is a declarative object-oriented language for deductive databases and knowledge representation. FQL enables you to use a SQL-style interface to query the data exposed by the Graph API. It provides advanced features not available in the Graph API. Gellish English is a language that can be used for queries in Gellish English Databases, for dialogues (requests and responses) as well as for information modeling and knowledge modeling. Gremlin is an Apache Software Foundation graph traversal language for OLTP and OLAP graph systems. GraphQL is a data query language developed by Facebook as an alternate to REST and ad-hoc webservice architectures. HTSQL is a query language that translates HTTP queries to SQL. ISBL is a query language for PRTV, one of the earliest relational database management systems. Jaql is a functional data processing and query language most commonly used for JSON query processing. JPQL is a query language defined as part of Jakarta Persistence (used in Java applications to make queries to a relational DB using entity objects instead of DB tables). jq is a functional programming language often used for processing queries against one or more JSON documents, including very large ones. JSONiq is a declarative query language designed for collections of JSON documents. KQL (Kusto Query Language), a query language by Microsoft used in Azure Data Explorer LDAP is an application protocol for querying and modifying directory services running over TCP/IP. LogiQL is a variant of Datalog and is the query language for the LogicBlox system. M Formula language, a mashup query language used in Microsoft's Power Query. MQL is a cheminformatics query language for a substructure search allowing beside nominal properties also numerical properties. MDX is a query language for OLAP databases. N1QL is a Couchbase's query language finding data in Couchbase Servers. Object Query Language OCL (Object Constraint Language). Despite its name, OCL is also an object query language and an OMG standard. OPath, intended for use in querying WinFS Stores. Poliqarp Query Language is a special query language designed to analyze annotated text. Used in the Poliqarp search engine. PQL is a special-purpose programming language for managing process models based on information about scenarios that these models describe. PRQL PRQL (Pipelined Relational Query Language) is a modern language for transforming data. Consists of a curated set of orthogonal transformations, which are combined together to form a pipeline. PTQL based on relational queries over program traces, allowing programmers to write expressive, declarative queries about program behavior. QUEL is a relational database access language, similar in most ways to SQL. RDQL is a RDF query language. SMARTS is the cheminformatics standard for a substructure search. SPARQL is a query language for RDF graphs. SQL is a well-known query language and data manipulation language for relational databases. XQuery is a query language for XML data sources. XPath is a declarative language for navigating XML documents. YQL is an SQL-like query language created by Yahoo!. Search engine query languages, e.g., as used by Google. or Bing

    Read more →
  • Open Compute Project

    Open Compute Project

    The Open Compute Project (OCP) is an organization that facilitates the sharing of data center product designs and industry best practices among companies. Founded in 2011, OCP has significantly influenced the design and operation of large-scale computing facilities worldwide. As of February 2025, over 400 companies across the world are members of OCP, including Arm, Meta, IBM, Wiwynn, Intel, Nokia, Google, Microsoft, Seagate Technology, Dell, Rackspace, Hewlett Packard Enterprise, NVIDIA, Cisco, Goldman Sachs, Fidelity, Lenovo, Accton Technology Corporation and Alibaba Group. == Structure == The Open Compute Project Foundation is a 501(c)(6) non-profit incorporated in the state of Delaware, United States. OCP has multiple committees, including the board of directors, advisory board and steering committee to govern its operations. As of July 2020, there are seven members who serve on the board of directors which is made up of one individual member and six organizational members. Mark Roenigk (Facebook) is the Foundation's president and chairman. Andy Bechtolsheim is the individual member. In addition to Mark Roenigk who represents Facebook, other organizations on the Open Compute board of directors include Intel (Rebecca Weekly), Microsoft (Kushagra Vaid), Google (Partha Ranganathan), and Rackspace (Jim Hawkins). A list of members can be found on the OCP website. == History == The Open Compute Project began at Facebook (now Meta) in 2009 as an internal project called "Project Freedom". The hardware designs and engineering teams were led by Amir Michael (Manager, Hardware Design) and sponsored by Jonathan Heiliger (VP, Technical Operations) and Frank Frankovsky (Director, Hardware Design and Infrastructure). The three would later open source the designs of Project Freedom and co-found the Open Compute Project. The project was announced at a press event at Facebook's headquarters in Palo Alto on April 7, 2011. == OCP projects == The Open Compute Project Foundation maintains a number of OCP projects, such as: === Server designs === In 2013, two years after the Open Compute Project had started, it was noted that the goal of a more modular server design was "still a long way from live data centers". However, by then some aspects published had been used in Facebook's Prineville data center to improve energy efficiency, as measured by the power usage effectiveness index defined by The Green Grid. Efforts to advance server compute node designs included one for Intel processors and one for AMD processors. Also in 2013, Calxeda contributed a design with ARM architecture processors. Since then, several generations of OCP server designs have been deployed: Wildcat (Intel), Spitfire (AMD), Windmill (Intel E5-2600), Watermark (AMD), Winterfell (Intel E5-2600 v2) and Leopard (Intel E5-2600 v3). === OCP Accelerator Module === OCP Accelerator Module (OAM) is a design specification for hardware architectures that implement artificial intelligence systems that require high module-to-module bandwidth. OAM is used in some of AMD's Instinct accelerator modules. === Rack and power designs === Designs for a mechanical mounting system to replace standard 19-inch racks have been published, with a cabinet the same outside width (600 mm) and depth as existing racks, but with an interior space allowing for wider equipment chassis with a 537 mm width (21 inches). This allows more equipment to fit in the same volume and improves air flow. Compute chassis sizes are defined in multiples of an OpenU or OU, which is 48 mm, slightly taller than the 44 mm rack unit defined for 19-inch racks. As of March 2026, the most current base mechanical definition is the Open Rack V3.1 Specification. At the time the base specification was released, Meta also defined in greater depth the specifications for the rectifiers and power shelf. Specifications for the power monitoring interface (PMI), a communications interface enabling upstream communications between the rectifiers and battery backup unit(BBU) were published by Meta that same year, with Delta Electronics as the main technical contributor to the BBU spec. However, since 2022 the AI boom in the data center has created higher power requirements in order to satisfy the demands of AI accelerators that have been released. As of September 2024, Meta is in the process of updating its Open Rack v3 rectifier, power shelf, battery backup and power management interface specifications to accommodate this increased energy demand. In May 2024, at an Open Compute regional summit, Meta and Rittal outlined their plans for development of their High Power Rack (HPR) ecosystem in conjunction with rack, power and cable partners, increasing power capacity in the rack to 92 kilowatts or more. At the same meeting, Delta Electronics and Advanced Energy reported on their progress in developing new Open Compute standard specifications for power shelf and rectifier designs for HPR applications. Rittal also outlined their collaboration with Meta in designing airflow containment, busbar designs and grounding schemes for the new HPR requirements. === Data storage === Open Vault storage building blocks (also called "Knox") offer high disk densities, with 30 drives in a 2 OU Open Rack chassis designed for easy disk drive replacement. The 3.5 inch disks are stored in two drawers, five across and three deep in each drawer, with connections via serial attached SCSI. There is a "cold storage" variant where idle disks power down to reduce energy consumption. Another design concept was contributed by Hyve Solutions, a division of Synnex, in 2012. At the OCP Summit 2016 Facebook, together with Taiwanese ODM Wistron's spin-off Wiwynn, introduced "Lightning", a flexible NVMe JBOF (just a bunch of flash), based on the existing Open Vault (Knox) design. === Energy efficient data centers === The OCP has published data center designs for energy efficiency. These include power distribution at three-phase 277/480 VAC, which eliminates one transformer stage in typical North American data centers, a single voltage (12.5 VDC) power supply designed to work with 277/480 VAC input, and 48 VDC battery backup. For European (and other 230V countries) datacenters, there is a specification for 230/400 VAC power distribution and its conversion to 12.5 VDC. === Open networking switches === On May 8, 2013, an effort to define an open network switch was announced. The plan was to allow Facebook to load its own operating system software onto its top-of-rack switches. Press reports predicted that more expensive and higher-performance switches would continue to be popular, while less expensive products treated more like a commodity. The first attempt at an open networking switch by Facebook was designed together with Taiwanese ODM Accton using Broadcom Trident II chip and is called "Wedge"; the Linux OS that it runs is called "FBOSS". Later switch contributions include "6-pack" and Wedge-100, based on Broadcom Tomahawk chips. Similar switch hardware designs have been contributed by: Accton Technology Corporation (and its Edgecore Networks subsidiary), Mellanox Technologies, Interface Masters Technologies, Agema Systems. Capable of running Open Network Install Environment (ONIE)-compatible network operating systems such as Cumulus Linux, Switch Light OS by Big Switch Networks, or PICOS by Pica8. A similar project for a custom switch for the Google platform had been rumored, and evolved to use the OpenFlow protocol. === Servers === A sub-project for Mezzanine (NIC) OCP NIC 3.0 specification 1v00 was released in late 2019 establishing three form factors: SFF, TSFF, and LFF. == Litigation == In March, 2015, BladeRoom Group Limited and Bripco (UK) Limited sued Facebook, Emerson Electric Co. and others alleging that Facebook has disclosed BladeRoom and Bripco's trade secrets for prefabricated data centers in the Open Compute Project. Facebook petitioned for the lawsuit to be dismissed, but this was rejected in 2017. A confidential mid-trial settlement was agreed in April 2018.

    Read more →
  • OpenWSN

    OpenWSN

    OpenWSN aims to build an open standard-based and open source implementation of a complete constrained network protocol stack for wireless sensor networks and Internet of Things. The project was created at the University of California Berkeley and extended at the INRIA and at the Open University of Catalonia (UOC). The root of OpenWSN is a deterministic MAC layer implementing the IEEE 802.15.4e TSCH based on the concept of Time Slotted Channel Hopping (TSCH). Above the MAC layer, the Low Power Lossy Network stack is based on IETF standards including the IETF 6TiSCH management and adaptation layer (a minimal configuration profile, 6top protocol and different scheduling functions). The stack is complemented by an implementation of 6LoWPAN, RPL in non-storing mode, UDP and CoAP, enabling access to devices running the stack from the native IPv6 through open standards. OpenWSN is related to other projects including the following: RIOT OpenMote OpenWSN is available for Linux, Windows and OS X platforms. Current release of OpenWSN is 1.14.0.

    Read more →
  • Seed (programming)

    Seed (programming)

    Seed is a JavaScript interpreter and a library of the GNOME project to create standalone applications in JavaScript. It uses the JavaScript engine JavaScriptCore of the WebKit project. It is possible to easily create modules in C. Seed is integrated in GNOME since the 2.28 version and is used by two games in the GNOME Games package. It is also used by the Web web browser for the design of its extensions. The module is also officially supported by the GTK+ project. == Hello world in Seed == This example uses the standard output to output the string "Hello, World". == A program using GTK+ == This code shows an empty window named "Example". == Modules == To use a module, just instantiate a class having for name imports. followed by the name of the module respecting the case sensitivity. The modules using GObject Introspection, who starts by imports.gi. : Gtk Gst GObject Gio Clutter GLib Gdk WebKit GdkPixbuf, GdkPixbuf Libxml Cairo DBus MPFR Os (system library) Canvas (using Cairo) multiprocessing readline Archived 2009-11-09 at the Wayback Machine ffi sqlite sandbox Archived 2009-11-09 at the Wayback Machine == List of the Seed versions == The names of the versions of Seed are albums of famous rock bands.

    Read more →
  • Known-item search

    Known-item search

    Known-item search is a specialization of information exploration which represents the activities carried out by searchers who have a particular item in mind. In the context of library catalogs, known‐item search means a search for an item for which the author or title is known. Although the concept of known-item search originated in library science, it is now applied in the context of web search and other online search activities. Known-item search is distinguished from exploratory search, in which a searcher is unfamiliar with the domain of their search goal, unsure about the ways to achieve their goal, and/or unsure about what their goal is.

    Read more →
  • Sieve of Pritchard

    Sieve of Pritchard

    In mathematics, the sieve of Pritchard is an algorithm for finding all prime numbers up to a specified bound. Like the ancient sieve of Eratosthenes, it has a simple conceptual basis in number theory. It is especially suited to quick hand computation for small bounds. Whereas the sieve of Eratosthenes marks off each non-prime for each of its prime factors, the sieve of Pritchard avoids considering almost all non-prime numbers by building progressively larger wheels, which represent the pattern of numbers not divisible by any of the primes processed thus far. It thereby achieves a better asymptotic complexity, and was the first sieve with a running time sublinear in the specified bound. Its asymptotic running-time has not been improved on, and it deletes fewer composites than any other known sieve. It was created in 1979 by Paul Pritchard. Since Pritchard has created a number of other sieve algorithms for finding prime numbers, the sieve of Pritchard is sometimes singled out by being called the wheel sieve (by Pritchard himself) or the dynamic wheel sieve. == Overview == A prime number is a natural number that has no natural number divisors other than the number 1 and itself. To find all the prime numbers less than or equal to a given integer N, a sieve algorithm examines a set of candidates in the range 2, 3, …, N, and eliminates those that are not prime, leaving the primes at the end. The sieve of Eratosthenes examines all of the range, first removing all multiples of the first prime 2, then of the next prime 3, and so on. The sieve of Pritchard instead examines a subset of the range consisting of numbers that occur on successive wheels, which represent the pattern of numbers left after each successive prime is processed by the sieve of Eratosthenes. For i > 0, the ith wheel Wi represents this pattern. It is the set of numbers between 1 and the product Pi = p1 · p2 ⋯ pi of the first i prime numbers that are not divisible by any of these prime numbers (and is said to have an associated length Pi). This is because adding Pi to a number does not change whether it is divisible by one of the first i prime numbers, since the remainder on division by any one of these primes is unchanged. So W1 = {1} with length P1 = 2 represents the pattern of odd numbers; W2 = {1,5} with length P2 = 6 represents the pattern of numbers not divisible by 2 or 3; etc. Wheels are so-called because Wi can be usefully visualized as a circle of circumference Pi with its members marked at their corresponding distances from an origin. Then rolling the wheel along the number line marks points corresponding to successive numbers not divisible by one of the first i prime numbers. The animation shows W2 being rolled up to 30. It is useful to define Wi → n for n > 0 to be the result of rolling Wi up to n. Then the animation generates W2 → 30 = {1,5,7,11,13,17,19,23,25,29}. Note that up to 52 − 1 = 24, this consists only of 1 and the primes between 5 and 25. The sieve of Pritchard is derived from the observation that this holds generally: for all i > 0, the values in Wi → (p2i+1 − 1) are 1 and the primes between pi+1 and p2i+1. It even holds for i = 0, where the wheel has length 1 and contains just 1 (representing all the natural numbers). So the sieve of Pritchard starts with the trivial wheel W0 and builds successive wheels until the square of the wheel's first member after 1 is at least N. Wheels grow very quickly, but only their values up to N are needed and generated. It remains to find a method for generating the next wheel. Note in the animation that W3 = {1,5,7,11,13,17,19,23,25,29} − {5 · 1 , 5 · 5} can be obtained by rolling W2 up to 30 and then removing 5 times each member of W2.This also holds generally: for all i ≥ 0, Wi+1 = (Wi → Pi+1) − {pi+1 · w | w ∈ Wi}. Rolling Wi past Pi just adds values to Wi, so the current wheel is first extended by getting each successive member starting with w = 1, adding Pi to it, and inserting the result in the set. Then the multiples of pi+1 are deleted. Care must be taken to avoid a number being deleted that itself needs to be multiplied by pi+1. The sieve of Pritchard as originally presented does so by first skipping past successive members until finding the maximum one needed, and then doing the deletions in reverse order by working back through the set. This is the method used in the first animation above. A simpler approach is just to gather the multiples of pi+1 in a list, and then delete them. Another approach is given by Gries and Misra. If the main loop terminates with a wheel whose length is less than N, it is extended up to N to generate the remaining primes. The algorithm, for finding all primes up to N, is therefore as follows: Start with a set W = {1} and length = 1 representing wheel 0, and prime p = 2. As long as p2 ≤ N, do the following: if length < N, then extend W by repeatedly getting successive members w of W starting with 1 and inserting length + w into W as long as it does not exceed p · length or N; increase length to the minimum of p · length and N. repeatedly delete p times each member of W by first finding the largest ≤ length and then working backwards. note the prime p, then set p to the next member of W after 1 (or 3 if p was 2). if length < N, then extend W to N by repeatedly getting successive members w of W starting with 1 and inserting length + w into W as long as it does not exceed N; On termination, the rest of the primes up to N are the members of W after 1. === Example === To find all the prime numbers less than or equal to 150, proceed as follows. Start with wheel 0 with length 1, representing all natural numbers 1, 2, 3...: 1 The first number after 1 for wheel 0 (when rolled) is 2; note it as a prime. Now form wheel 1 with length 2 × 1 = 2 by first extending wheel 0 up to 2 and then deleting 2 times each number in wheel 0, to get: 1 2 The first number after 1 for wheel 1 (when rolled) is 3; note it as a prime. Now form wheel 2 with length 3 × 2 = 6 by first extending wheel 1 up to 6 and then deleting 3 times each number in wheel 1, to get 1 2 3 5 The first number after 1 for wheel 2 is 5; note it as a prime. Now form wheel 3 with length 5 × 6 = 30 by first extending wheel 2 up to 30 and then deleting 5 times each number in wheel 2 (in reverse order), to get 1 2 3 5 7 11 13 17 19 23 25 29 The first number after 1 for wheel 3 is 7; note it as a prime. Now wheel 4 has length 7 × 30 = 210, so we only extend wheel 3 up to our limit 150. (No further extending will be done now that the limit has been reached.) We then delete 7 times each number in wheel 3 until we exceed our limit 150, to get the elements in wheel 4 up to 150: 1 2 3 5 7 11 13 17 19 23 25 29 31 37 41 43 47 49 53 59 61 67 71 73 77 79 83 89 91 97 101 103 107 109 113 119 121 127 131 133 137 139 143 149 The first number after 1 for this partial wheel 4 is 11; note it as a prime. Since we have finished with rolling, we delete 11 times each number in the partial wheel 4 until we exceed our limit 150, to get the elements in wheel 5 up to 150: 1 2 3 5 7 11 13 17 19 23 25 29 31 37 41 43 47 49 53 59 61 67 71 73 77 79 83 89 91 97 101 103 107 109 113 119 121 127 131 133 137 139 143 149 The first number after 1 for this partial wheel 5 is 13. Since 13 squared is at least our limit 150, we stop. The remaining numbers (other than 1) are the rest of the primes up to our limit 150. Just 8 composite numbers are removed, once each. The rest of the numbers considered (other than 1) are prime. In comparison, the natural version of Eratosthenes sieve (stopping at the same point) removes composite numbers 184 times. == Pseudocode == The sieve of Pritchard can be expressed in pseudocode, as follows: algorithm Sieve of Pritchard is input: an integer N >= 2. output: the set of prime numbers in {1,2,...,N}. let W and Pr be sets of integer values, and all other variables integer values. k, W, length, p, Pr := 1, {1}, 2, 3, {2}; {invariant: p = pk+1 and W = Wk ∩ {\displaystyle \cap } {1,2,...,N} and length = minimum of Pk,N and Pr = the primes up to pk} while p2 <= N do if (length < N) then Extend W,length to minimum of plength,N; Delete multiples of p from W; Insert p into Pr; k, p := k+1, next(W, 1) if (length < N) then Extend W,length to N; return Pr ∪ {\displaystyle \cup } W - {1}; where next(W, w) is the next value in the ordered set W after w. procedure Extend W,length to n is {in: W = Wk and length = Pk and n > length} {out: W = Wk → {\displaystyle \rightarrow } n and length = n} integer w, x; w, x := 1, length+1; while x <= n do Insert x into W; w := next(W,w); x := length + w; length := n; procedure Delete multiples of p from W,length is integer w; w := p; while pw <= length do w := next(W,w); while w > 1 do w := prev(W,w); Remove pw from W; where prev(W, w) is the previous value in the ordered set W before w. The algorithm can be initialized with W0 instead of W1 at the minor complication of making next(W, 1) a special case when k = 0. This a

    Read more →
  • Very large database

    Very large database

    A very large database, (originally written very large data base) or VLDB, is a database that contains a very large amount of data, so much that it can require specialized architectural, management, processing and maintenance methodologies. == Definition == The vague adjectives of very and large allow for a broad and subjective interpretation, but attempts at defining a metric and threshold have been made. Early metrics were the size of the database in a canonical form via database normalization or the time for a full database operation like a backup. Technology improvements have continually changed what is considered very large. One definition has suggested that a database has become a VLDB when it is "too large to be maintained within the window of opportunity… the time when the database is quiet". == Sizes of a VLDB database == There is no absolute amount of data that can be cited. For example, one cannot say that any database with more than 1 TB of data is considered a VLDB. This absolute amount of data has varied over time as computer processing, storage and backup methods have become better able to handle larger amounts of data. That said, VLDB issues may start to appear when 1 TB is approached, and are more than likely to have appeared as 30 TB or so is exceeded. == VLDB challenges == Key areas where a VLDB may present challenges include configuration, storage, performance, maintenance, administration, availability and server resources. === Configuration === Careful configuration of databases that lie in the VLDB realm is necessary to alleviate or reduce issues raised by VLDB databases. === Administration === The complexities of managing a VLDB can increase exponentially for the database administrator as database size increases. === Availability and maintenance === When dealing with VLDB operations relating to maintenance and recovery such as database reorganizations and file copies which were quite practical on a non-VLDB take very significant amounts of time and resources for a VLDB database. In particular it typically infeasible to meet a typical recovery time objective (RTO), the maximum expected time a database is expected to be unavailable due to interruption, by methods which involve copying files from disk or other storage archives. To overcome these issues techniques such as clustering, cloned/replicated/standby databases, file-snapshots, storage snapshots or a backup manager may help achieve the RTO and availability, although individual methods may have limitations, caveats, license, and infrastructure requirements while some may risk data loss and not meet the recovery point objective (RPO). For many systems only geographically remote solutions may be acceptable. ==== Backup and recovery ==== Best practice is for backup and recovery to be architectured in terms of the overall availability and business continuity solution. === Performance === Given the same infrastructure there may typically be a decrease in performance, that is increase in response time as database size increases. Some accesses will simply have more data to process (scan) which will take proportionally longer (linear time); while the indexes used to access data may grow slightly in height requiring perhaps an extra storage access to reach the data (sub-linear time). Other effects can be caching becoming less efficient because proportionally less data can be cached and while some indexes such as the B+ automatically sustain well with growth others such as a hash table may need to be rebuilt. Should an increase in database size cause the number of accessors of the database to increase then more server and network resources may be consumed, and the risk of contention will increase. Some solutions to regaining performance include partitioning, clustering, possibly with sharding, or use of a database machine. ==== Partitioning ==== Partitioning may be able assist the performance of bulk operations on a VLDB including backup and recovery., bulk movements due to information lifecycle management (ILM), reducing contention as well as allowing optimization of some query processing. === Storage === In order to satisfy needs of a VLDB the database storage needs to have low access latency and contention, high throughput, and high availability. === Server resources === The increasing size of a VLDB may put pressure on server and network resources and a bottleneck may appear that may require infrastructure investment to resolve. == Relationship to big data == VLDB is not the same as big data, but the storage aspect of big data may involve a VLDB database. That said some of the storage solutions supporting big data were designed from the start to support large volumes of data, so database administrators may not encounter VLDB issues that older versions of traditional RDBMS's might encounter.

    Read more →
  • Control-flow integrity

    Control-flow integrity

    Control-flow integrity (CFI) is a general term for computer security techniques that prevent a wide variety of malware attacks from redirecting the flow of execution (the control flow) of a program. == Background == A computer program commonly changes its control flow to make decisions and use different parts of the code. Such transfers may be direct, in that the target address is written in the code itself, or indirect, in that the target address itself is a variable in memory or a CPU register. In a typical function call, the program performs a direct call, but returns to the caller function using the stack – an indirect backward-edge transfer. When a function pointer is called, such as from a virtual table, we say there is an indirect forward-edge transfer. Attackers seek to inject code into a program to make use of its privileges or to extract data from its memory space. Before executable code was commonly made read-only, an attacker could arbitrarily change the code as it is run, targeting direct transfers or even do with no transfers at all. After W^X became widespread, an attacker wants to instead redirect execution to a separate, unprotected area containing the code to be run, making use of indirect transfers: one could overwrite the virtual table for a forward-edge attack or change the call stack for a backward-edge attack (return-oriented programming). CFI is designed to protect indirect transfers from going to unintended locations. == Techniques == Associated techniques include code-pointer separation (CPS), code-pointer integrity (CPI), stack canaries, shadow stacks (SS), and vtable pointer verification. These protections can be classified into either coarse-grained or fine-grained based on the number of targets restricted. A coarse-grained forward-edge CFI implementation, could, for example, restrict the set of indirect call targets to any function that may be indirectly called in the program, while a fine-grained one would restrict each indirect call site to functions that have the same type as the function to be called. Similarly, for a backward edge scheme protecting returns, a coarse-grained implementation would only allow the procedure to return to a function of the same type (of which there could be many, especially for common prototypes), while a fine-grained one would enforce precise return matching (so it can return only to the function that called it). == Implementations == Related implementations are available in Clang (LLVM front-end),, GNU Compiler Collection, Microsoft's Control Flow Guard and Return Flow Guard, Google's Indirect Function-Call Checks and Reuse Attack Protector (RAP). === LLVM/Clang === The LLVM compiler's C/C++ front-end Clang provides a number of "CFI" schemes that works on the forward edge by checking for errors in virtual tables and type casts. Not all of the schemes are supported on all platforms and most of them, the exception being two "kcfi" schemes intended for low-level kernel software, depends on link-time optimization (LTO) to know what functions are supposed to be called in normal cases. Also provided is a separate "shadow call stack" (SCS) instrumentation pass that defends on the backward edge by checking for call stack modifications, available only for the aarch64 and RISC-V ISAs. And due to use of a shared processor register SCS is only enforceable on certain ABIs or if in other ways it is ensured that any other software using the register set (thread/processor) does not interfere with this use. Google has shipped Android with the Linux kernel compiled by Clang with link-time optimization (LTO) and CFI enabled since 2018. Even though SCS is available for the Linux kernel as an option, and support is also available for Android's system components it is recommended only to enable it for components for which it can be ensured that no third party code is loaded. === GCC === The GNU Compiler Collection implemented a "shadow call stack" compatible with Clang for aarch64 in v12 released in 2022. This feature is primarily intended for building the Linux kernel as support is missing from GCC user space libraries. === Intel Control-flow Enforcement Technology === Intel Control-flow Enforcement Technology (CET) detects compromises to control flow integrity with a shadow stack (SS) and indirect branch tracking (IBT). The kernel must map a region of memory for the shadow stack not writable to user space programs except by special instructions. The shadow stack stores a copy of the return address of each CALL. On a RET, the processor checks if the return address stored in the normal stack and shadow stack are equal. If the addresses are not equal, the processor generates an INT #21 (Control Flow Protection Fault). Indirect branch tracking detects indirect JMP or CALL instructions to unauthorized targets. It is implemented by adding a new internal state machine in the processor. The behavior of indirect JMP and CALL instructions is changed so that they switch the state machine from IDLE to WAIT_FOR_ENDBRANCH. In the WAIT_FOR_ENDBRANCH state, the next instruction to be executed is required to be the new ENDBRANCH instruction (ENDBR32 in 32-bit mode or ENDBR64 in 64-bit mode), which changes the internal state machine from WAIT_FOR_ENDBRANCH back to IDLE. Thus every authorized target of an indirect JMP or CALL must begin with ENDBRANCH. If the processor is in a WAIT_FOR_ENDBRANCH state (meaning, the previous instruction was an indirect JMP or CALL), and the next instruction is not an ENDBRANCH instruction, the processor generates an INT #21 (Control Flow Protection Fault). On processors not supporting CET indirect branch tracking, ENDBRANCH instructions are interpreted as NOPs and have no effect. === Microsoft Control Flow Guard === Control Flow Guard (CFG) was first released for Windows 8.1 Update 3 (KB3000850) in November 2014. Developers can add CFG to their programs by adding the /guard:cf linker flag before program linking in Visual Studio 2015 or newer. As of Windows 10 Creators Update (Windows 10 version 1703), the Windows kernel is compiled with CFG. The Windows kernel uses Hyper-V to prevent malicious kernel code from overwriting the CFG bitmap. CFG operates by creating a per-process bitmap, where a set bit indicates that the address is a valid destination. Before performing each indirect function call, the application checks if the destination address is in the bitmap. If the destination address is not in the bitmap, the program terminates. This makes it more difficult for an attacker to exploit a use-after-free by replacing an object's contents and then using an indirect function call to execute a payload. ==== Implementation details ==== For all protected indirect function calls, the _guard_check_icall function is called, which performs the following steps: Convert the target address to an offset and bit number in the bitmap. The highest 3 bytes are the byte offset in the bitmap The bit offset is a 5-bit value. The first four bits are the 4th through 8th low-order bits of the address. The 5th bit of the bit offset is set to 0 if the destination address is aligned with 0x10 (last four bits are 0), and 1 if it is not. Examine the target's address value in the bitmap If the target address is in the bitmap, return without an error. If the target address is not in the bitmap, terminate the program. ==== Bypass techniques ==== There are several generic techniques for bypassing CFG: Set the destination to code located in a non-CFG module loaded in the same process. Find an indirect call that was not protected by CFG (either CALL or JMP). Use a function call with a different number of arguments than the call is designed for, causing a stack misalignment, and code execution after the function returns (patched in Windows 10). Use a function call with the same number of arguments, but one of pointers passed is treated as an object and writes to a pointer-based offset, allowing overwriting a return address. Overwrite the function call used by the CFG to validate the address (patched in March 2015) Set the CFG bitmap to all 1's, allowing all indirect function calls Use a controlled-write primitive to overwrite an address on the stack (since the stack is not protected by CFG) === Microsoft eXtended Flow Guard === eXtended Flow Guard (XFG) has not been officially released yet, but is available in the Windows Insider preview and was publicly presented at Bluehat Shanghai in 2019. XFG extends CFG by validating function call signatures to ensure that indirect function calls are only to the subset of functions with the same signature. Function call signature validation is implemented by adding instructions to store the target function's hash in register r10 immediately prior to the indirect call and storing the calculated function hash in the memory immediately preceding the target address's code. When the indirect call is made, the XFG validation function compares the value in r10 to the target

    Read more →
  • SQL/PSM

    SQL/PSM

    SQL/PSM (SQL/Persistent Stored Modules) is an ISO standard mainly defining an extension of SQL with a procedural language for use in stored procedures. Initially published in 1996 as an extension of SQL-92 (ISO/IEC 9075-4:1996, a version sometimes called PSM-96 or even SQL-92/PSM), SQL/PSM was later incorporated into the multi-part SQL:1999 standard, and has been part 4 of that standard since then, most recently in SQL:2023. The SQL:1999 part 4 covered less than the original PSM-96 because the SQL statements for defining, managing, and invoking routines were actually incorporated into part 2 SQL/Foundation, leaving only the procedural language itself as SQL/PSM. The SQL/PSM facilities are still optional as far as the SQL standard is concerned; most of them are grouped in Features P001-P008. SQL/PSM standardizes syntax and semantics for control flow, exception handling (called "condition handling" in SQL/PSM), local variables, assignment of expressions to variables and parameters, and (procedural) use of cursors. It also defines an information schema (metadata) for stored procedures. SQL/PSM is one language in which methods for the SQL:1999 structured types can be defined. The other is Java, via SQL/JRT. SQL/PSM is derived, seemingly directly, from Oracle's PL/SQL. Oracle developed PL/SQL and released it in 1991, basing the language on the US Department of Defense's Ada programming language. However, Oracle has maintained a distance from the standard in its documentation. IBM's SQL PL (used in DB2) and Mimer SQL's PSM were the first two products officially implementing SQL/PSM. It is commonly thought that these two languages, and perhaps also MySQL/MariaDB's procedural language, are closest to the SQL/PSM standard. However, a PostgreSQL addon implements SQL/PSM (alongside its other procedural languages like the PL/SQL-derived plpgsql), although it is not part of the core product. RDF functionality in OpenLink Virtuoso was developed entirely through SQL/PSM, combined with custom datatypes (e.g., ANY for handling URI and Literal relation objects), sophisticated indexing, and flexible physical storage choices (column-wise or row-wise).

    Read more →
  • Organizational information theory

    Organizational information theory

    Organizational Information Theory (OIT) is a communication theory, developed by Karl Weick, offering systemic insight into the processing and exchange of information within organizations and among its members. Unlike the past structure-centered theory, OIT focuses on the process of organizing in dynamic, information-rich environments. Given that, it contends that the main activity of organizations is the process of making sense of equivocal information. Organizational members are instrumental to reduce equivocality and achieve sensemaking through some strategies — enactment, selection, and retention of information. With a framework that is interdisciplinary in nature, organizational information theory's desire to eliminate both ambiguity and complexity from workplace messaging builds upon earlier findings from general systems theory and phenomenology. == Inspiration and influence of pre-existing theories == 1. General Systems Theory The General Systems Theory, on its most basic premise, describes the phenomenon of a cohesive group of interrelated parts. When one part of the system is changed or affected, it will affect the system as a whole. Weick uses this theoretical framework from 1950 to influence his organizational information theory. Likewise, organizations can be viewed as a system of related parts that work together towards a common goal or vision. Applying this to Weick's organizational information theory, organizations must work to reduce ambiguity and complexity in the workplace to maximize cohesiveness and efficiency. Weick uses the term, coupling, to describe how organizations, like a system, can be composed of interrelated and dependent parts. Coupling looks at the relationship between people and work. There are two types of coupling: 1. Loose coupling Loose coupling describes that while people within the organization or system are connected and often work together, they do not depend on one another to continue or fully complete individual work. The dependencies are weak and workflow is flexible. For example, "if the whole Science department completely shuts down because all of teachers are sick or for whatsoever reason, the school can still continue to operate because other departments are still present." 2. Tight coupling Tight coupling describes when connections within an organization are strong and dependent. If one part of the organization is not operating correctly, the organization as a whole cannot continue to their fullest potential. " For instance, the format and ink section completely shuts down hence the succeeding steps cannot be continued, so the whole process of the organization will be dropped. Thus, components of a system are directly dependent on one another." 2. Theory of evolution The theory of evolution, by Charles Darwin, is a framework for survival of the fittest. According to Darwin, organisms attempt to adapt and live in an unforgiving environment. Those that are unsuccessful in adaptation do not survive, while the strong organisms continue to thrive and reproduce. Weick invokes inspiration from Darwin, to incorporate a biological perspective to his theory. It is natural for organizations to have to adapt to incoming information that often interfere with the preexisting environment. Organizations that are able to plan and alter strategies in accordance with their constant need of organizing and sense making, will survive and be the most successful. However, there is a notable difference between animal evolution and survival of the fittest in organizations, "A given animal is what it is; variation comes through mutation. But the nature of an organization can change when its members alter their behavior." == Assumptions == 1. Human organizations exist in an information environment Unlike senders and receivers models, OIT stands on the situational perspective. Karl Weick views a human organization as an open social system. People in that system develop a mechanism to establish goals, obtain and process information, or perceive the environment. In this process, people and the environment come to conclusions on "what's going on here?". Colville believes that this attributional process is retrospective. Take an education institution as an example. A university can obtain information regarding students' needs in numerous ways. It might create feedback section in its website. It could organize alumni panels or academic affairs to attract prospective students and collect concrete questions they are interested in. It may also conduct the survey or host focus group to get the information. After that, the staff of the university have to decide how to deal with these information, based on which, it has to set and accomplish its goals for current and prospective students. 2. The information an organization receives differs in terms of equivocality Weick posits that numerous feasible interpretations of reality exist when organizations process information. Their varying levels of understandability lead to different outcomes of information inputs. In other academic works, scholars tend to say that messages are uncertain or ambiguous. While according to OIT, messages are described to be equivocal. believes that people proactively exclude a number of possibilities to perceive what is going on in the environment. Due to OIT's situational perspective, the meanings of messages consist of the messages, the interpretations of receivers, and the interactional context. However, ambiguity and uncertainty can mean that a standard answer - the only one true objective interpretation - exists. Also, Weick emphasizes that "the equivocality is the engine that motivates people to organize". Maitlis and Christianson states that the equivocality trigger sensemaking for three reasons: environment jolts and organizational crises, threats to identity, and planned change interventions. 3. Human organizations engage in information processing to reduce equivocality of information Based upon the first two assumption, OIT proposes that information processing within organizations is a social activity. Sharing is the key feature of organizational information processing. In that particular context, members jointly make sense the reality by reducing equivocality. It other words, the sensemaking is a joint responsibility which includes numerous interdependent people to accomplish. In this process, organizations and its members combine actions and attributions together in order to find the balance between the complexity of thoughts and the simplicity of actions. Weick also proposes that people create their own environment though enactment, which is the action of making sense. This is because people have different perceptual schemas and selective perception, so people create different information environments. In creating different information environments, people can arrive at the same or close to the same understanding or solution through different thought processes and overall understanding. == Key concepts == === The organization === In order to place Weick's vision regarding Organizational Information Theory into proper working context, exploring his view regarding what constitutes the organization and how its individuals embody that construct might yield significant insights. From a fundamental standpoint, he shared a belief that organizational validation is derived---not through bricks and mortar, or locale—but from a series of events which enable entities to "collect, manage and use the information they receive." In elaborating further on what constitutes an organization during early writings outlining OIT, Weick said, "The word organization is a noun and it is also a myth. if one looks for an organization, one will not find it. What will be found is that there are events linked together, that transpire within concrete walls and these sequences, their pathways, their timing, are the forms we erroneously make into substances when we talk about an organization". When viewed in this modular fashion, the organization meets Weick's theoretical vision by encompassing parameters that are less bound by concrete, wood, and structural restraints and more by an ability to serve as a repository where information can be consistently and effectively channeled. Taking these defining characteristics into account, proper channel execution relies on maximization of messaging clarity, context, delivery and evolution through any system. One example as to how these interactions might unfold on a more granular level within these confines can be gleaned through Weick's double interact loop, which he considers the "building blocks of every organization". Simply put, double interacts describe interpersonal exchanges that, inherently, occur across the organizational chain of command and in life, itself. Thus: "An act occurs when you say something (Can I have a Popsicle?). An interact occurs when you say something and I respond ("No, it will spoil your dinner

    Read more →
  • Transaction data

    Transaction data

    Transaction data or transaction information is a category of data describing transactions. Transaction data/information gather variables generally referring to reference data or master data – e.g. dates, times, time zones, currencies. Typical transactions are: Financial transactions about orders, invoices, payments; Work transactions about plans, activity records; Logistic transactions about deliveries, storage records, travel records, etc.. == Management == Recording and storing transactions is called records management. The record of the transaction is stored in a place where the retention can be guaranteed and where data is archived or removed following a retention period. Formats of recorded transactions can be digital data in databases and spreadsheets, or handwritten texts in physical documents like former bankbooks. Transaction processing systems are application software that generate transactions and manage transaction data/information, e.g. SAP and Oracle Financials. == Data warehousing == Transaction data can be summarised in a data warehouse, which helps accessibility and analysis of the data.

    Read more →
  • BeyondCorp

    BeyondCorp

    BeyondCorp is an implementation of zero-trust computer security concepts creating a zero trust network. It is created by Google. == Background == It was created in response to the 2009 Operation Aurora. An open source implementation inspired by Google's research paper on an access proxy is known as "transcend". Google documented its Zero Trust journey from 2014 to 2018 through a series of articles in the journal ;login:. Google called their ZT network "BeyondCorp". Google implemented a Zero Trust architecture on a large scale, and relied on user and device credentials, regardless of location. Data was encrypted and protected from managed devices. Unmanaged devices, such as BYOD, were not given access to the BeyondCorp resources. == Design and technology == BeyondCorp utilized a zero trust security model, which is a relatively new security model that it assumes that all devices and users are potentially compromised. This is in contrast to traditional security models, which rely on firewalls and other perimeter defenses to protect sensitive data. === Trust === The corporate network grants no inherent trust, and all internal apps are accessed via the BeyondCorp system, regardless of whether the user is in a Google office or working remotely. BeyondCorp is related to Zero Trust architecture as it implements a true Zero Trust network, where all access is granted on identity, device, and authentication, based on robust underlying device and identity data sources. BeyondCorp works by using a number of security policies including authentication, authorization, and access control to ensure that only authorized users can access corporate resources. Authentication verifies the identity of the user, authorization determines whether the user has permission to access the requested resource, and access control policies restrict what the user can do with the resource. ==== Trust Inferrer ==== One of the main components in BeyondCorp's implementation is the Trust Inferrer. The Trust Inferrer is a security component (typically software) that looks at information about a user's device, like a computer or phone, to decide how much it can be trusted to access certain resources like important company documents. The Trust Inferrer checks things like the security of the device, whether it has the right software installed, and if it belongs to an authorized user. Based on all this information, the Trust Inferrer decides what the device can access and what it can't. === Security mechanisms === Unlike traditional VPNs, BeyondCorp's access policies are based on information about a device, its state, and its associated user. BeyondCorp considers both internal networks and external networks to be completely untrusted, and gates access to applications by dynamically asserting and enforcing levels, or “tiers,” of access. === Device Inventory Database === BeyondCorp utilized a Device Inventory Database and Device Identity that uniquely identifies a device through a digital certificate. Any changes to the device are recorded in the Device Inventory Database. The certificate is used to uniquely identify a device; however, additional information is required to grant access privileges to a resource. === Access Control Engine === Another important component of BeyondCorp's implementation is the Access Control Engine. Think of this as the brain of the Zero Trust architecture. The Access Control Engine is like a traffic cop standing at an intersection. Its job is to make sure that only authorized devices and users are allowed to access specific resources (like files or applications) on the network. It checks the access policy (the rules that say who can access what), the device's state (like whether it has the right software updates or security settings), and the resources being requested. Then it makes a decision on whether to grant or deny access based on all of this information. It helps ensure that only the right people and devices are allowed access to the network, which helps keep things secure. The Access Control Engine utilizes the output from the Trust Inferrer and other data that is fed into its system. == Usage == One of the first things Google did to implement a Zero Trust architecture was to capture and analyze network traffic. The purpose of analyzing the traffic was to build a baseline of what typical network traffic looked like. In doing so, BeyondCorp also discovered unusual, unexpected, and unauthorized traffic. This was very useful because it gave the BeyondCorp engineers critical information that assisted them in reengineering the system in a secure manner. Some of the benefits BeyondCorp realized by adopting a Zero Trust architecture include the ability to allow their employees to work securely from any location. It reduces the risk of data breaches since data and applications are protected and users and devices are constantly being verified. The Zero Trust architecture is scalable and can be adapted to the changing needs of the businesses and their users. Especially relevant in today's work-from-home era, BeyondCorp allows employees to access enterprise resources securely from any location, without the need for traditional VPNs.

    Read more →
  • Berlekamp–Rabin algorithm

    Berlekamp–Rabin algorithm

    In number theory, Berlekamp's root finding algorithm, also called the Berlekamp–Rabin algorithm, is the probabilistic method of finding roots of polynomials over the field F p {\displaystyle \mathbb {F} _{p}} with p {\displaystyle p} elements. The method was discovered by Elwyn Berlekamp in 1970 as an auxiliary to the algorithm for polynomial factorization over finite fields. The algorithm was later modified by Rabin for arbitrary finite fields in 1979. The method was also independently discovered before Berlekamp by other researchers. == History == The method was proposed by Elwyn Berlekamp in his 1970 work on polynomial factorization over finite fields. His original work lacked a formal correctness proof and was later refined and modified for arbitrary finite fields by Michael Rabin. In 1986 René Peralta proposed a similar algorithm for finding square roots in F p {\displaystyle \mathbb {F} _{p}} . In 2000 Peralta's method was generalized for cubic equations. == Statement of problem == Let p {\displaystyle p} be an odd prime number. Consider the polynomial f ( x ) = a 0 + a 1 x + ⋯ + a n x n {\textstyle f(x)=a_{0}+a_{1}x+\cdots +a_{n}x^{n}} over the field F p ≃ Z / p Z {\displaystyle \mathbb {F} _{p}\simeq \mathbb {Z} /p\mathbb {Z} } of remainders modulo p {\displaystyle p} . The algorithm should find all λ {\displaystyle \lambda } in F p {\displaystyle \mathbb {F} _{p}} such that f ( λ ) = 0 {\textstyle f(\lambda )=0} in F p {\displaystyle \mathbb {F} _{p}} . == Algorithm == === Randomization === Let f ( x ) = ( x − λ 1 ) ( x − λ 2 ) ⋯ ( x − λ n ) {\textstyle f(x)=(x-\lambda _{1})(x-\lambda _{2})\cdots (x-\lambda _{n})} . Finding all roots of this polynomial is equivalent to finding its factorization into linear factors. To find such factorization it is sufficient to split the polynomial into any two non-trivial divisors and factorize them recursively. To do this, consider the polynomial f z ( x ) = f ( x − z ) = ( x − λ 1 − z ) ( x − λ 2 − z ) ⋯ ( x − λ n − z ) {\textstyle f_{z}(x)=f(x-z)=(x-\lambda _{1}-z)(x-\lambda _{2}-z)\cdots (x-\lambda _{n}-z)} where z {\displaystyle z} is some element of F p {\displaystyle \mathbb {F} _{p}} . If one can represent this polynomial as the product f z ( x ) = p 0 ( x ) p 1 ( x ) {\displaystyle f_{z}(x)=p_{0}(x)p_{1}(x)} then in terms of the initial polynomial it means that f ( x ) = p 0 ( x + z ) p 1 ( x + z ) {\displaystyle f(x)=p_{0}(x+z)p_{1}(x+z)} , which provides needed factorization of f ( x ) {\displaystyle f(x)} . === Classification of === F p {\displaystyle \mathbb {F} _{p}} elements Due to Euler's criterion, for every monomial ( x − λ ) {\displaystyle (x-\lambda )} exactly one of following properties holds: The monomial is equal to x {\displaystyle x} if λ = 0 {\displaystyle \lambda =0} , The monomial divides g 0 ( x ) = ( x ( p − 1 ) / 2 − 1 ) {\textstyle g_{0}(x)=(x^{(p-1)/2}-1)} if λ {\displaystyle \lambda } is quadratic residue modulo p {\displaystyle p} , The monomial divides g 1 ( x ) = ( x ( p − 1 ) / 2 + 1 ) {\textstyle g_{1}(x)=(x^{(p-1)/2}+1)} if λ {\displaystyle \lambda } is quadratic non-residual modulo p {\displaystyle p} . Thus if f z ( x ) {\displaystyle f_{z}(x)} is not divisible by x {\displaystyle x} , which may be checked separately, then f z ( x ) {\displaystyle f_{z}(x)} is equal to the product of greatest common divisors gcd ( f z ( x ) ; g 0 ( x ) ) {\displaystyle \gcd(f_{z}(x);g_{0}(x))} and gcd ( f z ( x ) ; g 1 ( x ) ) {\displaystyle \gcd(f_{z}(x);g_{1}(x))} . === Berlekamp's method === The property above leads to the following algorithm: Explicitly calculate coefficients of f z ( x ) = f ( x − z ) {\displaystyle f_{z}(x)=f(x-z)} , Calculate remainders of x , x 2 , x 2 2 , x 2 3 , x 2 4 , … , x 2 ⌊ log 2 ⁡ p ⌋ {\textstyle x,x^{2},x^{2^{2}},x^{2^{3}},x^{2^{4}},\ldots ,x^{2^{\lfloor \log _{2}p\rfloor }}} modulo f z ( x ) {\displaystyle f_{z}(x)} by squaring the current polynomial and taking remainder modulo f z ( x ) {\displaystyle f_{z}(x)} , Using exponentiation by squaring and polynomials calculated on the previous steps calculate the remainder of x ( p − 1 ) / 2 {\textstyle x^{(p-1)/2}} modulo f z ( x ) {\textstyle f_{z}(x)} , If x ( p − 1 ) / 2 ≢ ± 1 ( mod f z ( x ) ) {\textstyle x^{(p-1)/2}\not \equiv \pm 1{\pmod {f_{z}(x)}}} then gcd {\displaystyle \gcd } mentioned below provide a non-trivial factorization of f z ( x ) {\displaystyle f_{z}(x)} , Otherwise all roots of f z ( x ) {\displaystyle f_{z}(x)} are either residues or non-residues simultaneously and one has to choose another z {\displaystyle z} . If f ( x ) {\displaystyle f(x)} is divisible by some non-linear primitive polynomial g ( x ) {\displaystyle g(x)} over F p {\displaystyle \mathbb {F} _{p}} then when calculating gcd {\displaystyle \gcd } with g 0 ( x ) {\displaystyle g_{0}(x)} and g 1 ( x ) {\displaystyle g_{1}(x)} one will obtain a non-trivial factorization of f z ( x ) / g z ( x ) {\displaystyle f_{z}(x)/g_{z}(x)} , thus algorithm allows to find all roots of arbitrary polynomials over F p {\displaystyle \mathbb {F} _{p}} . === Modular square root === Consider equation x 2 ≡ a ( mod p ) {\textstyle x^{2}\equiv a{\pmod {p}}} having elements β {\displaystyle \beta } and − β {\displaystyle -\beta } as its roots. Solution of this equation is equivalent to factorization of polynomial f ( x ) = x 2 − a = ( x − β ) ( x + β ) {\textstyle f(x)=x^{2}-a=(x-\beta )(x+\beta )} over F p {\displaystyle \mathbb {F} _{p}} . In this particular case problem it is sufficient to calculate only gcd ( f z ( x ) ; g 0 ( x ) ) {\displaystyle \gcd(f_{z}(x);g_{0}(x))} . For this polynomial exactly one of the following properties will hold: GCD is equal to 1 {\displaystyle 1} which means that z + β {\displaystyle z+\beta } and z − β {\displaystyle z-\beta } are both quadratic non-residues, GCD is equal to f z ( x ) {\displaystyle f_{z}(x)} which means that both numbers are quadratic residues, GCD is equal to ( x − t ) {\displaystyle (x-t)} which means that exactly one of these numbers is quadratic residue. In the third case GCD is equal to either ( x − z − β ) {\displaystyle (x-z-\beta )} or ( x − z + β ) {\displaystyle (x-z+\beta )} . It allows to write the solution as β = ( t − z ) ( mod p ) {\textstyle \beta =(t-z){\pmod {p}}} . === Example === Assume we need to solve the equation x 2 ≡ 5 ( mod 11 ) {\textstyle x^{2}\equiv 5{\pmod {11}}} . For this we need to factorize f ( x ) = x 2 − 5 = ( x − β ) ( x + β ) {\displaystyle f(x)=x^{2}-5=(x-\beta )(x+\beta )} . Consider some possible values of z {\displaystyle z} : Let z = 3 {\displaystyle z=3} . Then f z ( x ) = ( x − 3 ) 2 − 5 = x 2 − 6 x + 4 {\displaystyle f_{z}(x)=(x-3)^{2}-5=x^{2}-6x+4} , thus gcd ( x 2 − 6 x + 4 ; x 5 − 1 ) = 1 {\displaystyle \gcd(x^{2}-6x+4;x^{5}-1)=1} . Both numbers 3 ± β {\displaystyle 3\pm \beta } are quadratic non-residues, so we need to take some other z {\displaystyle z} . Let z = 2 {\displaystyle z=2} . Then f z ( x ) = ( x − 2 ) 2 − 5 = x 2 − 4 x − 1 {\displaystyle f_{z}(x)=(x-2)^{2}-5=x^{2}-4x-1} , thus gcd ( x 2 − 4 x − 1 ; x 5 − 1 ) ≡ x − 9 ( mod 11 ) {\textstyle \gcd(x^{2}-4x-1;x^{5}-1)\equiv x-9{\pmod {11}}} . From this follows x − 9 = x − 2 − β {\textstyle x-9=x-2-\beta } , so β ≡ 7 ( mod 11 ) {\displaystyle \beta \equiv 7{\pmod {11}}} and − β ≡ − 7 ≡ 4 ( mod 11 ) {\textstyle -\beta \equiv -7\equiv 4{\pmod {11}}} . A manual check shows that, indeed, 7 2 ≡ 49 ≡ 5 ( mod 11 ) {\textstyle 7^{2}\equiv 49\equiv 5{\pmod {11}}} and 4 2 ≡ 16 ≡ 5 ( mod 11 ) {\textstyle 4^{2}\equiv 16\equiv 5{\pmod {11}}} . == Correctness proof == The algorithm finds factorization of f z ( x ) {\displaystyle f_{z}(x)} in all cases except for ones when all numbers z + λ 1 , z + λ 2 , … , z + λ n {\displaystyle z+\lambda _{1},z+\lambda _{2},\ldots ,z+\lambda _{n}} are quadratic residues or non-residues simultaneously. According to theory of cyclotomy, the probability of such an event for the case when λ 1 , … , λ n {\displaystyle \lambda _{1},\ldots ,\lambda _{n}} are all residues or non-residues simultaneously (that is, when z = 0 {\displaystyle z=0} would fail) may be estimated as 2 − k {\displaystyle 2^{-k}} where k {\displaystyle k} is the number of distinct values in λ 1 , … , λ n {\displaystyle \lambda _{1},\ldots ,\lambda _{n}} . In this way even for the worst case of k = 1 {\displaystyle k=1} and f ( x ) = ( x − λ ) n {\displaystyle f(x)=(x-\lambda )^{n}} , the probability of error may be estimated as 1 / 2 {\displaystyle 1/2} and for modular square root case error probability is at most 1 / 4 {\displaystyle 1/4} . == Complexity == Let a polynomial have degree n {\displaystyle n} . We derive the algorithm's complexity as follows: Due to the binomial theorem ( x − z ) k = ∑ i = 0 k ( k i ) ( − z ) k − i x i {\textstyle (x-z)^{k}=\sum \limits _{i=0}^{k}{\binom {k}{i}}(-z)^{k-i}x^{i}} , we may transition from f ( x ) {\displaystyle f(x)} to f ( x − z ) {\displaystyle f(x-z)} in O ( n 2 ) {\displaystyle O(n^{2})} time. Polynomial multiplication a

    Read more →
  • Education by algorithm

    Education by algorithm

    Education by algorithm refers to automated solutions that algorithmic agents or social bots offer to education, to assist with mundane educational tasks. These are often instrumentalist “educational reforms” or “curriculum transformations”, which have been implemented by policy makers and are supported by proprietary education technologies. New educational policies, mandated by transnational governance forums (like the OECD), have manufactured a connection between economies and education. Governments, schools and universities are expected to introduce or prepare students for an “unknown future”, to “future proof” them against an identified issue or to mitigate a national crisis. Technologies are seen as a catalyst to effect these changes. However, these policies mask a deeper problem, which include the assetization of education and the use of technologies as a means for surveillance and behavior modification. The traces that students and leave, through cookies, logins learning activities, assignments and tests, are collected, facetted, and shared with commercial organizations by these agents, to both predict future behavior and shape it. Techno solutionist thinking has led to managers adopting educational policies and reforms, and looking towards technologies to act as disrupters, liberators or agents to improve efficiency. During the COVID-19 pandemic, many more students had to modify their learning and working circumstances to protect themselves. Academics shifted their assessment practices from the dominant assessment of learning paradigm to an orientation that saw value in "assessment for learning". Big tech assisted, and teaching infrastructure became further privatized, and unbundling of education provision went a step further. Following the return to class, this assessment paradigm became rationalised in education. Leaving the space for algorithmic agents to step in. Academics work was increasingly driven by learning experience platforms and student understanding was extended through interleaving, behavior modification nudges and rewards and scheduled high stakes assessments. This data collection may also be construed as surveillance., or perceived as evidence of a Fourth Industrial Revolution

    Read more →