AI Grammar Summarizer

AI Grammar Summarizer — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Learning automaton

    Learning automaton

    A learning automaton is one type of machine learning algorithm studied since 1970s. Learning automata select their current action based on past experiences from the environment. It will fall into the range of reinforcement learning if the environment is stochastic and a Markov decision process (MDP) is used. == History == Research in learning automata can be traced back to the work of Michael Lvovitch Tsetlin in the early 1960s in the Soviet Union. Together with some colleagues, he published a collection of papers on how to use matrices to describe automata functions. Additionally, Tsetlin worked on reasonable and collective automata behaviour, and on automata games. Learning automata were also investigated by researches in the United States in the 1960s. However, the term learning automaton was not used until Narendra and Thathachar introduced it in a survey paper in 1974. == Definition == A learning automaton is an adaptive decision-making unit situated in a random environment that learns the optimal action through repeated interactions with its environment. The actions are chosen according to a specific probability distribution which is updated based on the environment response the automaton obtains by performing a particular action. With respect to the field of reinforcement learning, learning automata are characterized as policy iterators. In contrast to other reinforcement learners, policy iterators directly manipulate the policy π. Another example for policy iterators are evolutionary algorithms. Formally, Narendra and Thathachar define a stochastic automaton to consist of: a set X of possible inputs, a set Φ = { Φ1, ..., Φs } of possible internal states, a set α = { α1, ..., αr } of possible outputs, or actions, with r ≤ s, an initial state probability vector p(0) = ≪ p1(0), ..., ps(0) ≫, a computable function A which after each time step t generates p(t+1) from p(t), the current input, and the current state, and a function G: Φ → α which generates the output at each time step. In their paper, they investigate only stochastic automata with r = s and G being bijective, allowing them to confuse actions and states. The states of such an automaton correspond to the states of a "discrete-state discrete-parameter Markov process". At each time step t=0,1,2,3,..., the automaton reads an input from its environment, updates p(t) to p(t+1) by A, randomly chooses a successor state according to the probabilities p(t+1) and outputs the corresponding action. The automaton's environment, in turn, reads the action and sends the next input to the automaton. Frequently, the input set X = { 0,1 } is used, with 0 and 1 corresponding to a nonpenalty and a penalty response of the environment, respectively; in this case, the automaton should learn to minimize the number of penalty responses, and the feedback loop of automaton and environment is called a "P-model". More generally, a "Q-model" allows an arbitrary finite input set X, and an "S-model" uses the interval [0,1] of real numbers as X. A visualised demo/ Art Work of a single Learning Automaton had been developed by μSystems (microSystems) Research Group at Newcastle University. == Finite action-set learning automata == Finite action-set learning automata (FALA) are a class of learning automata for which the number of possible actions is finite or, in more mathematical terms, for which the size of the action-set is finite.

    Read more →
  • Factorization of polynomials over finite fields

    Factorization of polynomials over finite fields

    In mathematics and computer algebra the factorization of a polynomial consists of decomposing it into a product of irreducible factors. This decomposition is theoretically possible and is unique for polynomials with coefficients in any field, but rather strong restrictions on the field of the coefficients are needed to allow the computation of the factorization by means of an algorithm. In practice, algorithms have been designed only for polynomials with coefficients in a finite field, in the field of rationals or in a finitely generated field extension of one of them. All factorization algorithms, including the case of multivariate polynomials over the rational numbers, reduce the problem to this case; see polynomial factorization. It is also used for various applications of finite fields, such as coding theory (cyclic redundancy codes and BCH codes), cryptography (public key cryptography by the means of elliptic curves), and computational number theory. As the reduction of the factorization of multivariate polynomials to that of univariate polynomials does not have any specificity in the case of coefficients in a finite field, only polynomials with one variable are considered in this article. == Background == === Finite field === The theory of finite fields, whose origins can be traced back to the works of Gauss and Galois, has played a part in various branches of mathematics. Due to the applicability of the concept in other topics of mathematics and sciences like computer science there has been a resurgence of interest in finite fields and this is partly due to important applications in coding theory and cryptography. Applications of finite fields introduce some of these developments in cryptography, computer algebra and coding theory. A finite field or Galois field is a field with a finite order (number of elements). The order of a finite field is always a prime or a power of prime. For each prime power q = pr, there exists exactly one finite field with q elements, up to isomorphism. This field is denoted GF(q) or Fq. If p is prime, GF(p) is the prime field of order p; it is the field of residue classes modulo p, and its p elements are denoted 0, 1, ..., p−1. Thus a = b in GF(p) means the same as a ≡ b (mod p). === Irreducible polynomials === Let F be a finite field. As for general fields, a non-constant polynomial f in F[x] is said to be irreducible over F if it is not the product of two polynomials of positive degree. A polynomial of positive degree that is not irreducible over F is called reducible over F. Irreducible polynomials allow us to construct the finite fields of non-prime order. In fact, for a prime power q, let Fq be the finite field with q elements, unique up to isomorphism. A polynomial f of degree n greater than one, which is irreducible over Fq, defines a field extension of degree n which is isomorphic to the field with qn elements: the elements of this extension are the polynomials of degree lower than n; addition, subtraction and multiplication by an element of Fq are those of the polynomials; the product of two elements is the remainder of the division by f of their product as polynomials; the inverse of an element may be computed by the extended GCD algorithm (see Arithmetic of algebraic extensions). It follows that, to compute in a finite field of non prime order, one needs to generate an irreducible polynomial. For this, the common method is to take a polynomial at random and test it for irreducibility. For sake of efficiency of the multiplication in the field, it is usual to search for polynomials of the shape xn + ax + b. Irreducible polynomials over finite fields are also useful for pseudorandom number generators using feedback shift registers and discrete logarithm over F2n. The number of irreducible monic polynomials of degree n over Fq is the number of aperiodic necklaces, given by Moreau's necklace-counting function Mq(n). The closely related necklace function Nq(n) counts monic polynomials of degree n which are primary (a power of an irreducible); or alternatively irreducible polynomials of all degrees d which divide n. === Example === The polynomial P = x4 + 1 is irreducible over Q but not over any finite field. On any field extension of F2, P = (x + 1)4. On every other finite field, at least one of −1, 2 and −2 is a square, because the product of two non-squares is a square and so we have If − 1 = a 2 , {\displaystyle -1=a^{2},} then P = ( x 2 + a ) ( x 2 − a ) . {\displaystyle P=(x^{2}+a)(x^{2}-a).} If 2 = b 2 , {\displaystyle 2=b^{2},} then P = ( x 2 + b x + 1 ) ( x 2 − b x + 1 ) . {\displaystyle P=(x^{2}+bx+1)(x^{2}-bx+1).} If − 2 = c 2 , {\displaystyle -2=c^{2},} then P = ( x 2 + c x − 1 ) ( x 2 − c x − 1 ) . {\displaystyle P=(x^{2}+cx-1)(x^{2}-cx-1).} === Complexity === Polynomial factoring algorithms use basic polynomial operations such as products, divisions, gcd, powers of one polynomial modulo another, etc. A multiplication of two polynomials of degree at most n can be done in O(n2) operations in Fq using "classical" arithmetic, or in O(nlog(n) log(log(n)) ) operations in Fq using "fast" arithmetic. A Euclidean division (division with remainder) can be performed within the same time bounds. The cost of a polynomial greatest common divisor between two polynomials of degree at most n can be taken as O(n2) operations in Fq using classical methods, or as O(nlog2(n) log(log(n)) ) operations in Fq using fast methods. For polynomials h, g of degree at most n, the exponentiation hq mod g can be done with O(log(q)) polynomial products, using exponentiation by squaring method, that is O(n2log(q)) operations in Fq using classical methods, or O(nlog(q)log(n) log(log(n))) operations in Fq using fast methods. In the algorithms that follow, the complexities are expressed in terms of number of arithmetic operations in Fq, using classical algorithms for the arithmetic of polynomials. == Factoring algorithms == Many algorithms for factoring polynomials over finite fields include the following three stages: Square-free factorization Distinct-degree factorization Equal-degree factorization An important exception is Berlekamp's algorithm, which combines stages 2 and 3. === Berlekamp's algorithm === Berlekamp's algorithm is historically important as being the first factorization algorithm which works well in practice. However, it contains a loop on the elements of the ground field, which implies that it is practicable only over small finite fields. For a fixed ground field, its time complexity is polynomial, but, for general ground fields, the complexity is exponential in the size of the ground field. === Square-free factorization === The algorithm determines a square-free factorization for polynomials whose coefficients come from the finite field Fq of order q = pm with p a prime. This algorithm firstly determines the derivative and then computes the gcd of the polynomial and its derivative. If it is not one then the gcd is again divided into the original polynomial, provided that the derivative is not zero (a case that exists for non-constant polynomials defined over finite fields). This algorithm uses the fact that, if the derivative of a polynomial is zero, then it is a polynomial in xp, which is, if the coefficients belong to Fp, the pth power of the polynomial obtained by substituting x by x1/p. If the coefficients do not belong to Fp, the pth root of a polynomial with zero derivative is obtained by the same substitution on x, completed by applying the inverse of the Frobenius automorphism to the coefficients. This algorithm works also over a field of characteristic zero, with the only difference that it never enters in the blocks of instructions where pth roots are computed. However, in this case, Yun's algorithm is much more efficient because it computes the greatest common divisors of polynomials of lower degrees. A consequence is that, when factoring a polynomial over the integers, the algorithm which follows is not used: one first computes the square-free factorization over the integers, and to factor the resulting polynomials, one chooses a p such that they remain square-free modulo p. Algorithm: SFF (Square-Free Factorization) Input: A monic polynomial f in Fq[x] where q = pm Output: Square-free factorization of f R ← 1 # Make w be the product (without multiplicity) of all factors of f that have # multiplicity not divisible by p c ← gcd(f, f′) w ← f/c # Step 1: Identify all factors in w i ← 1 while w ≠ 1 do y ← gcd(w, c) fac ← w / y R ← R · faci w ← y; c ← c / y; i ← i + 1 end while # c is now the product (with multiplicity) of the remaining factors of f # Step 2: Identify all remaining factors using recursion # Note that these are the factors of f that have multiplicity divisible by p if c ≠ 1 then c ← c1/p R ← R·SFF(c)p end if Output(R) The idea is to identify the product of all irreducible factors of f with the same multiplicity. This is done in two steps. The first step uses the formal d

    Read more →
  • Corporate surveillance

    Corporate surveillance

    Corporate surveillance describes the practice of businesses monitoring and extracting information from their users, clients, or staff. This information may consist of online browsing history, email correspondence, phone calls, location data, and other private details. Acts of corporate surveillance frequently look to boost results, detect potential security problems, or adjust advertising strategies. These practices have been criticized for violating ethical standards and invading personal privacy. Critics and privacy activists have called for businesses to incorporate rules and transparency surrounding their monitoring methods to ensure they are not misusing their position of authority or breaching regulatory standards. Monitoring can feel intrusive and give the impression that the business does not promote ethical behavior among its personnel. Staff satisfaction, productivity, and staff turnover may all suffer as a result of the invasion of privacy. == Monitoring methods == Employers may be authorized to gather information through keystroke logging and mouse tracking, which involves recording the keys individuals interact with and cursor position on computers. In cases where employment contracts permit it, they may also monitor webcam activity on company-provided computers. Employers may be able to view the emails sent from business accounts and may be able to see the websites visited when using a corporate internet connection. The screenshot capability is another tool that enables companies to see what remote workers are doing. This feature, which can be found in tracking software, takes screenshots throughout the day at predetermined or arbitrary intervals. Additionally, people who don't work in offices are observed. For instance, it has been claimed that Amazon has incorporated tracking technology to monitor warehouse staff and delivery drivers. == Use of collected information == Information collected by corporations can be used for a variety of uses including marketing research, targeting advertising, fraud detection and prevention, ensuring policy adherence, preventing lawsuits, and safeguarding records and company assets. == Privacy concerns == Concerns over corporate privacy have become more important due to companies collection and manipulation of personal data. Since these practices have been recognized there has been a rising concern about both the security and the possible mishandling of the data accumulated. Social Media data collection and monitoring has been one of the most concerned areas regarding corporate surveillance. Recently, many employers on CareerBuilder have checked their potential candidates' social media activities before the hiring process. This approach can be excusable since it is important to be aware of a future employee or applicant's online presence, and how it might affect the company's reputation in the future. This is crucial since employers are often made legally responsible for their worker's digital actions. These data can also be used to enact political gains. The Facebook-Cambridge Analytica data scandal in 2018 revealed that its British branch to have surreptitiously sold American psychological data to the Trump campaign. This information was supposed to be private, but Facebook's inability to protect user information had reportedly not been a top priority of the company at the time. == Laws and regulations == The National Labor and Relations Act (NLRA) safeguards workplace democracy by giving workers in the private sector the basic freedom to demand better working conditions and choice of representation without fear of retaliation. General Data Protection Regulation (GDPR) outlines the broad responsibilities of data controllers and the "processors" that handle personal data on their behalf. They must adopt the necessary security measures in accordance with the risk involved in the data processing operations they carry out.[1] Electronics Communication Privacy Act (ECPA), as amended, provides protection for electronic, oral, and wire communications while they are being created, while they are being sent, and while they are being stored on computers. Email, phone calls, and electronically stored data are covered by the Act. == Sale of customer data == If it is business intelligence, data collected on individuals and groups can be sold to other corporations, so that they can use it for the aforementioned purpose. It can be used for direct marketing purposes, such as targeted advertisements on Google and Yahoo. These ads are tailored to the individual user of the search engine by analyzing their search history and emails (if they use free webmail services). For example, the world's most popular web search engine stores identifying information for each web search. Google stores an IP address and the search phrase used in a database for up to 2 years. Google also scans the content of emails of users of its Gmail webmail service, in order to create targeted advertising based on what people are talking about in their personal email correspondences. Google is, by far, the largest web advertising agency. Their revenue model is based on receiving payments from advertisers for each page-visit resulting from a visitor clicking on a Google AdWords ad, hosted either on a Google service or a third-party website. Millions of sites place Google's advertising banners and links on their websites, in order to share this profit from visitors who click on the ads. Each page containing Google advertisements adds, reads, and modifies cookies on each visitor's computer. These cookies track the user across all of these sites, and gather information about their web surfing habits, keeping track of which sites they visit, and what they do when they are on these sites. This information, along with the information from their email accounts, and search engine histories, is stored by Google to use for building a profile of the user to deliver better-targeted advertising. == Surveillance of workers == In 1993, David Steingard and Dale Fitzgibbons argued that modern management, far from empowering workers, had features of neo-Taylorism, where teamwork perpetuated surveillance and control. They argued that employees had become their own "thought police" and the team gaze was the equivalent of Bentham's panopticon guard tower. A critical evaluation of the Hawthorne Plant experiments has in turn given rise to the notion of a Hawthorne effect, where workers increase their productivity in response to their awareness of being observed or because they are gratified for being chosen to participate in a project. According to the American Management Association and the ePolicy Institute, who undertook a quantitative survey in 2007 about electronic monitoring and surveillance with approximately 300 US companies, "more than one fourth of employers have fired workers for misusing email and nearly one third have fired employees for misusing the Internet." Furthermore, about 30 percent of the companies had also fired employees for usage of "inappropriate or offensive language" and "viewing, downloading, or uploading inappropriate/offensive content." More than 40 percent of the companies monitor email traffic of their workers, and 66 percent of corporations monitor Internet connections. In addition, most companies use software to block websites such as sites with games, social networking, entertainment, shopping, and sports. The American Management Association and the ePolicy Institute also stress that companies track content that is being written about them, for example by monitoring blogs and social media, and scanning all files that are stored in a filesystem. == Government use of corporate surveillance data == The United States government often gains access to corporate databases, either by producing a warrant for it, or by asking. The Department of Homeland Security has openly stated that it uses data collected from consumer credit and direct marketing agencies—such as Google—for augmenting the profiles of individuals that it is monitoring. The US government has gathered information from grocery store discount card programs, which track customers' shopping patterns and store them in databases, in order to look for terrorists by analyzing shoppers' buying patterns. == Corporate surveillance of citizens == According to Dennis Broeders, "Big Brother is joined by big business". He argues that corporations are in any event interested in data on their potential customers and that placing some forms of surveillance in the hands of companies, results in companies owning video surveillance data for stores and public places. The commercial availability of surveillance systems has led to their rapid spread. Therefore it is almost impossible for citizens to maintain their anonymity. When businesses can monitor their customers, such customers run the risk of facing prejudice when applying for housing, loans, jobs, and other economic opportun

    Read more →
  • Hyper-encryption

    Hyper-encryption

    Hyper-encryption is a form of encryption invented by Michael O. Rabin which uses a high-bandwidth source of public random bits, together with a secret key that is shared by only the sender and recipient(s) of the message. It uses the assumptions of Ueli Maurer's bounded-storage model as the basis of its secrecy. Although everyone can see the data, decryption by adversaries without the secret key is still not feasible, because of the space limitations of storing enough data to mount an attack against the system. Unlike almost all other cryptosystems except the one-time pad, hyper-encryption can be proved to be information-theoretically secure, provided the storage bound cannot be surpassed. Moreover, if the necessary public information cannot be stored at the time of transmission, the plaintext can be shown to be impossible to recover, regardless of the computational capacity available to an adversary in the future, even if they have access to the secret key at that future time. A highly energy-efficient implementation of a hyper-encryption chip was demonstrated by Krishna Palem et al. using the Probabilistic CMOS or PCMOS technology and was shown to be ~205 times more efficient in terms of Energy-Performance-Product.

    Read more →
  • Shell Control Box

    Shell Control Box

    Shell Control Box (SCB) is a network security appliance that controls privileged access to remote IT systems, records activities in replayable audit trails, and prevents malicious actions. For example, it records as a system administrator updates a file server or a third-party network operator configures a router. The recorded audit trails can be replayed like a movie to review the events as they occurred. The content of the audit trails is indexed to make searching for events and automatic reporting possible. SCB is a Linux-based device developed by Balabit. It is an application level proxy gateway. In 2017, Balabit changed the name of the product to Privileged Session Management (PSM) and repositioned it as the core module of its Privileged Access Management solution. == Main Features == Balabit’s Privileged Session Management (PSM), Shell Control Box (SCB) is a device that controls, monitors, and audits remote administrative access to servers and network devices. It is a tool to oversee system administrators by controlling the encrypted connections used for administration. PSM (SCB) has full control over the SSH, RDP, Telnet, TN3270, TN5250, Citrix ICA, and VNC connections, providing a framework (with solid boundaries) for the work of the administrators. === Gateway Authentication === PSM (SCB) acts as an authentication gateway, enforcing strong authentication before users access IT assets. PSM can also integrate to user directories (for example, a Microsoft Active Directory) to resolve the group memberships of the users who access the protected servers. Credentials for accessing the server are retrieved transparently from PSM’s credential store or a third-party password management system by PSM impersonating the authenticated user. This automatic password retrieval protects the confidentiality of passwords as users can never access them. === Access Control === PSM controls and audits privileged access over the most wide-spread protocols such as SSH, RDP, or HTTP(s). The detailed access management helps to control who can access what and when on servers. It is also possible to control advanced features of the protocols, like the type of channels permitted. For example, unneeded channels like file transfer or file sharing can be disabled, reducing the security risk on the server. With PSM policies for privileged access can be enforced in one single system. === 4-eyes Authorization === To avoid accidental misconfiguration and other human errors, PSM supports the 4-eyes authorization principle. This is achieved by requiring an authorizer to allow administrators to access the server. The authorizer also has the possibility to monitor – and terminate - the session of the administrator in real-time, as if they were watching the same screen. === Real-time Monitoring and Session Termination === PSM can monitor the network traffic in real time, and execute various actions if a certain pattern (for example, a suspicious command, window title or text) appears on the screen. PSM can also detect specific patterns such as credit card numbers. In case of detecting a suspicious user action, PSM can send an e-mail alert or immediately terminate the connection. For example, PSM can block the connection before a destructive administrator command, such as the „rm” comes into effect. === Session Recording === PSM makes user activities traceable by recording them in tamper-proof and confidential audit trails. It records the selected sessions into encrypted, timestamped, and digitally signed audit trails. Audit trails can be browsed online, or followed real-time to monitor the activities of the users. PSM replays the recorded sessions just like a movie – actions of the users can be seen exactly as they appeared on their monitor. The Balabit Desktop Player enables fast forwarding during replays, searching for events (for example, typed commands or pressing Enter) and texts seen by the user. In the case of any problems (database manipulation, unexpected shutdown, etc.) the circumstances of the event are readily available in the trails, thus the cause of the incident can be identified. In addition to recording audit trails, transferred files can be also recorded and extracted for further analysis.

    Read more →
  • Social media use by businesses

    Social media use by businesses

    Social media use by businesses includes a range of applications. Although social media accessed via desktop computers offer an online shopping variety of opportunities for companies in a wide range of business sectors, mobile social media, which users can access when they are "on the go" via tablet computers or smartphones, benefit companies because of the location- and time-sensitive awareness of their users. Mobile social media tools can be used for marketing research, communication, sales promotions/discounts, informal employee learning/organizational development, relationship development/loyalty programs, and e-commerce. Marketing research: Mobile social media applications provide companies data about offline consumer movements at a level of detail that was previously accessible to online companies only. These applications allow any business to know the exact time a customer who uses social media entered one of its locations, as well as know the social media comments made during the visit. Communication: Mobile social media communication takes two forms: company-to-consumer (in which a company may establish a connection to a consumer based on its location and provide reviews about locations nearby) and user-generated content. For example, McDonald's offered $5 and $10 gift-cards to 100 users randomly selected among those checking in at one of its restaurants. This promotion increased check-ins by 33% (from 2,146 to 2,865), resulted in over 50 articles and blog posts, and prompted several hundred thousand news feeds and Twitter messages. Sales promotions and discounts: Although customers have had to use printed coupons in the past, mobile social media allows companies to tailor promotions to specific users at specific times. For example, when launching its California-Cancun service, Virgin America offered users who checked in through Loopt at one of three designated taco trucks in San Francisco or Los Angeles between 11 a.m. and 3 p.m. on 31 August 2010, two tacos for $1 and two flights to Cancun or Cabo for the price of one. This special promotion was only available to people who were at a certain location at a certain time. Relationship development and loyalty programs: In order to increase long-term relationships with customers, companies can develop loyalty programs that allow customers who check-in via social media regularly at a location to earn discounts or perks. For example, American Eagle Outfitters remunerates such customers with a tiered 10%, 15%, or 20% discount on their total purchase. Informal employee learning/organizational development is facilitated by social media. Technologies such as blogs, wiki pages, web forums, social networks and other social media act as technology enhanced learning (TEL) tools, and their users perceive change in organizational structure, culture and knowledge management. The prerequisite for the successful use of social media are motivated employees who want to use the new technologies. It is central for companies to understand the factors that determine the willingness to use social media. Customer service and support: A company can gain cost savings and increase revenue and customer satisfaction by using social media platforms in customer service and support. By using social media tools, company's have easy and widescale contact to its customers and simultaneously increase their brand knowledge. E-commerce: Social media sites are increasingly implementing marketing-friendly strategies, creating platforms that are mutually beneficial for users, businesses, and the networks themselves in the popularity and accessibility of e-commerce, or online purchases. The user who posts their comments about a company's product or service benefits because they are able to share their views with their online friends and acquaintances. The company benefits because it obtains insight (positive or negative) about how their product or service is viewed by consumers. Mobile social media applications such as Amazon.com and Pinterest have started to influence an upward trend in the popularity and accessibility of e-commerce. E-commerce businesses may refer to social media as consumer-generated media (CGM). A common thread running through all definitions of social media is a blending of technology and social interaction for the co-creation of value for the business or organization that is using it. People obtain valuable information, education, news, and other data from electronic and print media. Social media are distinct from industrial and traditional media such as newspapers, magazines, television, and film as they are comparatively inexpensive marketing tools and are highly accessible. They enable anyone, including private individuals, to publish or access information easily. Industrial media generally require significant resources to publish information, and in most cases the articles go through many revisions before being published. This process adds to the cost and the resulting market price. Originally social media was only used by individuals, but now it is used by both businesses and nonprofit organizations and also in government and politics. One characteristic shared by both social and industrial media is the capability to reach small or large audiences; for example, either a blog post or a television show may reach no people or millions of people. Some of the properties that help describe the differences between social and industrial media are: Quality: In industrial (traditional) publishing—mediated by a publisher—the typical range of quality is substantially narrower (skewing to the high quality side) than in niche, unmediated markets like user-generated social media posts. The main challenge posed by the content in social media sites is the fact that the distribution of quality has high variance: from very high-quality items to low-quality, sometimes even abusive or inappropriate content. Reach: Both industrial and social media technologies provide scale and are capable of reaching a global audience. Industrial media, however, typically use a centralized framework for organization, production, and dissemination, whereas social media are by their very nature more decentralized, less hierarchical, and distinguished by multiple points of production and utility. Frequency: The number of times users access a type of media per day. Heavy social media users, such as young people, check their social media account numerous times throughout the day. Accessibility: The means of production for industrial media are typically government or corporate (privately owned); social media tools are generally available to the public at little or no cost, or they are supported by advertising revenue. While social media tools are available to anyone with access to Internet and a computer or mobile device, due to the digital divide, the poorest segment of the population lacks access to the Internet and computer. Low-income people may have more access to traditional media (TV, radio, etc.), as an inexpensive TV and aerial or radio costs much less than an inexpensive computer or mobile device. Moreover, in many regions, TV or radio owners can tune into free over the air programming; computer or mobile device owners need Internet access to go to social media sites. Usability: Industrial media production typically requires specialized skills and training. For example, in the 1970s, to record a pop song, an aspiring singer would have to rent time in an expensive professional recording studio and hire an audio engineer. Conversely, most social media activities, such as posting a video of oneself singing a song require only modest reinterpretation of existing skills (assuming a person understands Web 2.0 technologies); in theory, anyone with access to the Internet can operate the means of social media production, and post digital pictures, videos or text online. Immediacy: The time lag between communications produced by industrial media can be long (days, weeks, or even months, by the time the content has been reviewed by various editors and fact checkers) compared to social media (which can be capable of virtually instantaneous responses). The immediacy of social media can be seen as a strength, in that it enables regular people to instantly communicate their opinions and information. At the same time, the immediacy of social media can also be seen as a weakness, as the lack of fact checking and editorial "gatekeepers" facilitates the circulation of hoaxes and fake news. Permanence: Industrial media, once created, cannot be altered (e.g., once a magazine article or paper book is printed and distributed, changes cannot be made to that same article in that print run) whereas social media posts can be altered almost instantaneously, when the user decides to edit their post or due to comments from other readers. Community media constitute a hybrid of industrial and social media. Though community-owned, some community radio,

    Read more →
  • OARnet

    OARnet

    The Ohio Academic Resources Network (OARnet) is a state-funded IT organization that provides member organizations with intrastate networking, virtualization and cloud computing applications, advanced videoconferencing, connections to regional and international research networks and the commodity Internet, colocation services, and emergency web-hosting. The OARnet network (known for a time as Third Frontier Network and later, OSCnet) is a dedicated, statewide, high-speed fiber-optic network that serves Ohio K-12 schools, college and university campuses, academic medical centers, public broadcasting stations and state and local/state government. OARnet is connected in Cleveland and Cincinnati to Internet2, the United States' most advanced nationwide research and education network. OARnet also maintains direct connections to Michigan's Merit network and OmniPoP in Chicago. OARnet offices are located on the West Campus of Ohio State University in Columbus, Ohio, United States. OARnet additionally serves as the delegated registrar for many third-level domains (both generic and locality-based) under .oh.us and some under .in.us and .ky.us. == History == A member-organization of the Ohio Technology Consortium, the technology and information division of the Ohio Board of Regents (now the Ohio Department of Higher Education), OARnet was created by the Ohio General Assembly in 1987 to provide Ohio researchers with network connectivity to the resources of the Ohio Supercomputer Center (OSC). It was recognized at the time that the network would serve a much broader audience, so when a network name was selected in early 1988, OARnet was chosen to emphasize the many uses of the network. The initial plan (1987) was to make use of a number of existing BITNET and CCnet (regional DECnet network) connections to get started. Three network (compatible) protocols were used, NJE, DECnet, and TCP/IP. The first OARnet-funded line was installed between Case Western Reserve University and John Carroll University in June 1987. Many subsequent lines at 9.6 kbit/s, 56 kbit/s, and T1 (1.544 Mbit/s) were installed with the aid of an Ohio Department of Administrative Services contract with Litel Corp. Internet (then NSFNET) connections were obtained in the spring of 1988. The non-TCP/IP protocols were soon phased out, and a process of upgrading connections took place regularly. In 1991, it was decided that OARnet would accept commercial business, at appropriate rates, for Internet connection services. Thus OARnet became one of the first Internet service providers (ISPs) in Ohio. After commercial ISPs entered the business extensively, OARnet stopped seeking new commercial accounts. A very large increase in backbone capacity occurred (planning 2000–02, installation 2003–04) when it became possible to lease optical fiber lines themselves ("dark fiber"). A new network backbone of 1,850 miles was installed at much higher capacity, and the eTech Ohio Commission and the Ohio Department of Education joined in funding and using OARnet. The fiber-optic backbone was launched in November 2004. In 2006, OARnet provided one of the first networks for delivery of live TV via Internet Protocol, known today as IPTV. OARnet served as the backbone for Ohio News Network to transmit Miami Redhawks hockey. The team finished the 2008-2009 season at the Frozen Four with a 4-3 OT loss to Boston University in the championship. It was one of the first live sports transmission deliveries over IPTV in the US. Another sharp jump in capacity occurred in 2012, when the State of Ohio funded an upgrade of the OARnet backbone to 100 Gigabits per second. Today, more than 1,500 miles of Ohio’s network backbone runs at an ultra-fast 100 Gbit/s, which was recognized by ComputerWorld in the Emerging Technology category of their 2013 Computerworld Honors Laureates program. In November 2012, Case Western Reserve University became the first member institution to connect at 100 Gbit/s to the OARnet backbone. The OARnet leaders have been: Russell M. Pitzer, director, 1987–88 Alison Brown, director, 1988–94 John Ritter, acting director, 1995 Larry Buell, acting director, 1996–97 Douglas Gale, director, 1998–2002 Alvin Stutz, director, 2002–05 Pankaj Shah, executive director, 2005–15 Paul Schopis, interim executive director, 2015–2018, executive director 2018–19 Denis Walsh, interim executive director, 2019–20 Pankaj Shah, executive director, 2020–

    Read more →
  • Bitmap index

    Bitmap index

    A bitmap index is a special kind of database index that uses bitmaps. Bitmap indexes have traditionally been considered to work well for low-cardinality columns, which have a modest number of distinct values, either absolutely, or relative to the number of records that contain the data. The extreme case of low cardinality is Boolean data (e.g., does a resident in a city have internet access?), which has two values, True and False. Bitmap indexes use bit arrays (commonly called bitmaps) and answer queries by performing bitwise logical operations on these bitmaps. Bitmap indexes have a significant space and performance advantage over other structures for query of such data. Their drawback is they are less efficient than the traditional B-tree indexes for columns whose data is frequently updated: consequently, they are more often employed in read-only systems that are specialized for fast query - e.g., data warehouses, and generally unsuitable for online transaction processing applications. Some researchers argue that bitmap indexes are also useful for moderate or even high-cardinality data (e.g., unique-valued data) which is accessed in a read-only manner, and queries access multiple bitmap-indexed columns using the AND, OR or XOR operators extensively. Bitmap indexes are also useful in data warehousing applications for joining a large fact table to smaller dimension tables such as those arranged in a star schema. == Example == Continuing the internet access example, a bitmap index may be logically viewed as follows: On the left, Identifier refers to the unique number assigned to each resident, HasInternet is the data to be indexed, the content of the bitmap index is shown as two columns under the heading bitmaps. Each column in the left illustration under the Bitmaps header is a bitmap in the bitmap index. In this case, there are two such bitmaps, one for "has internet" Yes and one for "has internet" No. It is easy to see that each bit in bitmap Y shows whether a particular row refers to a person who has internet access. This is the simplest form of bitmap index. Most columns will have more distinct values. For example, the sales amount is likely to have a much larger number of distinct values. Variations on the bitmap index can effectively index this data as well. We briefly review three such variations. Note: Many of the references cited here are reviewed at (John Wu (2007)). For those who might be interested in experimenting with some of the ideas mentioned here, many of them are implemented in open source software such as FastBit, the Lemur Bitmap Index C++ Library, the Roaring Bitmap Java library and the Apache Hive Data Warehouse system. == Compression == For historical reasons, bitmap compression and inverted list compression were developed as separate lines of research, and only later were recognized as solving essentially the same problem. Software can compress each bitmap in a bitmap index to save space. There has been considerable amount of work on this subject. Though there are exceptions such as Roaring bitmaps, Bitmap compression algorithms typically employ run-length encoding, such as the Byte-aligned Bitmap Code, the Word-Aligned Hybrid code, the Partitioned Word-Aligned Hybrid (PWAH) compression, the Position List Word Aligned Hybrid, the Compressed Adaptive Index (COMPAX), Enhanced Word-Aligned Hybrid (EWAH) and the COmpressed 'N' Composable Integer SEt (CONCISE). These compression methods require very little effort to compress and decompress. More importantly, bitmaps compressed with BBC, WAH, COMPAX, PLWAH, EWAH and CONCISE can directly participate in bitwise operations without decompression. This gives them considerable advantages over generic compression techniques such as LZ77. BBC compression and its derivatives are used in a commercial database management system. BBC is effective in both reducing index sizes and maintaining query performance. BBC encodes the bitmaps in bytes, while WAH encodes in words, better matching current CPUs. "On both synthetic data and real application data, the new word aligned schemes use only 50% more space, but perform logical operations on compressed data 12 times faster than BBC." PLWAH bitmaps were reported to take 50% of the storage space consumed by WAH bitmaps and offer up to 20% faster performance on logical operations. Similar considerations can be done for CONCISE and Enhanced Word-Aligned Hybrid. The performance of schemes such as BBC, WAH, PLWAH, EWAH, COMPAX and CONCISE is dependent on the order of the rows. A simple lexicographical sort can divide the index size by 9 and make indexes several times faster. The larger the table, the more important it is to sort the rows. Reshuffling techniques have also been proposed to achieve the same results of sorting when indexing streaming data. == Encoding == Basic bitmap indexes use one bitmap for each distinct value. It is possible to reduce the number of bitmaps used by using a different encoding method. For example, it is possible to encode C distinct values using log(C) bitmaps with binary encoding. This reduces the number of bitmaps, further saving space, but to answer any query, most of the bitmaps have to be accessed. This makes it potentially not as effective as scanning a vertical projection of the base data, also known as a materialized view or projection index. Finding the optimal encoding method that balances (arbitrary) query performance, index size and index maintenance remains a challenge. Without considering compression, Chan and Ioannidis analyzed a class of multi-component encoding methods and came to the conclusion that two-component encoding sits at the kink of the performance vs. index size curve and therefore represents the best trade-off between index size and query performance. == Binning == For high-cardinality columns, it is useful to bin the values, where each bin covers multiple values and build the bitmaps to represent the values in each bin. This approach reduces the number of bitmaps used regardless of encoding method. However, binned indexes can only answer some queries without examining the base data. For example, if a bin covers the range from 0.1 to 0.2, then when the user asks for all values less than 0.15, all rows that fall in the bin are possible hits and have to be checked to verify whether they are actually less than 0.15. The process of checking the base data is known as the candidate check. In most cases, the time used by the candidate check is significantly longer than the time needed to work with the bitmap index. Therefore, binned indexes exhibit irregular performance. They can be very fast for some queries, but much slower if the query does not exactly match a bin. == History == The concept of bitmap index was first introduced by Professor Israel Spiegler and Rafi Maayan in their research "Storage and Retrieval Considerations of Binary Data Bases", published in 1985. The first commercial database product to implement a bitmap index was Computer Corporation of America's Model 204. Patrick O'Neil published a paper about this implementation in 1987. This implementation is a hybrid between the basic bitmap index (without compression) and the list of Row Identifiers (RID-list). Overall, the index is organized as a B+tree. When the column cardinality is low, each leaf node of the B-tree would contain long list of RIDs. In this case, it requires less space to represent the RID-lists as bitmaps. Since each bitmap represents one distinct value, this is the basic bitmap index. As the column cardinality increases, each bitmap becomes sparse and it may take more disk space to store the bitmaps than to store the same content as RID-lists. In this case, it switches to use the RID-lists, which makes it a B+tree index. == In-memory bitmaps == One of the strongest reasons for using bitmap indexes is that the intermediate results produced from them are also bitmaps and can be efficiently reused in further operations to answer more complex queries. Many programming languages support this as a bit array data structure. For example, Java has the BitSet class and .NET have the BitArray class. Some database systems that do not offer persistent bitmap indexes use bitmaps internally to speed up query processing. For example, PostgreSQL versions 8.1 and later implement a "bitmap index scan" optimization to speed up arbitrarily complex logical operations between available indexes on a single table. For tables with many columns, the total number of distinct indexes to satisfy all possible queries (with equality filtering conditions on either of the fields) grows very fast, being defined by this formula: C n [ n 2 ] ≡ n ! ( n − [ n 2 ] ) ! [ n 2 ] ! {\displaystyle \mathbf {C} _{n}^{\left[{\frac {n}{2}}\right]}\equiv {\frac {n!}{\left(n-\left[{\frac {n}{2}}\right]\right)!\left[{\frac {n}{2}}\right]!}}} . A bitmap index scan combines expressions on different indexes, thus requiring only one index per column t

    Read more →
  • Altibase

    Altibase

    Altibase is a hybrid database, relational database management system manufactured by the Altibase Corporation. The software's hybrid architecture allows it to access both memory-resident and disk-resident tables using single interface. It supports both synchronous and asynchronous replication and offers real-time ACID compliance. Support is also offered for a variety of SQL standards and programming languages. Other important capabilities include data import and export, data encryption for security, multiple data access command sets, materialized view and temporary tables, and others. == History == From 1991 through 1997 the Mr. RT project was an in-memory database research project, conducted by the Electronics and Telecommunications Research Institute a government-funded research organization in South Korea. Altibase was incorporated in 1999. Altibase acquired an in-memory database engine from the Electronics and Telecommunications Research Institute in February 2000, and commercialized the database in October of the same year. In 2001, Altibase changed the name of the in-memory database product from "Spiner" to "Altibase" in 2001. In 2004, Altibase integrated the in-memory database with a disk-resident database to create a hybrid DBMS, released version 4.0 and renamed it as ALTIBASE HDB. Altibase released version 5.5.1 and 6.1.1 in 2012, version 6.3.1 in November 2013, and 6.5.1 in May 2015. Altibase claims that this is the world's first hybrid DBMS. Altibase released its open source edition version 7.1, however, closed the source in 2023. In August 2023, Altibase released its cloud-optimized version 7.3. === Awards === In 2006, Received the Presidential Award at the Korea Software Awards In 2007, Selected as World-Class Product by the Ministry of Commerce, Industry and Energy In 2009, Awarded the Outstanding Product Award in China's Telecommunications Industry In 2009, Received Outstanding Product Award at the China Billing China 2009 Telecommunication Industry Awards In 2010, Commendation from the Minister of Knowledge Economy for Technological Practicalization In 2011, Received the Grand Prize at the 10th Software Enterprise Competitiveness Award In 2011, Selected as Top 10 Emerging Technologies and received Special Award at the Korea Technology Grand Prize In 2012, Awarded for Contributions to Military Manpower Administration In 2014~2016, Included in Gartner Magic Quadrant for Operational DBMS In 2015, Selected as Outstanding BSS by China Fujian Mobile. In 2023, Awarded as the Excellent Research and Development Institution by the Korean Ministry Science and ICT In 2023, Won the Global Premium Commercial Software Presidential Award at the 9th Global Commercial Software Grand Exhibition in Korea === Release === The first version, called Spiner, was released in 2000 for commercial use. It took half of the in-memory DBMS market share in South Korea. In 2002 the second version was released renamed to Altibase v2.0. By 2003, Altibase v3.0 was released and it entered the Chinese market. Released version 4.0 with hybrid architecture, combining RAM and disk databases, was released in 2004. In 2005 Altibase began working with Chinese telecommunications providers for billing systems, and some financial companies in Taiwan, China, for home trading systems. The software was certified by the Telecommunications Technology Association. The Ministry of Government Administration and Home Affairs gave it an award in 2006. Offices in China and United States opened in 2009. In 2011, version 5.5.1 was renamed it to HDB (for "hybrid database"). The Altibase Data Stream product for complex event processing was renamed DSM. The product received a Korean technology award. Altibase introduced certification services. In 2012, HDB Zeta and Extreme were announced, and DSM renamed to CEP. In 2013, yet another variant called XDB was announced, and the company received ISO/IEC 20000 certification. In 2018, Altibase went open source. Altibase went open source in February, 2018. Altibase Corp has made the decision to discontinue the Altibase 7.1 open source edition, effective March 17, 2023. As a result, the open-source edition of Altibase 7.1 will no longer be available for download or use. Altibase released version 7.3 in September, 2023, its notable feature is the world’s first hybrid partition, allowing data to be stored in both memory and on disk at the partition level. Version 7.3 also added parallel processing capabilities for high-speed performance in both partitioned and non-partitioned scenarios. Improving potential bottlenecks associated with Commit and logging that impact transaction performance, version 7.3 has achieved an approximately 490% enhancement in performance compared to previous versions. === Release history === == Clients == According to marketing research, Altibase have over 700 customers and more than 8,000 of installations and deployments, including 22 Fortune Global 500 Companies. Altibase's clients in the telecommunications, financial services, manufacturing, and utilities sectors include Bloomberg, AT&T, LG, Intel, LGU+, ETRADE, HP, UAT Inc., POSCO, SK Telecom, KT Corporation, Samsung Electronics, Shinhan Bank, Woori Bank, Canon(Toshiba), Hanhwa, The South Korean Ministry of Defense, G-Market, CJ, and Chung-Ang University. === Global clients === Japan FX Prime, a foreign exchange services company Retela Crea Securities United States AT&T Implemented Altibase for its PS-LTE Safety network, where the Presence service plays a vital role. This service handles the reception and storage of user information, conducting real-time checks for online presence and location as needed. Canada Telus One of the major telecommunication companies. Utilizes Altibase for its operations involving real-time user management, processing high volumes of dedicated terminal data, and managing real-time location information (GIS) for terminals. Altibase contributes to the company's in-house solution for maintaining uninterrupted services during national disasters or similar situations, ensuring efficiency and reliability. China China Mobile, China Unicom, China Telecom The three major telecommunications companies. Utilize ALTIBASE HDB in 29 of 31 Chinese provinces. Turkish Ziraat Bank, Halk Bank, Deniz Bank, Garanti BBVA, TEB, Oyak Bank, QNB, Burgan Bank, and others. In 2018, Altibase entered the market through a partnership with ATP-Tradesoft, a subsidiary of Ata Holdings. Collaborating with ATP-Tradesoft. Altibase integrated into the Online Trading System XFront. This integration was well-received by major financial institutions and securities firms in Turkey. Altibase is currently implemented in the XFront Online Trading System, used by 13 significant financial institutions and banks in the Turkey. Thailand Bualuang Securities Altibase has been supplied its DBMS to support the construction of the online stock trading platform. Mongolia MobiCom The Mongolian telecommunication giant, has adopted Altibase’s 7.0 version for its mobile platform for storing the infrequently used data. Azerbaijan M1 highway Altibase has been supplied as the Database Management System (DBMS) for the electronic toll collection system. One of the most crucial transportation networks in the country. India State-owned Karur Vysya Bank In 2013, Altibase provided its hybrid database solution and was deployed for the online banking system === Industries === Telecommunications LGU+ SK Telecom KT Corporation AT&T Telus Financial services Shinhan Bank Woori Bank KakaoPay Securities Implemented Altibase in its stock trading system Leveraging Altibase's replication feature, along with offline replication through shared disk and adapter functionality, the system ensures a high level of availability and consistency, with a reliability rate of 99.999% even in the event of system failures. COREDAX Cryptocurrency market Altibase has entered into a strategic partnership by signing a database management system (DBMS) supply contract with the cryptocurrency exchange Bloomberg ETRADE Manufacturing Samsung Electronics LG POSCO Hanhwa Canon(Toshiba) Intel HP Utilities South Korean Ministry of Defense G-Market CJ UAT Inc. Chung-Ang University == Features == Altibase is a so-called "hybrid DBMS", meaning that it simultaneously supports access to both memory-resident and disk-resident tables via a single interface. It is compatible with Solaris, HP-UX, AIX, Linux, and Windows. It supports the complete SQL standard, features Multiversion concurrency control (MVCC), implements Fuzzy and Ping-Pong Checkpointing for periodically backing up memory-resident data, and ships with Replication and Database Link functionality. High performance, large -capacity service Fast real-time data processing and large amounts of data stable Provide parallel processing architecture for large data management Developed and provided Hybrid Partitioned Table function for efficiency according to data personality High stability

    Read more →
  • HashClash

    HashClash

    HashClash was a volunteer computing project running on the Berkeley Open Infrastructure for Network Computing (BOINC) software platform to find collisions in the MD5 hash algorithm. It was based at Department of Mathematics and Computer Science at the Eindhoven University of Technology, and Marc Stevens initiated the project as part of his master's degree thesis. The project ended after Stevens defended his M.Sc. thesis in June 2007. However, SHA1 was added later, and the code repository was ported to git in 2017. The project was used to create a rogue certificate authority certificate in 2009.

    Read more →
  • Cloud Data Management Interface

    Cloud Data Management Interface

    ISO/IEC 17826 Information technology — Cloud Data Management Interface (CDMI) Version 2.0.0 is an international standard that specifies a protocol for self-provisioning, administering and managing access to data stored in cloud storage, object storage, storage area network and network attached storage systems. The CDMI standard is developed and maintained by the Storage Networking Industry Association, who makes a publicly accessible version of the specification available. CDMI defines new resource representations to enable standardized management of any URI-accessible data, and defines RESTful HTTP operations using these representations to discover the capabilities of the storage system, discover stored data, access and update management metadata, specify data storage protocols (such as iSCSI and NFS) through which the stored data is accessed, and provide cross-system and cross-cloud import and export in order to enable data portability. Management functions enabled by CDMI include managing data ownership, identity mapping, access controls, user-specified metadata, and to declaratively specify desired data protection, data retention, constraints on geographic placement, desired quality of service, data versioning and security requirements. CDMI also defines utility services to facilitate data management, such the ability to query data matching specific criteria, and includes extensions to perform bulk updates using CDMI Jobs. == Capabilities == Compliant implementations must provide access to a set of configuration parameters known as capabilities. These are either boolean values that represent whether or not a system supports things such as queues, export via other protocols, path-based storage and so on, or numeric values expressing system limits, such as how much metadata may be placed on an object. As a minimal compliant implementation can be quite small, with few features, clients need to check the cloud storage system for a capability before attempting to use the functionality it represents. Resource allocation assignments limited to the data management interface protocols must possess access bypass capabilities which extend beyond the layered framework. This integral function is vital to the prevention of transport layer session hijacking by unauthorized entities which may circumvent standard interfacing security parameters. == Containers == A CDMI client may access objects, including containers, by either name or object id (OID), assuming the CDMI server supports both methods. When storing objects by name, it is natural to use nested named containers; the resulting structure corresponds exactly to a traditional filesystem directory structure. == Objects == Objects are similar to files in a traditional file system, but are enhanced with an increased amount and capacity for metadata. As with containers, they may be accessed by either name or OID. When accessed by name, clients use URLs that contain the full pathname of objects to create, read, update and delete them. When accessed by OID, the URL specifies an OID string in the cdmi-objectid container; this container presents a flat name space conformant with standard object storage system semantics. Subject to system limits, objects may be of any size or type and have arbitrary user-supplied metadata attached to them. Systems that support query allow arbitrary queries to be run against the metadata. == Domains, Users and Groups == CDMI supports the concept of a domain, similar in concept to a domain in the Windows Active Directory model. Users and groups created in a domain share a common administrative database and are known to each other on a "first name" basis, i.e. without reference to any other domain or system. Domains also function as containers for usage and billing summary data. == Access Control == CDMI exactly follows the ACL and ACE model used for file authorization operations by NFSv4. This makes it also compatible with Microsoft Windows systems. == Metadata == CDMI draws much of its metadata model from the XAM specification. Objects and containers have "storage system metadata", "data system metadata" and arbitrary user specified metadata, in addition to the metadata maintained by an ordinary filesystem (atime etc.). == Queries == CDMI specifies a way for systems to support arbitrary queries against CDMI containers, with a rich set of comparison operators, including support for regular expressions. == Queues == CDMI supports the concept of persistent FIFO (first-in, first-out) queues. These are useful for job scheduling, order processing and other tasks in which lists of things must be processed in order. == Compliance == Both retention intervals and retention holds are supported by CDMI. A retention interval consists of a start time and a retention period. During this time interval, objects are preserved as immutable and may not be deleted. A retention hold is usually placed on an object because of judicial action and has the same effect: objects may not be changed nor deleted until all holds placed on them are removed. == Billing == Summary information suitable for billing clients for on-demand services can be obtained by authorized users from systems that support it. == Serialization == Serialization of objects and containers allows export of all data and metadata on a system and importation of that data into another cloud system. == Foreign protocols == CDMI supports export of containers as NFS or CIFS shares. Clients that mount these shares see the container hierarchy as an ordinary filesystem directory hierarchy, and the objects in the containers as normal files. Metadata outside of ordinary filesystem metadata may or may not be exposed. Provisioning of iSCSI LUNs is also supported. == Client SDKs == CDMI Reference Implementation Droplet libcdmi-java libcdmi-python .NET SDK

    Read more →
  • Correlation immunity

    Correlation immunity

    In mathematics, the correlation immunity of a Boolean function is a measure of the degree to which its outputs are uncorrelated with some subset of its inputs. Specifically, a Boolean function is said to be correlation-immune of order m if every subset of m or fewer variables in x 1 , x 2 , … , x n {\displaystyle x_{1},x_{2},\ldots ,x_{n}} is statistically independent of the value of f ( x 1 , x 2 , … , x n ) {\displaystyle f(x_{1},x_{2},\ldots ,x_{n})} . == Definition == A function f : F 2 n → F 2 {\displaystyle f:\mathbb {F} _{2}^{n}\rightarrow \mathbb {F} _{2}} is k {\displaystyle k} -th order correlation immune if for any independent n {\displaystyle n} binary random variables X 0 … X n − 1 {\displaystyle X_{0}\ldots X_{n-1}} , the random variable Z = f ( X 0 , … , X n − 1 ) {\displaystyle Z=f(X_{0},\ldots ,X_{n-1})} is independent from any random vector ( X i 1 … X i k ) {\displaystyle (X_{i_{1}}\ldots X_{i_{k}})} with 0 ≤ i 1 < … < i k < n {\displaystyle 0\leq i_{1}<\ldots Read more →

  • You Only Look Once

    You Only Look Once

    You Only Look Once (YOLO) is a series of real-time object detection systems based on convolutional neural networks. First introduced by Joseph Redmon et al. in 2015, YOLO has undergone several iterations and improvements, becoming one of the most popular object detection frameworks. The name "You Only Look Once" refers to the fact that the algorithm requires only one forward propagation pass through the neural network to make predictions, unlike previous region proposal-based techniques like R-CNN that require thousands for a single image. == Overview == Compared to previous methods like R-CNN and OverFeat, instead of applying the model to an image at multiple locations and scales, YOLO applies a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities. === OverFeat === OverFeat was an early influential model for simultaneous object classification and localization. Its architecture is as follows: Train a neural network for image classification only ("classification-trained network"). This could be one like the AlexNet. The last layer of the trained network is removed, and for every possible object class, initialize a network module at the last layer ("regression network"). The base network has its parameters frozen. The regression network is trained to predict the ( x , y ) {\displaystyle (x,y)} coordinates of two corners of the object's bounding box. During inference time, the classification-trained network is run over the same image over many different zoom levels and croppings. For each, it outputs a class label and a probability for that class label. Each output is then processed by the regression network of the corresponding class. This results in thousands of bounding boxes with class labels and probability. These boxes are merged until only one single box with a single class label remains. == Versions == There are two parts to the YOLO series. The original part contained YOLOv1, v2, and v3, all released on a website maintained by Joseph Redmon. === YOLOv1 === The original YOLO algorithm, introduced in 2015, divides the image into an S × S {\displaystyle S\times S} grid of cells. If the center of an object's bounding box falls into a grid cell, that cell is said to "contain" that object. Each grid cell predicts B bounding boxes and confidence scores for those boxes. These confidence scores reflect how confident the model is that the box contains an object and how accurate it thinks the box is that it predicts. In more detail, the network performs the same convolutional operation over each of the S 2 {\displaystyle S^{2}} patches. The output of the network on each patch is a tuple as follows: ( p 1 , … , p C , c 1 , x 1 , y 1 , w 1 , h 1 , … , c B , x B , y B , w B , h B ) {\displaystyle (p_{1},\dots ,p_{C},c_{1},x_{1},y_{1},w_{1},h_{1},\dots ,c_{B},x_{B},y_{B},w_{B},h_{B})} where p i {\displaystyle p_{i}} is the conditional probability that the cell contains an object of class i {\displaystyle i} , conditional on the cell containing at least one object. x j , y j , w j , h j {\displaystyle x_{j},y_{j},w_{j},h_{j}} are the center coordinates, width, and height of the j {\displaystyle j} -th predicted bounding box that is centered in the cell. Multiple bounding boxes are predicted to allow each prediction to specialize in one kind of bounding box. For example, slender objects might be predicted by j = 2 {\displaystyle j=2} while stout objects might be predicted by j = 1 {\displaystyle j=1} . c j {\displaystyle c_{j}} is the predicted intersection over union (IoU) of each bounding box with its corresponding ground truth. The network architecture has 24 convolutional layers followed by 2 fully connected layers. During training, for each cell, if it contains a ground truth bounding box, then only the predicted bounding boxes with the highest IoU with the ground truth bounding boxes is used for gradient descent. Concretely, let j {\displaystyle j} be that predicted bounding box, and let i {\displaystyle i} be the ground truth class label, then x j , y j , w j , h j {\displaystyle x_{j},y_{j},w_{j},h_{j}} are trained by gradient descent to approach the ground truth, p i {\displaystyle p_{i}} is trained towards 1 {\displaystyle 1} , other p i ′ {\displaystyle p_{i'}} are trained towards zero. If a cell contains no ground truth, then only c 1 , c 2 , … , c B {\displaystyle c_{1},c_{2},\dots ,c_{B}} are trained by gradient descent to approach zero. === YOLOv2 === Released in 2016, YOLOv2 (also known as YOLO9000) improved upon the original model by incorporating batch normalization, a higher resolution classifier, and using anchor boxes to predict bounding boxes. It could detect over 9000 object categories. It was also released on GitHub under the Apache 2.0 license. === YOLOv3 === YOLOv3, introduced in 2018, contained only "incremental" improvements, including the use of a more complex backbone network, multiple scales for detection, and a more sophisticated loss function. === YOLOv4 and beyond === Subsequent versions of YOLO (v4, v5, etc.) have been developed by different researchers, further improving performance and introducing new features. These versions are not officially associated with the original YOLO authors but build upon their work. As of 2026, versions up to YOLO26 have been released..

    Read more →
  • Tropical cryptography

    Tropical cryptography

    In tropical analysis, tropical cryptography refers to the study of a class of cryptographic protocols built upon tropical algebras. In many cases, tropical cryptographic schemes have arisen from adapting classical (non-tropical) schemes to instead rely on tropical algebras. The case for the use of tropical algebras in cryptography rests on at least two key features of tropical mathematics: in the tropical world, there is no classical multiplication (a computationally expensive operation), and the problem of solving systems of tropical polynomial equations has been shown to be NP-hard. == Basic Definitions == The key mathematical object at the heart of tropical cryptography is the tropical semiring ( R ∪ { ∞ } , ⊕ , ⊗ ) {\displaystyle (\mathbb {R} \cup \{\infty \},\oplus ,\otimes )} (also known as the min-plus algebra), or a generalization thereof. The operations are defined as follows for x , y ∈ R ∪ { ∞ } {\displaystyle x,y\in \mathbb {R} \cup \{\infty \}} : x ⊕ y = min { x , y } {\displaystyle x\oplus y=\min\{x,y\}} x ⊗ y = x + y {\displaystyle x\otimes y=x+y} It is easily verified that with ∞ {\displaystyle \infty } as the additive identity, these binary operations on R ∪ { ∞ } {\displaystyle \mathbb {R} \cup \{\infty \}} form a semiring.

    Read more →
  • Undeniable signature

    Undeniable signature

    An undeniable signature is a digital signature scheme which allows the signer to be selective to whom they allow to verify signatures. The scheme adds explicit signature repudiation, preventing a signer later refusing to verify a signature by omission; a situation that would devalue the signature in the eyes of the verifier. It was invented by David Chaum and Hans van Antwerpen in 1989. == Overview == In this scheme, a signer possessing a private key can publish a signature of a message. However, the signature reveals nothing to a recipient/verifier of the message and signature without taking part in either of two interactive protocols: Confirmation protocol, which confirms that a candidate is a valid signature of the message issued by the signer, identified by the public key. Disavowal protocol, which confirms that a candidate is not a valid signature of the message issued by the signer. The motivation for the scheme is to allow the signer to choose to whom signatures are verified. However, that the signer might claim the signature is invalid at any later point, by refusing to take part in verification, would devalue signatures to verifiers. The disavowal protocol distinguishes these cases removing the signer's plausible deniability. It is important that the confirmation and disavowal exchanges are not transferable. They achieve this by having the property of zero-knowledge; both parties can create transcripts of both confirmation and disavowal that are indistinguishable, to a third-party, of correct exchanges. The designated verifier signature scheme improves upon deniable signatures by allowing, for each signature, the interactive portion of the scheme to be offloaded onto another party, a designated verifier, reducing the burden on the signer. == Zero-knowledge protocol == The following protocol was suggested by David Chaum. A group, G, is chosen in which the discrete logarithm problem is intractable, and all operation in the scheme take place in this group. Commonly, this will be the finite cyclic group of order p contained in Z/nZ, with p being a large prime number; this group is equipped with the group operation of integer multiplication modulo n. An arbitrary primitive element (or generator), g, of G is chosen; computed powers of g then combine obeying fixed axioms. Alice generates a key pair, randomly chooses a private key, x, and then derives and publishes the public key, y = gx. === Message signing === Alice signs the message, m, by computing and publishing the signature, z = mx. === Confirmation (i.e., avowal) protocol === Bob wishes to verify the signature, z, of m by Alice under the key, y. Bob picks two random numbers: a and b, and uses them to blind the message, sending to Alice: c = magb. Alice picks a random number, q, uses it to blind, c, and then signing this using her private key, x, sending to Bob: s1 = cgq ands2 = s1x. Note that s1x = (cgq)x = (magb)xgqx = (mx)a(gx)b+q = zayb+q. Bob reveals a and b. Alice verifies that a and b are the correct blind values, then, if so, reveals q. Revealing these blinds makes the exchange zero knowledge. Bob verifies s1 = cgq, proving q has not been chosen dishonestly, and s2 = zayb+q, proving z is valid signature issued by Alice's key. Note that zayb+q = (mx)a(gx)b+q. Alice can cheat at step 2 by attempting to randomly guess s2. === Disavowal protocol === Alice wishes to convince Bob that z is not a valid signature of m under the key, gx; i.e., z ≠ mx. Alice and Bob have agreed an integer, k, which sets the computational burden on Alice and the likelihood that she should succeed by chance. Bob picks random values, s ∈ {0, 1, ..., k} and a, and sends: v1 = msga and v2 = zsya, where exponentiating by a is used to blind the sent values. Note that v2 = zsya = (mx)s(gx)a = v1x. Alice, using her private key, computes v1x and then the quotient, v1xv2−1 = (msga)x(zsgxa)−1 = msxz−s = (mxz−1)s. Thus, v1xv2−1 = 1, unless z ≠ mx. Alice then tests v1xv2−1 for equality against the values: (mxz−1)i for i ∈ {0, 1, …, k}; which are calculated by repeated multiplication of mxz−1 (rather than exponentiating for each i). If the test succeeds, Alice conjectures the relevant i to be s; otherwise, she conjectures random value. Where z = mx, (mxz−1)i = v1xv2−1 = 1 for all i, s is unrecoverable. Alice commits to i: she picks a random r and sends hash(r, i) to Bob. Bob reveals a. Alice confirms that a is the correct blind (i.e., v1 and v2 can be generated using it), then, if so, reveals r. Revealing these blinds makes the exchange zero knowledge. Bob checks hash(r, i) = hash(r, s), proving Alice knows s, hence z ≠ mx. If Alice attempts to cheat at step 3 by guessing s at random, the probability of succeeding is 1/(k + 1). So, if k = 1023 and the protocol is conducted ten times, her chances are 1 to 2100.

    Read more →