AI Generator Question Paper

AI Generator Question Paper — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • GeoNetwork opensource

    GeoNetwork opensource

    The GeoNetwork opensource (GNOS) project is a free and open source (FOSS) cataloging application for spatially referenced resources. It is a catalog of location-oriented information. == Outline == It is a standardized and decentralized spatial information management environment designed to enable access to geo-referenced databases, cartographic products and related metadata from a variety of sources, enhancing the spatial information exchange and sharing between organizations and their audience, using the capacities of the internet. Using the Z39.50 protocol it both accesses remote catalogs and makes its data available to other catalog services. As of 2007, OGC Web Catalog Service are being implemented. Maps, including those derived from satellite imagery, are effective communicational tools and play an important role in the work of decision makers (e.g., sustainable development planners and humanitarian and emergency managers) in need of quick, reliable and up-to-date user-friendly cartographic products as a basis for action and to better plan and monitor their activities; GIS experts in need of exchanging consistent and updated geographical data; and spatial analysts in need of multidisciplinary data to perform preliminary geographical analysis and make reliable forecasts. == Deployment == The software has been deployed to various organizations, the first being FAO GeoNetwork and WFP VAM-SIE-GeoNetwork, both at their headquarters in Rome, Italy. Furthermore, the WHO, CGIAR, BRGM, ESA, FGDC and the Global Change Information and Research Centre (GCIRC) of China are working on GeoNetwork opensource implementations as their spatial information management capacity. It is used for several risk information systems, in particular in the Gambia. Several related tools are packaged with GeoNetwork, including GeoServer. GeoServer stores geographical data, while GeoNetwork catalogs collections of such data.

    Read more →
  • Data product

    Data product

    In data management and product management, a data product is a reusable, active, and standardized data asset designed to deliver measurable value to its users, whether internal or external, by applying the rigorous principles of product thinking and management. It comprises one or more data artifacts (e.g., datasets, models, pipelines) and is enriched with metadata, including governance policies, data quality rules, data contracts, and, where applicable, a software bill of materials (SBOM) to document its dependencies and components. Ownership of a data product is aligned to a specific domain or use case, ensuring accountability, stewardship, and its continuous evolution throughout its lifecycle. Adhering to the FAIR principles – findable, accessible, interoperable, and reusable – a data product is designed to be discoverable, scalable, reusable, and aligned with both business and regulatory standards, driving innovation and efficiency in modern data ecosystems. == History == In 2012, DJ Patil proposed the first documented definition: a data product is a product that facilitates an end goal through the use of data. In 2019, Zhamak Dehghani introduced Data Mesh, with a strong focus on domain-oriented data products. Later, in 2020, she solidifies Data Mesh around four principles, one being Data as a Product, in which she defines Data Product as the node on the mesh that encapsulates three structural components required for its function, providing access to the domain's analytical data as a product. In 2024, Andrea Gioia published one of the first books specifically on data products post Data Mesh announcement. In his book, Gioia defines the concept of pure data product. In 2025, during the Data Day Texas conference, Jean-Georges Perrin and a collective of product managers and data engineers got together to craft the current definition and make it available to the public domain. In July 2025, Bitol, a project of The Linux Foundation, released and early version of the Open Data Product Standard (ODPS) aiming at normalizing data products

    Read more →
  • TRAME

    TRAME

    TRAME (TRAnsmission of MEssages) was the name of the second computer network in the world similar to the internet to be used in an electric utility. Like the internet, the base technology was packet switching; it was developed by the electric utility ENHER in Barcelona. It was deployed by the same utility, first in Catalonia and Aragón, Spain, and later in other places. Its development started in 1974 and the first routers, called nodes at that time, were deployed by 1978. The network was in operation until 2016 (38 years) with successive technological software and hardware updates. == Beginnings == In 1974, packet switching was a technology known only in research circles. The concept began in 1968 in association with the United States' Advanced Research Projects Agency (ARPA) research project ARPANET. The idea of applying the packet switching concept to electric utilities control communication networks first appeared in 1974 when the Swedish power utility Vattenfall started to create its TIDAS packet-switching network and was followed by the Spanish electric utility ENHER, which aimed to telecontrol and automate its high-voltage power grid. For this purpose, ENHER created a specific team of people to develop both the packet-switching network and the supervisory control and data acquisition (SCADA) system, also called the telecontrol system. By 1978 the first four TRAME routers were available and by 1980, eight of them were deployed and operating. The printed circuit boards (PCBs) controlling the communication lines were connected to a shared memory PCB allowing them to exchange data and messages. The project was developed together with its main initial application, the Telecontrol or SCADA system SICL (Sistema Integral de Control Local) with which initially they shared a very similar hardware. The maximum link capacity was 9600 bit/s, which in 1980 was the maximum possible on a 4 kHz wide voice channel at the time. These channels were the basic unit of the then-analog communication systems in use. By that time power utilities used either telephone calls or low speed (below 1200bit/s) dedicated links for telecontrol, typically shared among ten high-voltage electrical substations. == Services == The basic service provided by the TRAME network was SCADA or Telecontrol to automate the high-voltage power grid, thus improving operational efficiency, which was until then operated manually with telephone communication between human operators. Each TRAME router was associated with one or more remote terminal units (RTUs) of the SICL telecontrol system. It also had connected screens, and later PCs, located in electrical substations to interchange messages between them and with the Control Center located in the well-known Casa Fuster in Barcelona. It was a kind of predecessor to today's e-mail. Later, in the 1990s, other protocols (X.25, IP) were developed to include corporate information technology (IT) terminals, company physical surveillance systems and other services. Additionally, applications and terminals were developed for the transmission of voice and video over the TRAME network. == Protocols == The TRAME routing system, like that of the original ARPANET, was based on the Bellman-Ford algorithm but with "split-horizon" as in the Swedish TIDAS network, but with an original improvement. This protocol allows optimal paths to be found in meshed networks for each packet to be transmitted, allowing the shared use of the same network by multiple services. In contrast, traditional circuit-switched technology used to establish dedicated circuits for each service or communication. The addressing of routers and terminals used a proprietary system with a 16-bit address; it would be the equivalent of the well-known IP (Internet Protocol) version 4 (IPv4), still in use on the internet today, which uses 32-bit addresses. It is necessary to take into account that in 1978, the IPv4 protocol did not yet exist since the IPv4 version used on the internet did not appear until 1981, and in fact, did not reach the general public until much later. The line protocols were also proprietary and were called UCL (Unidad de Control de Línea, 'line control unit'), which linked the routers together, and UTR (Unión TRAME-Remotas), the access protocol. They were designed to offer the highest quality of service required by the telecontrol/SCADA function in terms of data integrity and availability set by the International Electrotechnical Commission (IEC) IEC-870-5-1 and ANSI C37.1. standards, and because the protocol used at the time in corporate computer networks, HDLC (high-level data link control), did not offer enough quality for critical industrial applications. Later on, other protocols like X.25 and IP were also made compatible with the aforementioned TRAME protocols. In 2000, the UTR protocol was replaced by the international standard IEC 60870- 5-101/104. Initially network flow control was based on the management of eight data priorities in head-of-the-line (HOL) waiting queues. Later and after some experimentation, a flow control method based on a bit indicating route congestion and management of the gap between packets when accessing the network was adopted. This required measuring the capacity of the route bottleneck. An end-to-end protocol was also added for some flows requiring order preservation like X.25. == Evolution == To last for 38 years, the technology had to endure intense evolution. There were essentially four TRAME generations which are summarized in the table. A description of the four generations of TRAME is provided below. === TRAME 1 === The project began in 1974 and in 1978 a first network with four routers was already installed and in operation at the electric utility ENHER. In 1980, the network had eight nodes in operation (see Figure I). The hardware was based on the Zilog Z80 processor and had a multiprocessor structure with 16 processors sharing a common memory. The software was developed at ENHER's headquarters located in the well-known Casa Fuster, Passeig de Gràcia, 132, Barcelona, using the Z80 assembly language. Beyond 1980 the software began to be written in C programming language and an HP64000 Logic Development System emulator was used for the purpose. The hardware was produced by ISEL, an INI (Instituto Nacional de Indústria) company. The routing system was a variant of Bellman-Ford with split-horizon. It was an improvement of the original ARPA network routing system consisting of an original update procedure which allowed for a faster reaction to changes. The distance function was the number of packets in the output waiting queues plus one. The line protocols (UCL for internal lines linking routers and UTR for accessing the network) were designed to meet the stringent requirements set for telecontrol (SCADA) of high-voltage power networks (IEC-870-5-1 and ANSI C37.1 standards). At the OSI transport layer, windows with a width of 1 to 8, depending on the required service, residing in the terminals were used. Initially, addresses were only 14 bits long to address both the routers (called nodes by then) and the devices connected to them. They were made up of two fields, an 8-bit field to address the router and a 6-bit sub-address to address the terminals connected to it. The node address was assigned to the nodes and not to the ends of the links as in the internet. The basic advantages of TRAME over other technologies used in electric utilities at the time were in part due to the packet technology itself: ability to manage any network topology, automatic adaptability to topological and traffic changes, integration of different link technologies (digital or analog) and capacities in a single network, open and decentralized intercommunicability between users and devices, simultaneous communication with several users and locations from a single physical connection, and integrated network supervision. In fact, the network was provided from its inception with a supervision center consisting of a computer and a synoptic board located at the company's headquarters (see Figure II). But other advantages were due to the specific design of TRAME: high data integrity, priority support for packets, and ease of including special protocols such as the many SCADA protocols in use at that time. All of the above resulted in improved quality of service, especially with respect to data availability and data integrity, and in the integration of services in a single network. Part of the evolution of its deployment can be seen in Figures II to IV. === TRAME 2 === In 1990, TRAME 2 was fully deployed and TRAME 1 was replaced. The processor of the new hardware was Intel 80286 and the hardware structure and external appearance of the routers was very similar to that of TRAME 1. The software was written in C and the above-mentioned emulator continued to be used. Improvements over TRAME 1 were the introduction of the standardized X.25 access protocol

    Read more →
  • Sharenting

    Sharenting

    "Sharenting" is a portmanteau of "sharing" and "parenting", describing the practice of parents publicizing a large amount of potentially sensitive content about their children on internet platforms, most notably on social media. While the term was coined as recently as 2010, sharenting has become an international phenomenon with widespread presence in the United States, Spain, France, and the United Kingdom. Proponents of sharenting frame the practice as a natural expression of parental pride in their children and argue that critics take sharenting-related posts out of context. Detractors find that it violates child privacy and hurts a parent–child relationship. Academic research has been conducted over the potential social motivations for sharenting and legal frameworks to balance child privacy with this parental practice. Researchers have conducted several psychological surveys, outlining social media accessibility, parental self-identification with children, and social pressure as potential causes for sharenting. Legal scholars have identified international human rights laws, labor protections, and recent online child privacy statutes as potential legal standards to check sharenting abuses. == History == The origins of the term "sharenting" have been attributed to the Wall Street Journal, where they called it "oversharenting," a portmanteau of "oversharing" and "parenting." Priya Kumar suggests that recording life moments of children rearing is not a new practice: people have been using diaries, scrapbooks and baby log books as the media of documentation for centuries. Scholars assert that sharenting has become popular as a result of social media, which has made many people more comfortable with sharing their lives and those of their children online. The trend of oversharing on social media has raised public attention in the 2010s and become the focus of a number of editorials and academic research projects. It was also added to Times Word of the Day in February 2013 and Collins English Dictionary in 2016 given its influence. == Popularity == Several studies describe sharenting as an international phenomenon with widespread prevalence across households. In the United States, researchers at the University of Michigan C.S. Mott Children's Hospital found that almost 75% of American parents were familiar with someone who over-shared information about their child on social media, and an AVG survey determined that 92% of all American two-year-olds had some presence on the internet. In Australia, Fisher-Price conducted a survey which revealed that 90% of Australian parents admitted to over-sharing. In Spain and Czech Republic, a survey of approximately 1,500 parents found that 70-80% participated in sharenting. In the United Kingdom, France, Germany, and Italy, a Research Now report revealed that almost three-quarters of surveyed parents said that they were "willing to share images of their infants". Some claim that sharenting presents a violation of child privacy, and this backlash includes anti-sharenting sites and apps that block baby pictures. One particular outlet of protest was the blog STFU Parents, founded in 2009 to criticize parental oversharing on social media. Some parents felt that these criticisms of sharenting often took posts out of context and neglected some positive aspects of the practice, including advancing a stronger sense of online community. Others, while acknowledging the potential privacy violations of sharenting, suggested a more tailored approach that would only permit posting under certain conditions, notwithstanding audience and identification restrictions for social media posts. == Motivations == Research has suggested that sharenting is associated with a mix of parent self-identification with children, mothering pressures, and the accessibility of social media. Conducting 17 interviews with mothers in the United Kingdom, a London School of Economics study found that parent bloggers often re-explained their sharing practices in terms of expressing their own personal identity, representing their own child as part of themselves. In particular, the report surveyed the use of blogs as a networking vehicle to connect parents with similar family situations and found that sharenting parents, by filtering self-presentation through their parent-child relationship, adopted a more relational identity on social media websites. This included identifying oneself in terms of parental circumstances, whether it be raising a child with a disability or being a single mother. Alternatively, some have suggested that these online expressions indicate the infiltration of individual pride into the sphere of parenting, as family photography becomes a means to "show off" one's children to the others and strengthens a parent's sense of individuated self. Addressing the prevalence of mothers engaging in sharenting, those who purport this view argue that the rise of digital communication has pressured mothers into performing the role of a "good" parent on social media platforms. They claim that these developments may reinforce a dominant vision of a "normal" family, as sharenting posts could be motivated by the need to converge to a normative interpretation of family. == Controversy == While some people assert that online platforms enable parents to establish a community and seek parenting support, others are concerned about the children's data privacy and their lack of informed consent. Sharing content may not only embarrass children but also creates an initial digital footprint, a history of online activity, that the children themselves have no control over. This might bring some negative consequences, such as being ridiculed at school or leaving a negative impression on future employers. === Parental benefits === Many parents use social media to seek parenting advice and share information about their children. With the convenience of online platforms, parent bloggers can easily connect with other people in similar situations as well as those who are willing to contribute meaningful advice. By forming a community, parents can receive encouragement from empathetic peers and assistance from experts in children rearing. Parents whose children need special educational accommodations or have disabilities often found themselves detached from the mainstream parenting style. Therefore, they regard online blogs as a means to gain support from others and support back. Online blogging enables parents of children with disabilities and special needs to connect with other parents. The advice from similarly situated families can open up new possibilities that help the parents "negotiate the complexities of social services, health care, and schools". However, in some cases, posting online about a parent's struggles can cause a backlash, as advocates may accuse the parent of presenting people with that condition in a bad light, or wonder how the child will feel, if they later read these posts and see how much their parents struggled to care for them. Such advantages of social media are not limited to particular groups of parents. In general, most parents benefit from exchanging parenting experience. Statistically speaking, 72% of parents rate social media useful for emotional connection and affirmations, and 74% of them receive support about parenting from friends on social media. Sharenting also plays a role in fostering interpersonal relationships. As the images and words about children's lives initiate conversations, parents use sharenting to stay connected with distant friends and relatives. In particular, mothers, as a research study reveals, are willing to engage in sharenting since they believe that the positive contents can help avoid digital conflicts and maintain close relations with those in their social circles. Researchers also found that female participants in this study carefully chose photos and phrases to express love and present laudable behaviors of children in their updates, which indicates their intention to convey positive messages. These messages also promote a close social network for a child as the parents invites supportive family members and friends into daily life. === Children's privacy === Given the potential misuse of digital data, people are critical about sharenting, and the majority of parents are cautious about the wrongdoing with online posts. The disclosure of minors' personal information, such as geographic location, name, date of birth, pictures, and the schools they attend, might expose them to illegal practices by recipients with malicious intentions. Sharented information is often abused for "identity theft", when imposters manage to track, stalk, commit fraud against children, or even blackmail the family. According to Barclays, online fraud targeting the young generation will contribute to a loss of £670 million (approximately $790 million) by 2030, and two-thirds of identity fraud will be related to s

    Read more →
  • Embedding (machine learning)

    Embedding (machine learning)

    In machine learning, embedding is a representation learning technique that maps complex, high-dimensional data into a lower-dimensional vector space of numerical vectors. == Technique == It also denotes the resulting representation, where meaningful patterns or relationships are preserved. As a technique, it learns these vectors from data like words, images, or user interactions, differing from manually designed methods such as one-hot encoding. This process reduces complexity and captures key features without needing prior knowledge of the domain. == Similarity == In natural language processing, words or concepts may be represented as feature vectors, where similar concepts are mapped to nearby vectors. The resulting embeddings vary by type, including word embeddings for text (e.g., Word2Vec), image embeddings for visual data, and knowledge graph embeddings for knowledge graphs, each tailored to tasks like NLP, computer vision, or recommendation systems. This dual role enhances model efficiency and accuracy by automating feature extraction and revealing latent similarities across diverse applications. To measure the distance between two embeddings, a similarity measure can be used to find the overall similarity of the concepts represented by the embeddings. If the vectors are normalized to have a magnitude of 1, then the similarity measures are proportional to cos ⁡ ( θ a b ) {\displaystyle \cos \left(\theta _{ab}\right)} . The cosine similarity disregards the magnitude of the vector when determining similarity, so it is less biased towards training data that appears very frequently. The dot product includes the magnitude inherently, so it will tend to value more popular data. Generally, for high-dimensional vector spaces, vectors tend to converge in distance, so Euclidean distance becomes less reliable for large embedding vectors.

    Read more →
  • Social Media (Age-Restricted Users) Bill

    Social Media (Age-Restricted Users) Bill

    The Social Media (Age-Restricted Users) Bill is a member's bill by National Party Member of Parliament Catherine Wedd that seeks to ban children under the age of 16 years from accessing social media by forcing social media companies to implement age verification measures. It is modelled after the Australian government's Online Safety Amendment. In mid October 2025, the New Zealand Parliament confirmed plans to introduce the social media age restriction bill. == Background == In late November 2024, the Albanese government of Australia, with support from the opposition Coalition parties, passed the Online Safety Amendment creating a world-first age verification regime targeting social media platforms operating in the country. The ban targets several social media platforms including Facebook, Instagram, Kick, Reddit, Snapchat, Threads, TikTok, Twitch, X (formerly Twitter) and YouTube. These platforms were required to implement age verification systems and to remove under-age users by 10 December 2025, when the law change came into effect. == Draft provisions == The draft Social Media (Age-Restricted Users) Bill defines social media platforms as electronic platforms that enable social media interactions between two or more end-users, facilitates communication between multiple end-users and allows users to post content on the platform. The proposed bill requires social media companies to take action to prevent users under the age of 16 from creating accounts on their platforms. It also creates a framework for courts to impose fines on platforms that fail to take reasonable steps to prevent underaged users from accessing the platform. == Legislative history == === Draft legislation === On 6 May 2025, Wedd announced a private member's bill called the "Social Media (Age-Restricted Users) Bill" that would bar access to social media platforms for people under the age of 16 years. She said that she was motivated as the mother of four children to support families, parents and teachers' efforts to manage their children's online exposure and the passage of the Australian Online Safety Amendment legislation in December 2024. Since National's coalition partner ACT New Zealand had refused to support the bill, the Sixth National Government announce it as a member's bill rather than a government bill. Prime Minister Christopher Luxon has confirmed that National would seek cross-party support for the legislation. ACT MP and the Minister of Internal Affairs Brooke van Velden said that the Government would watch the implementation of the Australian social media age restriction policy. In October 2025, Wedd's bill was drawn from the parliamentary ballot. In addition, Labour Reuben Davidson drafted a similar member's bill that would hold social media providers responsible for restricting "harmful content" and imposed NZ$50,000 fines for non-compliance. In November 2025, Luxon reiterated his support for social media age restriction legislation and said the New Zealand government would introduce a bill in 2026 before the 2026 New Zealand general election. He also confirmed that Education Minister Erica Stanford was leading an investigation into what lessons could be learnt from the Australian legislation. At the request of ACT MP Parmjeet Parmar, Parliament's Education and Workforce Committee held an inquiry into a proposed social media ban in early October 2025. The committee was led by National MP Carl Bates and received 430 submissions from 400 groups and individuals. The committee also heard from 87 in-person submissions. On 10 December 2025, the committee made 12 recommendations including restricting social media access to persons under the age of 16, re-evaluating existing legislation such as the Films, Videos, and Publications Classification Act and the Harmful Digital Communications Act 2015, and regulating online platforms and Internet service providers. The ACT party released a dissenting view disagreeing with the need for a law restricting social media access to under-16 year olds. In mid-May 2026, the Government confirmed that work on the proposed bill to ban under-16 year olds from social media had been paused. The New Zealand Parliament held a debate on the proposed bill on 13 May following a select committee inquiry into the harms caused by social media platforms. While the opposition Labour Party has agreed to support the member's bill, the ACT and Green parties opposed the proposed bill on the grounds that the rules were easy to circumvent, that at-risk groups could become more isolated, and that social media also harmed other age groups. == Responses == === Academia and civil society === In late July 2025, the New Zealand Council for Civil Liberties (NZCCL) expressed concern that the proposed social media age restriction could infringe upon the New Zealand Bill of Rights Act 1990, the Privacy Act 2020 and the United Nations' Convention on the Rights of the Child. The NZCCL also questioned the practicality of age verification software, a social media age limit and whether it would fulfil its stated goal of combating online harm. In August 2025, University of Auckland criminologist and senior lecturer Claire Meehan expressed concern that the social media age restriction legislation would cut children from their friendship and support networks. She also said that children and young people were digital natives who could use VPNs to circumvent the ban. Similar sentiments were echoed by Victoria University of Wellington media and communications lecturer Alex Beattie and "Ocean Today" Instagram social media influencer "Charlie." In October 2025, New Zealand Initiative representative Dr Eric Crampton expressed concern that a social media age restriction would involve the introduction of digital IDs. He argued that a new law was unnecessary and said that parents could limit their children's exposure to social media via Google's Family Link and Apple's equivalent. Similarly, Institute of Economic Affairs public policy fellow Matthew Lesh and the British Free Speech Union expressed concerns that young people could use VPNs to circumvent a social media ban, citing the spike in VPN usage in the United Kingdom following the passage of the Online Safety Act 2023. The advocacy group B416's co-chair Anna Curzon advocated for a social media ban on underage users, stating that social media apps "are made to be addictive" and made it difficult for parents to relate with their children. In late November 2025, B416's co-founder Anna Mowbray expressed support for the Government's social media age restriction bill but expressed disappointment that Luxon had not timed his announcement with the launch of the group's campaign. Generation-Z Aotearoa co-founder Lola Fisher has called on the New Zealand Government to consult with young people on the development of the legislation. === Government agencies and departments === In early October 2025, Privacy Commissioner Michael Webster expressed concern that social media platforms requiring users to prove their age via digital IDs could raise privacy concerns. Webster suggested that age verification systems could relay on various documents including passports. He said that age estimation technologies had high error rates and that age inference technologies relied on data mining. === Political parties === In early May 2025, the National Party government expressed support for a social media age restriction legislation. By contrast, its coalition partner ACT has opposed such legislation. ACT leader David Seymour described the ban as hasty and unworkable since it did not involve parents. Meanwhile, New Zealand First leader Winston Peters expressed support for a social media age restriction but said the bill should be subject to a select committee inquiry. The opposition Labour Party leader Chris Hipkins has expressed interest in a social media age restriction legislation but emphasised the need for consensus. Meanwhile, Green Party co-leader Chlöe Swarbrick said she wanted to learn more about the bill but described it as simplistic. Fellow Greens co-leader Marama Davidson said that the proposed bill would punish children and young people for the harm caused by big tech platforms. === Tech companies === In early October 2025, representatives of TikTok and Meta Platforms cautioned against proposed social media ban on under-16 years olds. During a one-day parliamentary inquiry, Ella Woods-Joyce, TikTok's public policy lead for Australia and New Zealand, and Mia Garlick, Meta's regional director of policy, expressed concern that the social media age restriction could send children and young people to less regulated online spaces. Woods-Joyce highlighted TikTok's policy of closing down accounts belonging to users under the age of 13 years while Garlick highlighted Meta's policy of placing users under the age of 16 in private accounts by default. In early February 2026 Meta's vice president and global head of safety, Antigone Da

    Read more →
  • MIME Object Security Services

    MIME Object Security Services

    MIME Object Security Services (MOSS) is a protocol that uses the multipart/signed and multipart/encrypted framework to apply digital signature and encryption services to MIME objects. == Details == The services are offered through the use of end-to-end cryptography between an originator and a recipient at the application layer. Asymmetric (public key) cryptography is used in support of the digital signature service and encryption key management. Symmetric (secret key) cryptography is used in support of the encryption service. The procedures are intended to be compatible with a wide range of public key management approaches, including both ad hoc and certificate-based schemes. Mechanisms are provided to support many public key management approaches. == Spreading == MOSS was never widely deployed and is now abandoned, largely due to the popularity of PGP.

    Read more →
  • Sumazi

    Sumazi

    Sumazi is a social media and social intelligence platform for enterprises, brands, and celebrities. Its technology performs social data analysis across social networking services including Facebook, Twitter and LinkedIn, to identify key people in his/her network who are experts, influencers or are located in a specific area for marketing, advertising or sales campaigns. The technology company was founded in 2011 by former Sun Microsystems employee Sumaya Kazi. The company was headquartered in San Francisco, California. The company was out of business by 2017. == Reception == Sumazi was one of 25 startups selected out of more than 1,200 to compete at TechCrunch Disrupt Startup Battlefield, where it won the Omidyar Network award for the startup "Most Likely to Change the World." Sumazi, which was based out of San Francisco, California, had been profiled in The New York Times as well as USA Today, which commented the advantages of the startup's location in the Silicon Valley. American Express OPEN Forum also featured Sumazi as a "Startup of the Week". Sumazi has additionally been mentioned in articles by Mashable, The Wall Street Journal, Current Editorials, Harvard Business Review, Smashing Magazine, and TechCrunch.

    Read more →
  • Computer Law & Security Review

    Computer Law & Security Review

    The Computer Law & Security Review is an international peer-reviewed journal published by Elsevier. It has been published six times a year since 1985 and is indexed in Scopus and SSCI. It is accessible to a wide range of professional legal and IT practitioners, businesses, academics, researchers, libraries and organisations in both the public and private sectors. The journal regularly covers: CLSR Briefing with special emphasis on UK/US developments European Union update National news from 10 European jurisdictions Pacific rim news column Refereed practitioner and academic papers on topics such as Web 2.0, IT security, Identity management, ID cards, RFID, interference with privacy, Internet law, telecoms regulation, online broadcasting, intellectual property, software law, e-commerce, outsourcing, data protection and freedom of information and many other topics. The Journal's Correspondent Panel includes more than 40 specialists in IT law and security. Each issue contains articles, case law analysis and current news on information and communications technology. Special Features High quality peer reviewed papers from internationally renowned practitioner and academic experts Latest developments reported in situ by more than 20 leading law firms from around the world Highly experienced and respected editor and correspondents panel Online access to all 23 volumes of CLSR with embedded web links to primary sources Contact details of all authors A pool of expertise that can collectively identify the key topics that need to be examined.

    Read more →
  • Hardware random number generator

    Hardware random number generator

    In computing, a hardware random number generator (HRNG), true random number generator (TRNG), non-deterministic random bit generator (NRBG), or physical random number generator is a device that generates random numbers from a physical process capable of producing entropy, unlike a pseudorandom number generator (PRNG) that utilizes a deterministic algorithm and non-physical nondeterministic random bit generators that do not include hardware dedicated to generation of entropy. Many natural phenomena generate low-level, statistically random "noise" signals, including thermal and shot noise, jitter and metastability of electronic circuits, Brownian motion, and atmospheric noise. Researchers also used the photoelectric effect, involving a beam splitter, other quantum phenomena, and even nuclear decay (due to practical considerations the latter, as well as the atmospheric noise, is not viable except for fairly restricted applications or online distribution services). While "classical" (non-quantum) phenomena are not truly random, an unpredictable physical system is usually acceptable as a source of randomness, so the qualifiers "true" and "physical" are used interchangeably. A hardware random number generator is expected to output near-perfect random numbers ("full entropy"). A physical process usually does not have this property, and a practical TRNG typically includes a few blocks: a noise source that implements the physical process producing the entropy. Usually this process is analog, so a digitizer is used to convert the output of the analog source into a binary representation; a conditioner (randomness extractor) that improves the quality of the random bits; health tests. TRNGs are mostly used in cryptographical algorithms that get completely broken if the random numbers have low entropy, so the testing functionality is usually included. Hardware random number generators generally produce only a limited number of random bits per second. In order to increase the available output data rate, they are often used to generate the "seed" for a faster PRNG. PRNG also helps with the noise source "anonymization" (whitening out the noise source identifying characteristics) and entropy extraction. With a proper PRNG algorithm selected (cryptographically secure pseudorandom number generator, CSPRNG), the combination can satisfy the requirements of Federal Information Processing Standards and Common Criteria standards. == Uses == Hardware random number generators can be used in any application that needs randomness. However, in many scientific applications additional cost and complexity of a TRNG (when compared with pseudo random number generators) provide no meaningful benefits. TRNGs have additional drawbacks for data science and statistical applications: impossibility to re-run a series of numbers unless they are stored, reliance on an analog physical entity can obscure the failure of the source. The TRNGs therefore are primarily used in the applications where their unpredictability and the impossibility to re-run the sequence of numbers are crucial to the success of the implementation: in cryptography and gambling machines. === Cryptography === The major use for hardware random number generators is in the field of data encryption, for example to create random cryptographic keys and nonces needed to encrypt and sign data. In addition to randomness, there are at least two additional requirements imposed by the cryptographic applications: forward secrecy guarantees that the knowledge of the past output and internal state of the device should not enable the attacker to predict future data; backward secrecy protects the "opposite direction": knowledge of the output and internal state in the future should not divulge the preceding data. A typical way to fulfill these requirements is to use a TRNG to seed a cryptographically secure pseudorandom number generator. == History == Physical devices were used to generate random numbers for thousands of years, primarily for gambling. Dice in particular have been known for more than 5000 years (found on locations in modern Iraq and Iran), and flipping a coin (thus producing a random bit) dates at least to the times of ancient Rome. The first documented use of a physical random number generator for scientific purposes was by Francis Galton (1890). He devised a way to sample a probability distribution using a common gambling die. In addition to the top digit, Galton also looked at the face of a die closest to him, thus creating 64 = 24 outcomes (about 4.6 bits of randomness). Kendall and Babington-Smith (1938) used a fast-rotating 10-sector disk that was illuminated by periodic bursts of light. The sampling was done by a human who wrote the number under the light beam onto a pad. The device was utilized to produce a 100,000-digit random number table (at the time such tables were used for statistical experiments, like PRNG nowadays). On 29 April 1947, the RAND Corporation began generating random digits with an "electronic roulette wheel", consisting of a random frequency pulse source of about 100,000 pulses per second gated once per second with a constant frequency pulse and fed into a five-bit binary counter. Douglas Aircraft built the equipment, implementing Cecil Hasting's suggestion (RAND P-113) for a noise source (most likely the well known behavior of the 6D4 miniature gas thyratron tube, when placed in a magnetic field). Twenty of the 32 possible counter values were mapped onto the 10 decimal digits and the other 12 counter values were discarded. The results of a long run from the RAND machine, filtered and tested, were converted into a table, which originally existed only as a deck of punched cards, but was later published in 1955 as a book, 50 rows of 50 digits on each page (A Million Random Digits with 100,000 Normal Deviates). The RAND table was a significant breakthrough in delivering random numbers because such a large and carefully prepared table had never before been available. It has been a useful source for simulations, modeling, and for deriving the arbitrary constants in cryptographic algorithms to demonstrate that the constants had not been selected maliciously ("nothing up my sleeve numbers"). Since the early 1950s, research into TRNGs has been highly active, with thousands of research works published and about 2000 patents granted by 2017. == Physical phenomena with random properties == Multiple different TRNG designs were proposed over time with a large variety of noise sources and digitization techniques ("harvesting"). However, practical considerations (size, power, cost, performance, robustness) dictate the following desirable traits: use of a commonly available inexpensive silicon process; exclusive use of digital design techniques. This allows an easier system-on-chip integration and enables the use of FPGAs; compact and low-power design. This discourages use of analog components (e.g., amplifiers); mathematical justification of the entropy collection mechanisms. Stipčević & Koç in 2014 classified the physical phenomena used to implement TRNG into four groups: electrical noise; free-running oscillators; chaos; quantum effects. === Electrical noise-based RNG === Noise-based RNGs generally follow the same outline: the source of a noise generator is fed into a comparator. If the voltage is above threshold, the comparator output is 1, otherwise 0. The random bit value is latched using a flip-flop. Sources of noise vary and include: Johnson–Nyquist noise ("thermal noise"); Zener noise; avalanche breakdown. The drawbacks of using noise sources for an RNG design are: noise levels are hard to control, they vary with environmental changes and device-to-device; calibration processes needed to ensure a guaranteed amount of entropy are time-consuming; noise levels are typically low, thus the design requires power-hungry amplifiers. The sensitivity of amplifier inputs enables manipulation by an attacker; circuitry located nearby generates a lot of non-random noise thus lowering the entropy; a proof of randomness is near-impossible as multiple interacting physical processes are involved. === Chaos-based RNG === The idea of chaos-based noise stems from the use of a complex system that is hard to characterize by observing its behavior over time. For example, lasers can be put into (undesirable in other applications) chaos mode with chaotically fluctuating power, with power detected using a photodiode and sampled by a comparator. The design can be quite small, as all photonics elements can be integrated on-chip. Stipčević & Koç characterize this technique as "most objectionable", mostly due to the fact that chaotic behavior is usually controlled by a differential equation and no new randomness is introduced, thus there is a possibility of the chaos-based TRNG producing a limited subset of possible output strings. === Free-running oscillators-based RNG === The TRNGs based on a free-running oscilla

    Read more →
  • Malleability (cryptography)

    Malleability (cryptography)

    Malleability is a property of some cryptographic algorithms. An encryption algorithm is said to be malleable if it is possible to transform a ciphertext into another ciphertext which decrypts to a related plaintext. That is, given an encryption of a plaintext m {\displaystyle m} , it is possible to generate another ciphertext which decrypts to f ( m ) {\displaystyle f(m)} , for a known function f {\displaystyle f} , without necessarily knowing or learning m {\displaystyle m} . Malleability is often an undesirable property in a general-purpose cryptosystem, since it allows an attacker to modify the contents of a message. For example, suppose that a bank uses a stream cipher to hide its financial information, and a user sends an encrypted message containing, say, "TRANSFER $0000100.00 TO ACCOUNT #199." If an attacker can modify the message on the wire, and can guess the format of the unencrypted message, the attacker could change the amount of the transaction, or the recipient of the funds, e.g. "TRANSFER $0100000.00 TO ACCOUNT #227". Malleability does not refer to the attacker's ability to read the encrypted message. Both before and after tampering, the attacker cannot read the encrypted message. On the other hand, some cryptosystems are malleable by design. In other words, in some circumstances it may be viewed as a feature that anyone can transform an encryption of m {\displaystyle m} into a valid encryption of f ( m ) {\displaystyle f(m)} (for some restricted class of functions f {\displaystyle f} ) without necessarily learning m {\displaystyle m} . Such schemes are known as homomorphic encryption schemes. A cryptosystem may be semantically secure against chosen-plaintext attacks or even non-adaptive chosen-ciphertext attacks (CCA1) while still being malleable. However, security against adaptive chosen-ciphertext attacks (CCA2) is equivalent to non-malleability. == Example malleable cryptosystems == In a stream cipher, the ciphertext is produced by taking the exclusive or of the plaintext and a pseudorandom stream based on a secret key k {\displaystyle k} , as E ( m ) = m ⊕ S ( k ) {\displaystyle E(m)=m\oplus S(k)} . An adversary can construct an encryption of m ⊕ t {\displaystyle m\oplus t} for any t {\displaystyle t} , as E ( m ) ⊕ t = m ⊕ t ⊕ S ( k ) = E ( m ⊕ t ) {\displaystyle E(m)\oplus t=m\oplus t\oplus S(k)=E(m\oplus t)} . In the RSA cryptosystem, a plaintext m {\displaystyle m} is encrypted as E ( m ) = m e mod n {\displaystyle E(m)=m^{e}{\bmod {n}}} , where ( e , n ) {\displaystyle (e,n)} is the public key. Given such a ciphertext, an adversary can construct an encryption of m t {\displaystyle mt} for any t {\displaystyle t} , as E ( m ) ⋅ t e mod n = ( m t ) e mod n = E ( m t ) {\textstyle E(m)\cdot t^{e}{\bmod {n}}=(mt)^{e}{\bmod {n}}=E(mt)} . For this reason, RSA is commonly used together with padding methods such as OAEP or PKCS1. In the ElGamal cryptosystem, a plaintext m {\displaystyle m} is encrypted as E ( m ) = ( g b , m A b ) {\displaystyle E(m)=(g^{b},mA^{b})} , where ( g , A ) {\displaystyle (g,A)} is the public key. Given such a ciphertext ( c 1 , c 2 ) {\displaystyle (c_{1},c_{2})} , an adversary can compute ( c 1 , t ⋅ c 2 ) {\displaystyle (c_{1},t\cdot c_{2})} , which is a valid encryption of t m {\displaystyle tm} , for any t {\displaystyle t} . In contrast, the Cramer-Shoup system (which is based on ElGamal) is not malleable. In the Paillier, ElGamal, and RSA cryptosystems, it is also possible to combine several ciphertexts together in a useful way to produce a related ciphertext. In Paillier, given only the public key and an encryption of m 1 {\displaystyle m_{1}} and m 2 {\displaystyle m_{2}} , one can compute a valid encryption of their sum m 1 + m 2 {\displaystyle m_{1}+m_{2}} . In ElGamal and in RSA, one can combine encryptions of m 1 {\displaystyle m_{1}} and m 2 {\displaystyle m_{2}} to obtain a valid encryption of their product m 1 m 2 {\displaystyle m_{1}m_{2}} . Block ciphers in the cipher block chaining mode of operation, for example, are partly malleable: flipping a bit in a ciphertext block will completely mangle the plaintext it decrypts to, but will result in the same bit being flipped in the plaintext of the next block. This allows an attacker to 'sacrifice' one block of plaintext in order to change some data in the next one, possibly managing to maliciously alter the message. This is essentially the core idea of the padding oracle attack on CBC, which allows the attacker to decrypt almost an entire ciphertext without knowing the key. For this and many other reasons, a message authentication code is required to guard against any method of tampering. == Complete non-malleability == Fischlin, in 2005, defined the notion of complete non-malleability as the ability of the system to remain non-malleable while giving the adversary additional power to choose a new public key which could be a function of the original public key. In other words, the adversary shouldn't be able to come up with a ciphertext whose underlying plaintext is related to the original message through a relation that also takes public keys into account.

    Read more →
  • Cryptovirology

    Cryptovirology

    Cryptovirology refers to the study of cryptography use in malware, such as ransomware and asymmetric backdoors. Traditionally, cryptography and its applications are defensive in nature, and provide privacy, authentication, and security to users. Cryptovirology employs a twist on cryptography, showing that it can also be used offensively. It can be used to mount extortion based attacks that cause loss of access to information, loss of confidentiality, and information leakage, tasks which cryptography typically prevents. The field was born with the observation that public-key cryptography can be used to break the symmetry between what an antivirus analyst sees regarding malware and what the attacker sees. The antivirus analyst sees a public key contained in the malware, whereas the attacker sees the public key contained in the malware as well as the corresponding private key (outside the malware) since the attacker created the key pair for the attack. The public key allows the malware to perform trapdoor one-way operations on the victim's computer that only the attacker can undo. == Overview == The field encompasses covert malware attacks in which the attacker securely steals private information such as symmetric keys, private keys, PRNG state, and the victim's data. Examples of such covert attacks are asymmetric backdoors. An asymmetric backdoor is a backdoor (e.g., in a cryptosystem) that can be used only by the attacker, even after it is found. This contrasts with the traditional backdoor that is symmetric, i.e., anyone that finds it can use it. Kleptography, a subfield of cryptovirology, is the study of asymmetric backdoors in key generation algorithms, digital signature algorithms, key exchanges, pseudorandom number generators, encryption algorithms, and other cryptographic algorithms. The NIST Dual EC DRBG random bit generator has an asymmetric backdoor in it. The EC-DRBG algorithm utilizes the discrete-log kleptogram from kleptography, which by definition makes the EC-DRBG a cryptotrojan. Like ransomware, the EC-DRBG cryptotrojan contains and uses the attacker's public key to attack the host system. The cryptographer Ari Juels indicated that NSA effectively orchestrated a kleptographic attack on users of the Dual EC DRBG pseudorandom number generation algorithm and that, although security professionals and developers have been testing and implementing kleptographic attacks since 1996, "you would be hard-pressed to find one in actual use until now." Due to public outcry about this cryptovirology attack, NIST rescinded the EC-DRBG algorithm from the NIST SP 800-90 standard. Covert information leakage attacks carried out by cryptoviruses, cryptotrojans, and cryptoworms that, by definition, contain and use the public key of the attacker is a major theme in cryptovirology. In "deniable password snatching," a cryptovirus installs a cryptotrojan that asymmetrically encrypts host data and covertly broadcasts it. This makes it available to everyone, noticeable by no one (except the attacker), and only decipherable by the attacker. An attacker caught installing the cryptotrojan claims to be a virus victim. An attacker observed receiving the covert asymmetric broadcast is one of the thousands, if not millions of receivers, and exhibits no identifying information whatsoever. The cryptovirology attack achieves "end-to-end deniability." It is a covert asymmetric broadcast of the victim's data. Cryptovirology also encompasses the use of private information retrieval (PIR) to allow cryptoviruses to search for and steal host data without revealing the data searched for even when the cryptotrojan is under constant surveillance. By definition, such a cryptovirus carries within its own coding sequence the query of the attacker and the necessary PIR logic to apply the query to host systems. == History == The first cryptovirology attack and discussion of the concept was by Adam L. Young and Moti Yung, at the time called "cryptoviral extortion" and it was presented at the 1996 IEEE Security & Privacy conference. In this attack, a cryptovirus, cryptoworm, or cryptotrojan contains the public key of the attacker and hybrid encrypts the victim's files. The malware prompts the user to send the asymmetric ciphertext to the attacker who will decipher it and return the symmetric decryption key it contains for a fee. The victim needs the symmetric key to decrypt the encrypted files if there is no way to recover the original files (e.g., from backups). The 1996 IEEE paper predicted that cryptoviral extortion attackers would one day demand e-money, long before Bitcoin even existed. Many years later, the media relabeled cryptoviral extortion as ransomware. In 2016, cryptovirology attacks on healthcare providers reached epidemic levels, prompting the U.S. Department of Health and Human Services to issue a Fact Sheet on Ransomware and HIPAA. The fact sheet states that when electronic protected health information is encrypted by ransomware, a breach has occurred, and the attack therefore constitutes a disclosure that is not permitted under HIPAA, the rationale being that an adversary has taken control of the information. Sensitive data might never leave the victim organization, but the break-in may have allowed data to be sent out undetected. California enacted a law that defines the introduction of ransomware into a computer system with the intent of extortion as being against the law. == Examples == === Tremor virus === While viruses in the wild have used cryptography in the past, the only purpose of such usage of cryptography was to avoid detection by antivirus software. For example, the tremor virus used polymorphism as a defensive technique in an attempt to avoid detection by anti-virus software. Though cryptography does assist in such cases to enhance the longevity of a virus, the capabilities of cryptography are not used in the payload. The One-half virus was amongst the first viruses known to have encrypted affected files. === Tro_Ransom.A virus === An example of a virus that informs the owner of the infected machine to pay a ransom is the virus nicknamed Tro_Ransom.A. This virus asks the owner of the infected machine to send $10.99 to a given account through Western Union. Virus.Win32.Gpcode.ag is a classic cryptovirus. This virus partially uses a version of 660-bit RSA and encrypts files with many different extensions. It instructs the owner of the machine to email a given mail ID if the owner desires the decryptor. If contacted by email, the user will be asked to pay a certain amount as ransom in return for the decryptor. === CAPI === It has been demonstrated that using just 8 different calls to Microsoft's Cryptographic API (CAPI), a cryptovirus can satisfy all its encryption needs. == Other uses of cryptography-enabled malware == Apart from cryptoviral extortion, there are other potential uses of cryptoviruses, such as deniable password snatching, cryptocounters, private information retrieval, and in secure communication between different instances of a distributed cryptovirus.

    Read more →
  • ShareMethods

    ShareMethods

    ShareMethods is a Web 2.0 document management and collaboration service with a focus on sales, marketing, and the extended selling network. It offers a software as a service (SaaS) subscription to companies and is available as a stand-alone application or as an integrated program with CRM tools such as Oracle CRM On Demand or salesforce.com. == History == ShareMethods was launched in 2004 to provide collaboration and communication services for sales and marketing teams, business partners, and customers. The founders have a background of building software-as-a-service applications and creating digital media applications. In September 2005, ShareMethods launched "ShareNow" as one of the first applications on the salesforce.com AppExchange. In September 2006, ShareMethods moved its operations into a SAS 70 Type II data center owned by SunGard. In March 2009, ShareMethods launched "ShareSpaces" to provide on-demand portals or workspaces. In 2013, ShareMethods announced that its platform is available in a private cloud (on-premises) version. == Products == ShareMethods: Combines document management, collaboration, analytics, and CRM integration into a single solution. Key content can be centrally managed and delivered to sales channels, while providing feedback to marketing. ShareMethods is often used as a sales portal for internal sales and a partner portal for external partners. ShareNow: Integrates ShareMethods with salesforce.com providing Single Sign On for salesforce.com users and access to files related to accounts opportunities, etc. including custom objects. Also facilitates collaboration between salesforce.com users and non-users. ShareMethods for Oracle CRM On Demand: Integrates ShareMethods with Oracle CRM On Demand providing Single Sign On for Oracle users and easy access to files related to accounts opportunities, etc. ShareOffice: An on-demand intranet/extranet solution. Features include full-text search, version history, server sync-up, email updates, audit trail/analytics, check-in/check-out, multilingual user interface. ShareSpaces: Independent workspaces or portals where users can collaborate with business partners, teammates, or individuals to work together on content and documents. == Integration and interoperability == ShareMethods is available on Salesforce.com's AppExchange platform. ShareMethods also integrates with Oracle CRM On Demand to provide document management within the CRM application. Customers also can integrate proprietary systems via single-sign-on and self-registration. In addition, developers can make use of the ShareMethods API based on WebDAV to integrate document management functionality.

    Read more →
  • Merit Network

    Merit Network

    Merit Network, Inc., is a nonprofit member-governed organization providing high-performance computer networking and related services to educational, government, health care, and nonprofit organizations, primarily in Michigan. Created in 1966, Merit operates the longest running regional computer network in the United States. == Organization == Created in 1966 as the Michigan Educational Research Information Triad by Michigan State University (MSU), the University of Michigan (U-M), and Wayne State University (WSU), Merit was created to investigate resource sharing by connecting the mainframe computers at these three Michigan public research universities. Merit's initial three node packet-switched computer network was operational in October 1972 using custom hardware based on DEC PDP-11 minicomputers and software developed by the Merit staff and the staffs at the three universities. Over the next dozen years the initial network grew as new services such as dial-in terminal support, remote job submission, remote printing, and file transfer were added; as gateways to the national and international Tymnet, Telenet, and Datapac networks were established, as support for the X.25 and TCP/IP protocols was added; as additional computers such as WSU's MVS system and the UM's electrical engineering's VAX running UNIX were attached; and as new universities became Merit members. Merit's involvement in national networking activities started in the mid-1980s with connections to the national supercomputing centers and work on the 56 kbit/s National Science Foundation Network (NSFNET), the forerunner of today's Internet. From 1987 until April 1995, Merit re-engineered and managed the NSFNET backbone service. MichNet, Merit's regional network in Michigan was attached to NSFNET and in the early 1990s Merit began extending "the Internet" throughout Michigan, offering both direct connect and dial-in services, and upgrading the statewide network from 56 kbit/s to 1.5 Mbit/s, and on to 45, 155, 622 Mbit/s, and eventually 1 and 10 Gbit/s. In 2003 Merit began its transition to a facilities based network, using fiber optic facilities that it shares with its members, that it purchases or leases under long-term agreements, or that it builds. In addition to network connectivity services, Merit offers a number of related services within Michigan and beyond, including: Internet2 connectivity, VPN, Network monitoring, Voice over IP (VOIP), Cloud storage, E-mail, Domain Name, Network Time, VMware and Zimbra software licensing, Colocation, and professional development seminars, workshops, classes, conferences, and meetings. == History == === Creating the network: 1966 to 1973 === The Michigan Educational Research Information Triad (MERIT) was formed in the fall of 1966 by Michigan State University (MSU), University of Michigan (U-M), and Wayne State University (WSU). More often known as the Merit Computer Network or simply Merit, it was created to design and implement a computer network connecting the mainframe computers at the universities. In the fall of 1969, after funding for the initial development of the network had been secured, Bertram Herzog was named director for MERIT. Eric Aupperle was hired as senior engineer, and was charged with finding hardware to make the network operational. The National Science Foundation (NSF) and the State of Michigan provided the initial funding for the network. In June 1970, the Applied Dynamics Division of Reliance Electric in Saline, Michigan was contracted to build three Communication Computers or CCs. Each would consist of a Digital Equipment Corporation (DEC) PDP-11 computer, dataphone interfaces, and interfaces that would attach them directly to the mainframe computers. The cost was to be slightly less than the $300,000 ($2,487,100, adjusted for inflation) originally budgeted. Merit staff wrote the software that ran on the CCs, while staff at each of the universities wrote the mainframe software to interface to the CCs. The first completed connection linked the IBM S/360-67 mainframe computers running the Michigan Terminal System at WSU and U-M, and was publicly demonstrated on December 14, 1971. The MSU node was completed in October 1972, adding a CDC 6500 mainframe running Scope/Hustler. The network was officially dedicated on May 15, 1973. === Expanding the network: 1974 to 1985 === In 1974, Herzog returned to teaching in the University of Michigan's Industrial Engineering Department, and Aupperle was appointed as director. Use of the all uppercase name "MERIT" was abandoned in favor of the mixed case "Merit". The first network connections were host to host interactive connections which allowed person to remote computer or local computer to remote computer interactions. To this, terminal to host connections, batch connections (remote job submission, remote printing, batch file transfer), and interactive file copy were added. And, in addition to connecting to host computers over custom hardware interfaces, the ability to connect to hosts or other networks over groups of asynchronous ports and via X.25 were added. Merit interconnected with Telenet (later SprintNet) in 1976 to give Merit users dial-in access from locations around the United States. Dial-in access within the U.S. and internationally was further expanded via Merit's interconnections to Tymnet, ADP's Autonet, and later still the IBM Global Network as well as Merit's own expanding network of dial-in sites in Michigan, New York City, and Washington, D.C. In 1978, Western Michigan University (WMU) became the fourth member of Merit (prompting a name change, as the acronym Merit no longer made sense as the group was no longer a triad). To expand the network, the Merit staff developed new hardware interfaces for the Digital PDP-11 based on printed circuit technology. The new system became known as the Primary Communications Processor (PCP), with the earliest PCPs connecting a PDP-10 located at WMU and a DEC VAX running UNIX at U-M's Electrical Engineering department. A second hardware technology initiative in 1983 produced the smaller Secondary Communication Processors (SCP) based on DEC LSI-11 processors. The first SCP was installed at the Michigan Union in Ann Arbor, creating UMnet, which extended Merit's network connectivity deeply into the U-M campus. In 1983 Merit's PCP and SCP software was enhanced to support TCP/IP and Merit interconnected with the ARPANET. === National networking, NSFNET, and the Internet: 1986 to 1995 === In 1986 Merit engineered and operated leased lines and satellite links that allowed the University of Michigan to access the supercomputing facilities at Pittsburgh, San Diego, and NCAR. In 1987, Merit, IBM and MCI submitted a winning proposal to NSF to implement a new NSFNET backbone network. The new NSFNET backbone network service began July 1, 1988. It interconnected supercomputing centers around the country at 1.5 megabits per second (T1), 24 times faster than the 56 kilobits-per-second speed of the previous network. The NSFNET backbone grew to link scientists and educators on university campuses nationwide and connect them to their counterparts around the world. The NSFNET project caused substantial growth at Merit, nearly tripling the staff and leading to the establishment of a new 24-hour Network Operations Center at the U-M Computer Center. In September 1990 in anticipation of the NSFNET T3 upgrade and the approaching end of the 5-year NSFNET cooperative agreement, Merit, IBM, and MCI formed Advanced Network and Services (ANS), a new non-profit corporation with a more broadly based Board of Directors than the Michigan-based Merit Network. Under its cooperative agreement with NSF, Merit remained ultimately responsible for the operation of NSFNET, but subcontracted much of the engineering and operations work to ANS. In 1991 the NSFNET backbone service was expanded to additional sites and upgraded to a more robust 45 Mbit/s (T3) based network. The new T3 backbone was named ANSNet and provided the physical infrastructure used by Merit to deliver the NSFNET Backbone Service. On April 30, 1995, the NSFNET project came to an end, when the NSFNET backbone service was decommissioned and replaced by a new Internet architecture with commercial Internet service providers (ISPs) interconnected at Network Access Points provided by multiple providers across the country. === Bringing the Internet to Michigan: 1985 to 2001 === During the 1980s, Merit Network grew to serve eight member universities, with Oakland University joining in 1985 and Central Michigan University, Eastern Michigan University, and Michigan Technological University joining in 1987. In 1990, Merit's board of directors formally changed the organization's name to Merit Network, Inc., and created the name MichNet to refer to Merit's statewide network. The board also approved a staff proposal to allow organizations other than publicly supported universities, referred to as aff

    Read more →
  • Data transformation (computing)

    Data transformation (computing)

    In computing, data transformation is the process of converting data from one format or structure into another format or structure. It is a fundamental aspect of most data integration and data management tasks such as data wrangling, data warehousing, data integration and application integration. Data transformation can be simple or complex based on the required changes to the data between the source (initial) data and the target (final) data. Data transformation is typically performed via a mixture of manual and automated steps. Tools and technologies used for data transformation can vary widely based on the format, structure, complexity, and volume of the data being transformed. A master data recast is another form of data transformation where the entire database of data values is transformed or recast without extracting the data from the database. All data in a well-designed database is directly or indirectly related to a limited set of master database tables by a network of foreign key constraints. Each foreign key constraint is dependent upon a unique database index from the parent database table. Therefore, when the proper master database table is recast with a different unique index, the directly and indirectly related data are also recast or restated. The directly and indirectly related data may also still be viewed in the original form since the original unique index still exists with the master data. Also, the database recast must be done in such a way as to not impact the applications architecture software. When the data mapping is indirect via a mediating data model, the process is also called data mediation. == Data transformation process == Data transformation can be divided into the following steps, each applicable as needed based on the complexity of the transformation required. Data discovery Data mapping Code generation Code execution Data review These steps are often the focus of developers or technical data analysts who may use multiple specialized tools to perform their tasks. The steps can be described as follows: Data discovery is the first step in the data transformation process. Typically the data is profiled using profiling tools or sometimes using manually written profiling scripts to better understand the structure and characteristics of the data and decide how it needs to be transformed. Data mapping is the process of defining how individual fields are mapped, modified, joined, filtered, aggregated etc. to produce the final desired output. Developers or technical data analysts traditionally perform data mapping since they work in the specific technologies to define the transformation rules (e.g. visual ETL tools, transformation languages). Code generation is the process of generating executable code (e.g. SQL, Python, R, or other executable instructions) that will transform the data based on the desired and defined data mapping rules. Typically, the data transformation technologies generate this code based on the definitions or metadata defined by the developers. Code execution is the step whereby the generated code is executed against the data to create the desired output. The executed code may be tightly integrated into the transformation tool, or it may require separate steps by the developer to manually execute the generated code. Data review is the final step in the process, which focuses on ensuring the output data meets the transformation requirements. It is typically the business user or final end-user of the data that performs this step. Any anomalies or errors in the data that are found and communicated back to the developer or data analyst as new requirements to be implemented in the transformation process. == Types of data transformation == === Batch data transformation === Traditionally, data transformation has been a bulk or batch process, whereby developers write code or implement transformation rules in a data integration tool, and then execute that code or those rules on large volumes of data. This process can follow the linear set of steps as described in the data transformation process above. Batch data transformation is the cornerstone of virtually all data integration technologies such as data warehousing, data migration and application integration. When data must be transformed and delivered with low latency, the term "microbatch" is often used. This refers to small batches of data (e.g. a small number of rows or a small set of data objects) that can be processed very quickly and delivered to the target system when needed. === Benefits of batch data transformation === Traditional data transformation processes have served companies well for decades. The various tools and technologies (data profiling, data visualization, data cleansing, data integration etc.) have matured and most (if not all) enterprises transform enormous volumes of data that feed internal and external applications, data warehouses and other data stores. === Limitations of traditional data transformation === This traditional process also has limitations that hamper its overall efficiency and effectiveness. The people who need to use the data (e.g. business users) do not play a direct role in the data transformation process. Typically, users hand over the data transformation task to developers who have the necessary coding or technical skills to define the transformations and execute them on the data. This process leaves the bulk of the work of defining the required transformations to the developer, which often in turn do not have the same domain knowledge as the business user. The developer interprets the business user requirements and implements the related code/logic. This has the potential of introducing errors into the process (through misinterpreted requirements), and also increases the time to arrive at a solution. This problem has given rise to the need for agility and self-service in data integration (i.e. empowering the user of the data and enabling them to transform the data themselves interactively). There are companies that provide self-service data transformation tools. They are aiming to efficiently analyze, map and transform large volumes of data without the technical knowledge and process complexity that currently exists. While these companies use traditional batch transformation, their tools enable more interactivity for users through visual platforms and easily repeated scripts. Still, there might be some compatibility issues (e.g. new data sources like IoT may not work correctly with older tools) and compliance limitations due to the difference in data governance, preparation and audit practices. === Interactive data transformation === Interactive data transformation (IDT) is an emerging capability that allows business analysts and business users the ability to directly interact with large datasets through a visual interface, understand the characteristics of the data (via automated data profiling or visualization), and change or correct the data through simple interactions such as clicking or selecting certain elements of the data. Although interactive data transformation follows the same data integration process steps as batch data integration, the key difference is that the steps are not necessarily followed in a linear fashion and typically don't require significant technical skills for completion. There are a number of companies that provide interactive data transformation tools, including Trifacta, Alteryx and Paxata. They are aiming to efficiently analyze, map and transform large volumes of data while at the same time abstracting away some of the technical complexity and processes which take place under the hood. Interactive data transformation solutions provide an integrated visual interface that combines the previously disparate steps of data analysis, data mapping and code generation/execution and data inspection. That is, if changes are made at one step (like for example renaming), the software automatically updates the preceding or following steps accordingly. Interfaces for interactive data transformation incorporate visualizations to show the user patterns and anomalies in the data so they can identify erroneous or outlying values. Once they've finished transforming the data, the system can generate executable code/logic, which can be executed or applied to subsequent similar data sets. By removing the developer from the process, interactive data transformation systems shorten the time needed to prepare and transform the data, eliminate costly errors in the interpretation of user requirements and empower business users and analysts to control their data and interact with it as needed. == Transformational languages == There are numerous languages available for performing data transformation. Many transformation languages require a grammar to be provided. In many cases, the grammar is structured using something closely resembling Backus–Naur form (BNF). There are numerous languages

    Read more →