AI Content Understanding

AI Content Understanding — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Night Sky (app)

    Night Sky (app)

    Night Sky (app) is an application developed and published by indie studio iCandi Apps Ltd. from the UK. Night Sky is a stargazing reference app, where the user can explore a virtual representation of the night sky to identify stars, planets, constellations and satellites. The app is developed specifically for iOS, tvOS and watchOS devices. Night Sky was first released on November 1, 2011 for iOS, and has had multiple updates since launch. Night Sky was mentioned in the September 2016 Apple Keynote during the Apple Watch Series 2 announcement. In October 2016, Night Sky was featured as the Free App of The Week on the Apple App Store. == Reception == Night Sky was featured in Apple's 'Best of 2012' and has also been pre-installed onto iPads in Apple retail stores worldwide.

    Read more →
  • List of broadband over power line deployments

    List of broadband over power line deployments

    This is a list of broadband over power line deployments. In this sense, "broadband" usually refers to Internet access using power line communication technology. == BPL pilot projects - 1st Gen (UPA) == === Inactive pilot projects === North America: United States: The United Telecom Council publishes the Federal Communications Commission (FCC)-mandated BPL Interference Resolution website, which provides a list of all BPL deployments in the US. Canada: Quebec: As of 2005, PLC communication technology developed by Ariane Controls is being installed inside and outside existing buildings to control lights and other energy-hungry devices. The cheap devices allow energy consumption to be better managed, and so save much energy and bring a clear return on investment. Western Europe: Sweden: Vattenfall is using PLC technology at 1200 baud for automatic meter reading based on an Iskraemeco product. Central and Eastern Europe, and Eurasia: Russian Federation: Electro-com has deployed widely BPL/PLC technology and offers internet access service in Moscow, Nizhny Novgorod, Ryazan, Kaluga and Rostov-on-Don, planning to extend coverage to main Russian cities. Currently the company does not provide other services, though plans to start providing telephone, and television services someday. Base equipment is a DefiDev modem with a DS2 chipset. The company had 35,000 subscribers and an annual growth of 15-20%. The company has, however, halted operations in Moscow in September, 2008, having sold its client network to an IDSL internet provider. Romania: In January, 2006, the Ministry of Communications and Information Technology introduced a PLC trial in the rural locality of Band, Mureș County, offering phone and broadband internet access for €7 per month. The technology was introduced to 50 households. Montenegro: In March, 2002, the Internet Crna Gora biggest internet provider in Montenegro launched a pilot project in town of Cetinje. Serbia: In August 2002, the Star Engineering from Niš launched a pilot project to show a completely new way to access the Internet, which is a new in that time in most countries around the world. Hungary: The first powerline service in Hungary was realized in September, 2003, in the Riverside apartment house in Budapest by 23Vnet Ltd. The PLC equipment was supplied by ASCOM Powerline. After four months the service was counting 100 users from 450 apartment owners. The bandwidth is 4.5 Mbit/s. Asia, Pacific, and Oceania: Indonesia: PT Kejora Gemilang Internusa "KEJORA", under their banner PLANET BROADBAND, is currently rolling out broadband over power line, with over 300,000 homes expected to be enabled by August 2010. PT. Kejora Gemilang Internusa signed an 8-year Joint Venture concession agreement with ICON+ a division of PT. Perusahaan Listrik Negara (Indonesia electricity company). Under the terms of the agreement PLAnet Broadband are to supply BPL/PLC to Jakarta West and West Java. Another company, PT. Broadband Powerline Indonesia, has been developing broadband over power line in apartment buildings since 2006. PT. BPI also produces data couplers to make broadband over powerline possible in three phases (R, S, T) with a single master. India : In India IIIT Allahabad has completed a project in co-operation with Corinex Communications Canada to implement a prototype of BPL for University campus and nearby villages. Africa and the Middle East: Egypt: The Engineering Office for Integrated Projects (EOIP) has deployed PLC technology widely in Alexandria, Fayed, and Tanta. Based on a locally developed system, the company provides AMR for electricity utilities. Currently, the company has about 70,000 subscribers. South Africa: Goal Technology Solutions (GTS) trialled the technology and is offering service in the suburbs of Pretoria, and plans to extend it to other areas. The tests were done with Mitsubishi equipment using a DS2 chipset, and the company claims a maximum throughput of 90 Mbit/s although initially only "512 Kbits/s ADSL equivalent speeds" are available. Now it uses DefiDev's equipment, and according to GTS's website, it will expand available bandwidth up to 5-20 Mbit/s. Ghana: Cactel Communications, Ltd. successfully deployed an MV solution pilot project in the Graphic Communications Group in Accra in June, 2005. A Cactel Remote Energy Management System (REMS) pilot project for the Electricity Company of Ghana (ECG) is running a 40-user pilot project at the University of Ghana in Legon. The current project combines fiber, radio link, Wi-Fi and PLC to provide broadband internet access and telephony. It showcases the interoperability of PLC technology and the company's expertise in emerging market design and deployment. Cactel hopes to deploy nationally, and is in deliberations with the national stakeholders and with Ghana's Ministry of Communications (MoC). AllTerra Communications successfully implemented a pilot test of broadband over power lines in Akosombo. In partnership with VRA, this test involves demonstrating transmission of broadband from medium to low voltage signals. AllTerra is working with VRA to expand the pilot project to include essential grid management utilities that will help balance and manage the current electricity transmission throughout their various substations. Using IT as a catalyst for economic development, AllTerra is expanding into numerous areas throughout Ghana. Vobiss Solutions Ltd successfully implemented a Hybrid Fibre BPL pilot network within EMEFS Hillview Estate in collaboration with ECG. Saudi Arabia: ElectroNet has been working with the Saudi Electric Company since 2005 on a pilot project using broadband over power lines over medium voltage cables and linking into low voltage distribution within a shopping mall. The pilot project also integrates automatic meter readers. Powerlines Communications Co. Ltd. implemented an AMR pilot project for Saudi Electricity Company in 2006. The project was located in the city of Jeddah on the west coast of Saudi Arabia. Digital KWh meters were installed in parallel with analog KWh meters. Readings taken by the Saudi Electricity Company showed variations of less than 1%. A BPL pilot project was included. Saudi Arabian Computer Management Consultants (SACMAC) has signed a deal to become an official system integrator and distributor for Mitsubishi PLC. It is expected to become a great success, because the existing broadband service, monopolized by the Saudi Telecom Company, is expensive and has poor customer service (some clients report that company techs arrive months after ordering). SACMAC has declined to talk about specifics of availability and price but says it will start rolling out the service in a few months (as of May 2006) and its price will be lower than current broadband providers. === Concluded pilot projects === The following pilot projects have ended: Australia, Tasmania: In November 2007, electricity retailer Aurora Energy ended its involvement with BPL and announced it was switching to Optical Fiber. This ended their commercial trial begun in September 2005, offering BPL services to 500 homes in the suburb of Tolmans Hill near Hobart, which had followed a successful technological trial earlier that year. Portugal ended BPL/PLC deployments in the country in October 2006, reportedly for economic reasons., Russian Federation: In September 2008, Russia's only BPL provider Electro-com ended deployments in Moscow for economic reasons. Spain: In May 2007 Iberdrola and Endesa (the main power companies in Spain) ended their projects to deploy PLC. United States: As of July 2010, the City of Manassas, VA has shut down their BPL deployment, which was the largest in the country. As of April 2007, Motorola has shuttered its Powerline LV Access BPL and reportedly plans to re-purpose the technology to a new system called Powerline MU, which is for use within multiple-unit dwellings. Motorola's system uses only residential-side low-voltage power lines for transmission to reduce the antenna effect, and successfully demonstrated frequency-notching for reduced potential for interference over the Amperion Inc. and Current Technologies LLC systems. Motorola invited the American Radio Relay League to participate with these tests, and even installed the Motorola system at their headquarters. Preliminary results were very positive with regard to interference, because the Motorola system does not use BPL on the powerlines leading up to the neighborhood. The BPL carrier is only used for the last leg of the trip from the pole to the house, and gets the signal to the pole via radio. This limits the interference to the area surrounding the last leg to the house. === Dismantled pilot projects === The following other BPL trials in the US are dismantled as of May 2008:

    Read more →
  • Blocknots

    Blocknots

    Blocknots were random sequences of numbers contained in a book and organized by numbered rows and columns and were used as additives in the reciphering of Soviet Union codes, during World War II. The Blocknot consisted of a booklet of fifty sheets of 5-figure random additive, 100 additive groups to a sheet. No sheet was used more than once, thus the blocknots were in effect a form of one-time pad. The Soviet Unions highest grade ciphers that were used in the East, were the 5-figure codebook enciphered with the Blocknot book, and were generally considered unbreakable. == Technical Description == Blocknots were distributed centrally from an office in Moscow. Every Blocknot contained 5-figure groups in a number of sheets, for the enciphering of 5-figure messages. The encipherment was effected by applying additives taken from the pad, of which 50-100 5-figure groups appeared. Each pad had a 5-figure number and each sheet had a 2-figure number running consecutively. There were 5 different types of Blocknots, in two different categories The Individual in which each table of random numbers was used only once. The General in which each page of the Blocknot was valid for one day. The security of the additive sequence rested on the choice of different starting points for each message. In 5-figure messages, the blocknot was one of the first 10 Groups in the message. Its position changed at long intervals, but was always easy to re-identify. The Russians differentiated between three types of blocks: The 3-block, DRIERBLOCK. I-block for Individual Block: 50 pages, additive read off in one direction only. The messages could be used and read only between 2 wireless telegraphy stations on one net. The 6-block, SECHSERBLOCK. Z-block for Circular Block: 30 pages, additive read off in either direction. The messages could be used and read, between all W/T stations in a net. The 2-block, ZWEIERBLOCK. OS-block. Used only in traffic from lower to higher formations. Two other types were used, in lower echelons. Notblock: Used in an emergency. Blocknot used for passing on traffic. The distribution of Blocknots was carried out centrally from Moscow to Army Groups then to Armies. The Army was responsible for their distribution throughout the lower levels of the army down to company level. Independent units took their cipher material with them. Occasionally the same blocknot was distributed to two units on different parts of the front, which enabled Depth to be established. Records of all Blocknots used were kept in Berlin and when a repeat was noticed a BLOCKNOT ANGEBOT message was sent out to all German Signals units, to indicate that it may have been possible to break the code using it. There was no certainty in this. A cryptanalyst with the General der Nachrichtenaufklärung stated while being interrogated by TICOM: It seems that depths of up to 8 were established at the beginning of the Russian Campaign but that no 5-figure code was broken after May 1943 German cryptanalysts who were prisoners of war stated under interrogation, that each of the figures 0 to 9 were placed en clair usually within the first ten groups of the text or sometimes at the end. One indicator was the Blocknot number and the consisted of two random figures, the figure representing the type, and the remaining two, the page of the Blocknot being used. In long messages, 000000 was placed in the message when the end of a page had been reached. == Chi number == The Chi-number was the serial numbering of all 5-figure messages passing through the hands of the Cipher Officer, starting on the first of January and ending on thirty-first December of the current year. It always appeared as the last group in an intercepted message, e.g. 00001 on the 1st January, or when the unit was newly set up. The progression of Chi-numbers was carefully observed and recorded in the form of a graph. A Russian corps had about 10 5-figure messages per day, and Army about 20-30 and a Front about 60–100. After only a relatively short time, the individual curves separated sharply and the type of formation could be recognized by the height of the Chi-number alone. == Monitoring == Blocknots were tracked in a card index, that was maintained by the Signal Intelligence Evaluation Centre (NAAS). The NAAS functionality included evaluation and traffic analysis, cryptanalysis, collation and dissemination of intelligence. The card index, which was one amongst several Card Indexes. A careful recording and study of blocks provided the positive clues in the identification and tracking of formations using 5-figure ciphers. The index was subdivided into two files: Search card index, contained all blocknots and chi-numbers whether or not they were known. Unit card index, contained only known Block and Chi-numbers. Inspector Berger, who was the chief cryptanalyst of NAAS 1 stated that the two files formed: The most important and surest instruments for identifying Russian radio nets, known to him. The Blocknots were also used in the Stationary Intercept Company (Feste), the military unit that were designed to work at a lower level to the NAAS, at the Army level and were semi-motorized, and closer to the front. The Feste used the Blocknot value along with several other parameters to build a network diagram. The network diagram was studied extensively, as part of a 6-stage process, that involved several departments within the Feste. The outcome was a metric which determined the most interesting circuit for traffic monitoring, and least interesting, where monitoring of traffic should cease. == Analysis == Johannes Marquart was a mathematician and cryptanalyst who initially worked for Inspectorate 7/VI and later led Referat Ia of Group IV of the General der Nachrichtenaufklärung. Marquart was assigned the study of the Soviet Union Blocknot traffic. Marquart and his unit conducted extensive research in an attempt to discover the method by which they were produced. All the counts which they made, however, failed to reveal any non-random characteristics in the design of the tables, and while they thought the Blocknots must have been generated by machine, they were never able to draw any concrete deductions as a result of their research. == Example == The Soviet 3rd Guard Tank Army transmits a 5-figure message with the Blocknot of 37581 (one of the first 10 groups in the message). On the same day the Block 37582 was used by the same formation. The next day 37583 appeared. Thereafter, for a period, the Army was not heard by German Wireless telegraphy intercept operators, as it was maintaining wireless silence. After a few days, an unidentified net with the Blocknot 37588 is picked up. This message net is claimed, because of the proximity of the blocks (88/83) to be the 3rd Guard Tank Army. The missing Blocknots 84-87 were presumably used in telegraphic, telephonic or courier communications. The Chi number provides confirmation of the first assumption, based on proximity of blocknots in most cases.

    Read more →
  • Data recovery

    Data recovery

    In computing, data recovery is a process of retrieving deleted, inaccessible, lost, corrupted, damaged, or overwritten data from secondary storage, removable media or files, when the data stored in them cannot be accessed in a usual way. The data is most often salvaged from storage media such as internal or external hard disk drives (HDDs), solid-state drives (SSDs), USB flash drives, magnetic tapes, CDs, DVDs, RAID subsystems, and other electronic devices. Recovery may be required due to physical damage to the storage devices or logical damage to the file system that prevents it from being mounted by the host operating system (OS). Logical failures occur when the hard drive devices are functional but the user or automated-OS cannot retrieve or access data stored on them. Logical failures can occur due to corruption of the engineering chip, lost partitions, firmware failure, or failures during formatting/re-installation. Data recovery can be a very simple or technical challenge. This is why there are specific software companies specialized in this field that help to get back data on your system. == About == The most common data recovery scenarios involve an operating system failure, malfunction of a storage device, logical failure of storage devices, accidental damage or deletion, etc. (typically, on a single-drive, single-partition, single-OS system), in which case the ultimate goal is simply to copy all important files from the damaged media to another new drive. This can be accomplished using a Live CD, or DVD by booting directly from a ROM or a USB drive instead of the corrupted drive in question. Many Live CDs or DVDs provide a means to mount the system drive and backup drives or removable media, and to move the files from the system drive to the backup media with a file manager or optical disc authoring software. Such cases can often be mitigated by disk partitioning and consistently storing valuable data files (or copies of them) on a different partition from the replaceable OS system files. Another scenario involves a drive-level failure, such as a compromised file system or drive partition, or a hard disk drive failure. In any of these cases, the data is not easily read from the media devices. Depending on the situation, solutions involve repairing the logical file system, partition table, or master boot record, or updating the firmware or drive recovery techniques ranging from software-based recovery of corrupted data, to hardware- and software-based recovery of damaged service areas (also known as the hard disk drive's "firmware"), to hardware replacement on a physically damaged drive which allows for the extraction of data to a new drive. If a drive recovery is necessary, the drive itself has typically failed permanently, and the focus is rather on a one-time recovery, salvaging whatever data can be read. In a third scenario, files have been accidentally "deleted" from a storage medium by the users. Typically, the contents of deleted files are not removed immediately from the physical drive; instead, references to them in the directory structure are removed, and thereafter space the deleted data occupy is made available for later data overwriting. In the mind of end users, deleted files cannot be discoverable through a standard file manager, but the deleted data still technically exists on the physical drive. In the meantime, the original file contents remain, often several disconnected fragments, and may be recoverable if not overwritten by other data files. The term "data recovery" is also used in the context of forensic applications or espionage, where data which have been encrypted, hidden, or deleted, rather than damaged, are recovered. Sometimes data present in the computer gets encrypted or hidden due to reasons like virus attacks which can only be recovered by some computer forensic experts. == Physical damage == A wide variety of failures can cause physical damage to storage media, which may result from human errors and natural disasters. CD-ROMs can have their metallic substrate or dye layer scratched off; hard disks can suffer from a multitude of mechanical failures, such as head crashes, PCB failure, and failed motors; tapes can simply break. Physical damage to a hard drive, even in cases where a head crash has occurred, does not necessarily mean permanent data loss. However, in extreme cases, such as prolonged exposure to moisture and corrosion —like the lost Bitcoin hard drive of James Howells, buried in the Newport landfill for over a decade — recovery is usually impossible. In rare cases, forensic techniques such as magnetic force microscopy (MFM) have been explored to detect residual magnetic traces when data holds exceptional value. Other techniques employed by many professional data recovery companies can typically salvage most, if not all, of the data that had been lost when the failure occurred. Of course, there are exceptions to this, such as cases where severe damage to the hard drive platters may have occurred. However, if the hard drive can be repaired and a full image or clone created, then the logical file structure can be rebuilt in most instances. Most physical damage cannot be repaired by end users. For example, opening a hard disk drive in a normal environment can allow airborne dust to settle on the platter and become caught between the platter and the read/write head. During normal operation, read/write heads float 3 to 6 nanometers above the platter surface, and the average dust particles found in a normal environment are typically around 30,000 nanometers in diameter. When these dust particles get caught between the read/write heads and the platter, they can cause new head crashes that further damage the platter and thus compromise the recovery process. Furthermore, end users generally do not have the hardware or technical expertise required to make these repairs. Consequently, data recovery companies are often employed to salvage important data with the more reputable ones using class 100 dust- and static-free cleanrooms. === Recovery techniques === Recovering data from physically damaged hardware can involve multiple techniques. Some damage can be repaired by replacing parts in the hard disk. This alone may make the disk usable, but there may still be logical damage. A specialized disk-imaging procedure is used to recover every readable bit from the surface. Once this image is acquired and saved on a reliable medium, the image can be safely analyzed for logical damage and will possibly allow much of the original file system to be reconstructed. ==== Hardware repair ==== A common misconception is that a damaged printed circuit board (PCB) may be simply replaced during recovery procedures by an identical PCB from a healthy drive. While this may work in rare circumstances on hard disk drives manufactured before 2003, it will not work on newer drives. Electronics boards of modern drives usually contain drive-specific adaptation data (generally a map of bad sectors and tuning parameters) and other information required to properly access data on the drive. Replacement boards often need this information to effectively recover all of the data. The replacement board may need to be reprogrammed. Some manufacturers (Seagate, for example) store this information on a serial EEPROM chip, which can be removed and transferred to the replacement board. Each hard disk drive has what is called a system area or service area; this portion of the drive, which is not directly accessible to the end user, usually contains drive's firmware and adaptive data that helps the drive operate within normal parameters. One function of the system area is to log defective sectors within the drive; essentially telling the drive where it can and cannot write data. The sector lists are also stored on various chips attached to the PCB, and they are unique to each hard disk drive. If the data on the PCB do not match what is stored on the platter, then the drive will not calibrate properly. In most cases the drive heads will click because they are unable to find the data matching what is stored on the PCB. == Logical damage == The term "logical damage" refers to situations in which the error is not a problem in the hardware and requires software-level solutions. === Corrupt partitions and file systems, media errors === In some cases, data on a hard disk drive can be unreadable due to damage to the partition table or file system, or to (intermittent) media errors. In the majority of these cases, at least a portion of the original data can be recovered by repairing the damaged partition table or file system using specialized data recovery software such as TestDisk; software like ddrescue can image media despite intermittent errors, and image raw data when there is partition table or file system damage. This type of data recovery can be performed by people without expertise in drive hardware as it requires no special physica

    Read more →
  • AI nationalism

    AI nationalism

    AI nationalism is the idea that nations should develop and control their own artificial intelligence technologies to advance their own interests and ensure technological sovereignty. This concept is gaining traction globally, leading countries to implement new laws, form strategic alliances, and invest significantly in domestic AI capabilities. == Global trends and national strategies == In 2018, British technology investor Ian Hogarth published an influential essay titled AI Nationalism. He argued that as AI gains more power and its economic and military significance expands, governments will take measures to bolster their own domestic AI industries, and predicted that the advancement of machine learning systems would lead to what he termed "AI nationalism." He anticipated that this rise in AI would accelerate a global arms race, resulting in more closed economies, restrictions on foreign acquisitions, and limitations on the movement of talent. Hogarth predicted that AI policy would become a central focus of government agendas. He also criticized Britain’s approach to AI strategy, citing the sale of London-based DeepMind—one of the leading AI laboratories, acquired by Google for a relatively modest £400 million in 2014—as a significant misstep. AI nationalism is chiefly reflected in the escalating rhetoric of an artificial intelligence arms race, portraying AI development as a zero-sum game where the winner gains significant economic, political, and military advantages. This mindset, as highlighted in a 2017 Pentagon report, warns that sharing AI technology could erode technological supremacy and enhance rivals' capabilities. The winner-takes-all mentality of AI nationalism poses risks including unsafe AI development, increased geopolitical tension, and potential military aggression (such as cyberattacks or targeting AI professionals). Several countries, including Canada, France, and India, have formulated national strategies to advance their positions in AI. In the United States, a leading player in the global AI arena, trade policies have been enacted to restrict China's access to critical microchips, reflecting a strategic effort to maintain a technological edge. The United States’ National Security Commission on Artificial Intelligence (NSCAI) frames AI development as a critical aspect of a broader technology competition crucial for national success. It emphasizes the need to outpace China in AI to maintain strategic advantage, reflecting AI nationalism by linking geopolitical power directly to advancements in AI. France has seen notable governmental support for local AI startups, particularly those specializing in language technologies that cater to French and other non-English languages. In Saudi Arabia, Crown Prince Mohammed bin Salman is investing billions in AI research and development. The country has actively collaborated with major technology firms such as Amazon, IBM, and Microsoft to establish itself as a prominent AI hub. == Historical and cultural context == AI nationalism is seen as deeply connected to historical racism and imperialism. It is viewed not merely as a technological competition but as a contest over racial and civilizational superiority. Historically, technological achievements were often used to justify colonialism and racial hierarchies, with Western societies perceiving their advancements as evidence of superiority. In the context of AI, this historical context continues to shape views on intelligence and development. Some argue that AI nationalism reinforces the idea of fundamental civilizational divides, especially between the Western world and China. This perspective often frames China's progress in AI as a direct challenge to Western values, presenting the AI competition as a struggle over values. AI nationalism is said to draw from long-standing anti-Asian stereotypes, such as the "Yellow Peril," which portray Asian nations as threats to Western civilization. This viewpoint links Asian technological advances with dehumanization and artificiality, reflecting persistent anxieties about China's growing role in the global tech landscape. == Implications == AI nationalism is seen as a component of a broader trend towards the fragmentation of the internet, where digital services are increasingly influenced by local regulations and national interests. This shift is creating a new technological landscape in which the impact of artificial intelligence on individuals' lives can vary significantly depending on their geographic location. J. Paul Goode argues that AI nationalism may exacerbate existing societal divisions by promoting the development of systems that embed cultural biases, thereby privileging certain groups while disadvantaging others.

    Read more →
  • Sumazi

    Sumazi

    Sumazi is a social media and social intelligence platform for enterprises, brands, and celebrities. Its technology performs social data analysis across social networking services including Facebook, Twitter and LinkedIn, to identify key people in his/her network who are experts, influencers or are located in a specific area for marketing, advertising or sales campaigns. The technology company was founded in 2011 by former Sun Microsystems employee Sumaya Kazi. The company was headquartered in San Francisco, California. The company was out of business by 2017. == Reception == Sumazi was one of 25 startups selected out of more than 1,200 to compete at TechCrunch Disrupt Startup Battlefield, where it won the Omidyar Network award for the startup "Most Likely to Change the World." Sumazi, which was based out of San Francisco, California, had been profiled in The New York Times as well as USA Today, which commented the advantages of the startup's location in the Silicon Valley. American Express OPEN Forum also featured Sumazi as a "Startup of the Week". Sumazi has additionally been mentioned in articles by Mashable, The Wall Street Journal, Current Editorials, Harvard Business Review, Smashing Magazine, and TechCrunch.

    Read more →
  • Data thinking

    Data thinking

    Data Thinking is a framework that integrates data science with the design process. It combines computational thinking, statistical thinking, and domain-specific knowledge to guide the development of data-driven solutions in product development. The framework is used to explore, design, develop, and validate solutions, with a focus on user experience and data analytics, including data collection and interpretation The framework aims to apply data literacy and inform decision-making through data-driven insights. == Major components == According to "Computational thinking in the era of data science": Data thinking involves understanding that solutions require both data-driven and domain-knowledge-driven rules. Data thinking evaluates whether data accurately represents real-life scenarios and improves data collection where necessary. The framework highlights the importance of preserving domain-specific meaning during data analysis. Data thinking incorporates statistical and logical analysis to identify patterns and irregularities. Data thinking involves testing solutions in real-life contexts and iteratively improving models based on new data. The process requires evaluating problems from multiple abstraction levels and understanding the potential for biases in generalizations. == Major phases == === Strategic context and risk analysis === Analyzing the broader digital strategy and assessing risks and opportunities is a common step before beginning a project. Techniques like coolhunting, trend analysis, and scenario planning can be used to assist with this. === Ideation and exploration === In this phase, focus areas are identified, and use cases are developed by integrating organizational goals, user needs, and data requirements. Design thinking methods, such as personas and customer journey mapping, are applied. === Prototyping === A proof of concept is created to test feasibility and refine solutions through iterative evaluation to optimize for effective performance. === Implementation and monitoring === Solutions are tested and monitored for performance and continual improvement. == Implementing Data Thinking == The following resources explain more about data thinking and its applications: "Data Thinking: Framework for data-based solutions" by StackFuel "What is Data Thinking? A modern approach to designing a data strategy" by Mantel Group "Data Science Thinking" by SpringerLink These sources provide detailed insights into the methodology, phases, and benefits of adopting Data Thinking in organizational processes.

    Read more →
  • Why We Post

    Why We Post

    Why We Post is a research project funded by the European Research Council and launched in 2012 by Daniel Miller with the objective of examining the global impact of new social media. The study is based on ethnographic data collected through the course of 15 months in China, India, Turkey, Italy, United Kingdom, Trinidad, Chile and Brazil. The results of this project were released on 29 February 2016. This included the first three of eleven Open Access books (available via UCL Press), a five-week e-course (MOOC) on FutureLearn in English, also available in Chinese, Portuguese, Hindi, Tamil, Italian, Turkish, and Spanish on UCLeXtend. In addition a website containing key discoveries, stories and over 100 films is available in the same 8 languages.

    Read more →
  • BERT (language model)

    BERT (language model)

    Bidirectional encoder representations from transformers (BERT) is a language model introduced in October 2018 by researchers at Google. It learns to represent text as a sequence of vectors using self-supervised learning. It uses the encoder-only transformer architecture. BERT dramatically improved the state of the art for large language models. As of 2020, BERT is a ubiquitous baseline in natural language processing (NLP) experiments. BERT is trained by masked token prediction and next sentence prediction. With this training, BERT learns contextual, latent representations of tokens in their context, similar to ELMo and GPT-2. It found applications for many natural language processing tasks, such as coreference resolution and polysemy resolution. It improved on ELMo and spawned the study of "BERTology", which attempts to interpret what is learned by BERT. BERT was originally implemented in the English language at two model sizes, BERTBASE (110 million parameters) and BERTLARGE (340 million parameters). Both were trained on the Toronto BookCorpus (800M words) and English Wikipedia (2,500M words). The weights were released on GitHub. On March 11, 2020, 24 smaller models were released, the smallest being BERTTINY with just 4 million parameters. == Architecture == BERT is an "encoder-only" transformer architecture. At a high level, BERT consists of 4 modules: Tokenizer: This module converts a piece of English text into a sequence of integers ("tokens"). Embedding: This module converts the sequence of tokens into an array of real-valued vectors representing the tokens. It represents the conversion of discrete token types into a lower-dimensional Euclidean space. Encoder: a stack of Transformer blocks with self-attention, but without causal masking. Task head: This module converts the final representation vectors into one-shot encoded tokens again by producing a predicted probability distribution over the token types. It can be viewed as a simple decoder, decoding the latent representation into token types, or as an "un-embedding layer". The task head is necessary for pre-training, but it is often unnecessary for so-called "downstream tasks," such as question answering or sentiment classification. Instead, one removes the task head and replaces it with a newly initialized module suited for the task, and finetune the new module. The latent vector representation of the model is directly fed into this new module, allowing for sample-efficient transfer learning. === Embedding === This section describes the embedding used by BERTBASE. The other one, BERTLARGE, is similar, just larger. The tokenizer of BERT is WordPiece, which is a sub-word strategy like byte-pair encoding. Its vocabulary size is 30,000, and any token not appearing in its vocabulary is replaced by [UNK] ("unknown"). The first layer is the embedding layer, which contains three components: token type embeddings, position embeddings, and segment type embeddings. Token type: The token type is a standard embedding layer, translating a one-hot vector into a dense vector based on its token type. Position: The position embeddings are based on a token's position in the sequence. BERT uses absolute position embeddings, where each position in a sequence is mapped to a real-valued vector. Each dimension of the vector consists of a sinusoidal function that takes the position in the sequence as input. Segment type: Using a vocabulary of just 0 or 1, this embedding layer produces a dense vector based on whether the token belongs to the first or second text segment in that input. In other words, type-1 tokens are all tokens that appear after the [SEP] special token. All prior tokens are type-0. The three embedding vectors are added together representing the initial token representation as a function of these three pieces of information. After embedding, the vector representation is normalized using a LayerNorm operation, outputting a 768-dimensional vector for each input token. After this, the representation vectors are passed forward through 12 Transformer encoder blocks, and are decoded back to 30,000-dimensional vocabulary space using a basic affine transformation layer. === Architectural family === The encoder stack of BERT has 2 free parameters: L {\displaystyle L} , the number of layers, and H {\displaystyle H} , the hidden size. There are always H / 64 {\displaystyle H/64} self-attention heads, and the feed-forward/filter size is always 4 H {\displaystyle 4H} . By varying these two numbers, one obtains an entire family of BERT models. For BERT: the feed-forward size and filter size are synonymous. Both of them denote the number of dimensions in the middle layer of the feed-forward network. the hidden size and embedding size are synonymous. Both of them denote the number of real numbers used to represent a token. The notation for encoder stack is written as L/H. For example, BERTBASE is written as 12L/768H, BERTLARGE as 24L/1024H, and BERTTINY as 2L/128H. == Training == === Pre-training === BERT was pre-trained simultaneously on two tasks: Masked language modeling (MLM): In this task, BERT ingests a sequence of words, where one word may be randomly changed ("masked"), and BERT tries to predict the original words that had been changed. For example, in the sentence "The cat sat on the [MASK]," BERT would need to predict "mat." This helps BERT learn bidirectional context, meaning it understands the relationships between words not just from left to right or right to left but from both directions at the same time. Next sentence prediction (NSP): In this task, BERT is trained to predict whether one sentence logically follows another. For example, given two sentences, "The cat sat on the mat" and "It was a sunny day", BERT has to decide if the second sentence is a valid continuation of the first one. This helps BERT understand relationships between sentences, which is important for tasks like question answering or document classification. ==== Masked language modeling ==== In masked language modeling, 15% of tokens would be randomly selected for masked-prediction task, and the training objective was to predict the masked token given its context. In more detail, the selected token is: replaced with a [MASK] token with probability 80%, replaced with a random word token with probability 10%, not replaced with probability 10%. The reason not all selected tokens are masked is to avoid the dataset shift problem. The dataset shift problem arises when the distribution of inputs seen during training differs significantly from the distribution encountered during inference. A trained BERT model might be applied to word representation (like Word2Vec), where it would be run over sentences not containing any [MASK] tokens. It is later found that more diverse training objectives are generally better. As an illustrative example, consider the sentence "my dog is cute". It would first be divided into tokens like "my1 dog2 is3 cute4". Then a random token in the sentence would be picked. Let it be the 4th one "cute4". Next, there would be three possibilities: with probability 80%, the chosen token is masked, resulting in "my1 dog2 is3 [MASK]4"; with probability 10%, the chosen token is replaced by a uniformly sampled random token, such as "happy", resulting in "my1 dog2 is3 happy4"; with probability 10%, nothing is done, resulting in "my1 dog2 is3 cute4". After processing the input text, the model's 4th output vector is passed to its decoder layer, which outputs a probability distribution over its 30,000-dimensional vocabulary space. ==== Next sentence prediction ==== Given two sentences, the model predicts if they appear sequentially in the training corpus, outputting either [IsNext] or [NotNext]. During training, the algorithm sometimes samples two sentences from a single continuous span in the training corpus, while at other times, it samples two sentences from two discontinuous spans. The first sentence starts with a special token, [CLS] (for "classify"). The two sentences are separated by another special token, [SEP] (for "separate"). After processing the two sentences, the final vector for the [CLS] token is passed to a linear layer for binary classification into [IsNext] and [NotNext]. For example: Given "[CLS] my dog is cute [SEP] he likes playing [SEP]", the model should predict [IsNext]. Given "[CLS] my dog is cute [SEP] how do magnets work [SEP]", the model should predict [NotNext]. === Fine-tuning === BERT is meant as a general pretrained model for various applications in natural language processing. That is, after pre-training, BERT can be fine-tuned with fewer resources on smaller datasets to optimize its performance on specific tasks such as natural language inference and text classification, and sequence-to-sequence-based language generation tasks such as question answering and conversational response generation. The original BERT paper published results demonstrating that a small amount of fine

    Read more →
  • Viber

    Viber

    Rakuten Viber, commonly known as Viber, is a cross-platform voice over IP (VoIP) and instant messaging (IM) software application owned by the Japanese technology company Rakuten Group. The service is available as freeware for Android, iOS, Microsoft Windows, macOS and Linux. Users are registered and identified through a mobile phone number, although the service can also be accessed on desktop platforms without mobile connectivity. In addition to instant messaging, the platform allows users to exchange media such as images, videos and files, and provides a paid international calling service called Viber Out. The software was launched in 2010 by the company Viber Media, founded by Talmon Marco and Igor Magazinnik. Rakuten acquired Viber Media in 2014 and later renamed the company Rakuten Viber. The company is headquartered in Cyprus and maintains offices in London, Manila, Paris, San Francisco, Singapore, Tokyo and Beijing. == History == === Founding (2010) === Viber Media was founded in Tel Aviv, Israel, in 2010 by Talmon Marco and Igor Magazinnik. Marco and Magazinnik are also co-founders of the peer-to-peer media and file-sharing client iMesh. The company was run from Israel and was registered in Cyprus. Sani Maroli and Ofer Smocha soon joined the company as well. Marco said Viber allows instant calling and synchronization with contacts because the ID is the user's cell number. In its early days, Viber relied on a patchwork of outsourcing partners from different countries, commissioning specific solutions from external vendors — including teams based in Cyprus and Belarus. According to the company's statements, development of Viber's core functionality historically originated from its Tel Aviv office — a testament to its roots — even though the legal entity was registered elsewhere. === Early monetisation (2011) === In its first two years of availability, Viber did not generate revenues. It began doing so in 2013, via user payments for Viber Out voice calling and the Viber graphical messaging "sticker market". The company was originally funded by individual investors, described by Marco as "friends and family". They invested $20 million in the company, which had 120 employees as of May 2013. On 24 July 2013, Viber's support system was defaced by the Syrian Electronic Army. According to Viber, no sensitive user information was accessed. By the time Rakuten came forward with its acquisition deal in 2014, Viber had already stopped working with external vendors, choosing instead to consolidate development under its own offices. === Rakuten acquires Viber (2014) === On 13 February 2014, Rakuten announced they had acquired Viber Media for $900 million, and since then Viber has been owned by Rakuten, Inc., an e-commerce conglomerate headquartered in Tokyo. The sale of Viber earned the Shabtai family (Benny, his brother Gilad, and Gilad's son Ofer) some $500 million from their 55.2% stake in the company. At that sale price, the founders each realized over 30 times return on their investments. Later that year, the company established a UK presence with the incorporation of Viber UK Limited in London. Djamel Agaoua became Viber Media CEO in February 2017, replacing co-founder Marco who left in 2015. In July 2017 the corporate name of Viber Media was changed to Rakuten Viber and a new wordmark logo was introduced. Its legal name remains Viber Media, S.à r.l. based in Luxembourg. === Post-acquisition === In August 2015 Viber opened a regional office for Central and Eastern Europe in Sofia to support growth in the region. In 2017, Rakuten Viber and the World Wildlife Fund engaged in a commercial transaction aimed at raising awareness and protecting wildlife. After first using Viber to spread its message in June 2020, the International Federation of the Red Cross launched an official chatbot and community on the messaging app to combat the spread of false information, which they termed an infodemic, about COVID-19. The chatbot is still active as of June 2022, with over 1.4 million subscribers. In 2020, Rakuten Viber and the World Health Organization (the WHO) engaged in a commercial transaction for a chatbot to inform users of issues such as women's health. and an anti-smoking campaign. In the wake of the July–August 2020 Belarusian election protests, to avoid sanctions and harassment from monopolies the company closed its office in Minsk. In 2022, Ofir Eyal became Viber CEO, replacing Djamel Agaoua. Eyal is a Viber veteran; he worked as Vice President of Product in 2014 before his promotion to Chief Operating Officer in 2019. Shortly after the appointment of a new CEO, Viber continued its international expansion. In March 2022, Rakuten announced the opening of a development center in Tbilisi, Georgia, intended to support work on mobile applications and technology projects in the region. In July 2022, Rakuten Viber partnered with Rapyd to launch instant cross-border P2P payments. The company launched payments on the Viber app first in Greece and Germany, and then in other countries. In August, Mineski teamed up with Viber to develop a social minigame platform that can play off Viber's application. In May 2022, Rakuten Viber launched the premium chat service Viber Plus that offers exclusive features, including sticker market privileges, ad-free use, priority Viber support, exclusive badge, unique Viber icon, large file sharing, and more. In 2022, Viber joined the European Union’s Code of Conduct on countering illegal hate speech online. As part of this framework, the company undertook to review reported content and remove material identified as hate speech in accordance with the Code and its platform rules. In January 2024 Rakuten (the company behind Viber) established an office in Kyiv to bring together engineering and marketing departments. Alongside launching its Kyiv office the company joined Diia.City as a resident. Subsequently in October 2024 Rakuten Viber inaugurated an office in Manila to broaden its operations, in the Philippines. The company’s legal entity remains Viber Media S.à r.l., registered in Luxembourg. Viber’s engineering work has been carried out across multiple countries and through external partners, including outsourcing and near-shore vendors. As a result, its development operations are distributed internationally rather than concentrated in a single location. In December 2024, Viber was blocked in Russia. Roskomnadzor announced the nationwide blocking of the messaging app due to non-compliance with local legal requirements. == Security audit == On 4 November 2014, Viber scored 1 out of 7 points on the Electronic Frontier Foundation's "Secure Messaging Scorecard". Viber received a point for encryption during transit but lost points because communications were not encrypted with keys that the provider did not have access to (i.e. the communications were not end-to-end encrypted), users could not verify contacts' identities, past messages were not secure if the encryption keys were stolen (i.e. the service did not provide forward secrecy), the code was not open to independent review (i.e. the code was not open-source), the security design was not properly documented, and there had not been a recent independent security audit. On 14 November 2014, the EFF changed Viber's score to 2 out of 7 after it had received an external security audit from Ernst & Young's Advanced Security Centre. On 19 April 2016, with the announcement of Viber version 6.0, Rakuten added end-to-end encryption to their service. The company said that the encryption protocol had only been audited internally, and promised to commission external audits "in the coming weeks". In May 2016, Viber published an overview of their encryption protocol, saying that it is a custom implementation that "uses the same concepts" as the Signal Protocol. In 2022, Rakuten Viber won a Security Award, by test.de, a tech firm based in Germany where there are over 3 million Viber users. In 2024, Rakuten Viber received SOC certification following an audit conducted by Ernst & Young. The certification relates to the company’s controls for data protection and information security. == Market share == As of December 2016, Viber had 800 million registered users. According to Statista, there are 260 million monthly active users as of January 2019. The Viber messenger is very popular in the Philippines, Greece, Eastern Europe, Russia, the Middle East, and some Asian markets. India was the largest market for Viber as of December 2014 with 33 million registered users, the fifth most popular instant messenger in the country. At the same time there were 30 million users in the United States, 28 million in Russia and 18 million in Brazil. Viber is particularly popular in Eastern Europe, being the most downloaded messaging app on Android in Belarus, Moldova and Ukraine as of 2016. It is also popular in Iraq, Libya and Nepal. Viber is translated in 44 languages and used in more than 190 co

    Read more →
  • SocialIQ

    SocialIQ

    Social IQ (formerly Soovox Inc.) was a San Diego-based influencer marketing platform that measured users' online social influence and connected them with brands for word-of-mouth marketing campaigns. The company was founded in 2009 by Akram Benmbarek and was headquartered in San Diego, California. == History == Akram Benmbarek, who had previously worked in technology finance at Advanced Equities Financial Corp and in wealth management at Morgan Stanley, Merrill Lynch, and UBS, founded the company in mid-2009 under the name Soovox. In October 2011, Benmbarek rebranded the company as SocialIQ. At that time, the company was seeking a Series A round of venture capital, having raised under $1 million in angel seed funding. == Similar metrics == Klout PeerIndex

    Read more →
  • Chunked transfer encoding

    Chunked transfer encoding

    Chunked transfer encoding is a streaming data transfer mechanism available in Hypertext Transfer Protocol (HTTP) version 1.1, defined in RFC 9112 §7.1. In chunked transfer encoding, the data stream is divided into a series of non-overlapping "chunks". The chunks are sent out and received independently of one another. At any given time, no knowledge of the data stream outside the currently-being-processed chunk is necessary for either the sender or the receiver. Each chunk is preceded by its size in bytes and transmission ends when a zero-length chunk is received. The chunked keyword in the Transfer-Encoding header is used to indicate chunked transfer. Chunked transfer encoding is not supported in HTTP/2, which provides its own mechanisms for data streaming. == Rationale == The introduction of chunked encoding provided various benefits: Chunked transfer encoding allows a server to maintain an HTTP persistent connection for dynamically generated content. In this case, the HTTP Content-Length header cannot be used to delimit the content and the next HTTP request/response, as the content size is not yet known. Chunked encoding has the benefit that it is not necessary to generate the full content before writing the header, as it allows streaming of content as chunks and explicitly signaling the end of the content, making the connection available for the next HTTP request/response. Chunked encoding allows the sender to send additional header fields after the message body. This is important in cases where values of a field cannot be known until the content has been produced, such as when the content of the message must be digitally signed. Without chunked encoding, the sender would have to buffer the content until it was complete in order to calculate a field value and send it before the content. == Applicability == For version 1.1 of the HTTP protocol, the chunked transfer mechanism is considered to be always and anyway acceptable, even if not listed in the Transfer-Encoding (TE) request header field, and when used with other transfer mechanisms, should always be applied last to the transferred data and never more than one time. This transfer encoding method also allows additional entity header fields to be sent after the last chunk if the client specified the "trailers" parameter as an argument of the TE request field. The origin server of the response can also decide to send additional entity trailers even if the client did not specify the "trailers" parameter, but only if the metadata is optional (i.e. the client can use the received entity without them). Whenever the trailers are used, the server should list their names in the Trailer header field; three header field types are specifically prohibited from appearing as a trailer field: Content-Length, Trailer, and Transfer-Encoding. == Format == If a Transfer-Encoding field with a value of "chunked" is specified in an HTTP message (either a request sent by a client or the response from the server), the body of the message consists of one or more chunks and one terminating chunk with an optional trailer before the final ␍␊ sequence (i.e. carriage return followed by line feed). Each chunk starts with the number of octets of the data it embeds expressed as a hexadecimal number in ASCII followed by optional parameters (chunk extension) and a terminating ␍␊ sequence, followed by the chunk data. The chunk is terminated by ␍␊. If chunk extensions are provided, the chunk size is terminated by a semicolon and followed by the parameters, each also delimited by semicolons. Each parameter is encoded as an extension name followed by an optional equal sign and value. These parameters could be used for a running message digest or digital signature, or to indicate an estimated transfer progress, for instance. The terminating chunk is a special chunk of zero length. It may contain a trailer, which consists of a (possibly empty) sequence of entity header fields. Normally, such header fields would be sent in the message's header; however, it may be more efficient to determine them after processing the entire message entity. In that case, it is useful to send those headers in the trailer. Header fields that regulate the use of trailers are Transfer-Encoding with the "trailers" parameter (used in requests) and Trailer (used in responses). == Use with compression == HTTP servers often use compression to optimize transmission, for example with Content-Encoding: gzip or Content-Encoding: deflate. If both compression and chunked encoding are enabled, then the content stream is first compressed, then chunked; so the chunk encoding itself is not compressed, and the data in each chunk is compressed holistically (i.e. based on the whole content). The remote endpoint then decodes the stream by concatenating the chunks and uncompressing the result. == Example == === Encoded data === The following example contains three chunks of size 4, 7, and 11 (hexadecimal "B") octets of data. 4␍␊Wiki␍␊7␍␊pedia i␍␊B␍␊n ␍␊chunks.␍␊0␍␊␍␊ Below is an annotated version of the encoded data. 4␍␊ (chunk size is four octets) Wiki (four octets of data) ␍␊ (end of chunk) 7␍␊ (chunk size is seven octets) pedia i (seven octets of data) ␍␊ (end of chunk) B␍␊ (chunk size is eleven octets) n ␍␊chunks. (eleven octets of data) ␍␊ (end of chunk) 0␍␊ (chunk size is zero octets, no more chunks) ␍␊ (end of final chunk with zero data octets) Note: Each chunk's size excludes the two ␍␊ bytes that terminate the data of each chunk. === Decoded data === Decoding the above example produces the following octets: Wikipedia in ␍␊chunks. The bytes above are typically displayed as Wikipedia in chunks.

    Read more →
  • Bitstrips

    Bitstrips

    Bitstrips, Inc. was a Canadian media and technology company based in Toronto, founded in 2007 by Jacob Blackstock, David Kennedy, Shahan Panth, Dorian Baldwin, and Jesse Brown. The company created and offered a web application, Bitstrips.com, which allowed users to create comic strips using personalized avatars, and preset templates and poses. Brown and Blackstock explained that the service was meant to enable self-expression without the need to have artistic skills. Bitstrips was first presented in 2008 at South by Southwest in Austin, Texas, and the service later piloted and launched a version designed for use as educational software. The service achieved increasing prominence following the launch of versions for Facebook and mobile platforms. In 2014, Bitstrips launched a spin-off app known as Bitmoji, which allows users to create personalized stickers for use in instant messaging. In July 2016, Snapchat Inc. announced that it had acquired the company; the Bitstrips comic service was shut down, but Bitmoji remains operational, and has subsequently been given greater prominence within Snapchat's overall platform. == History == Bitstrips was co-developed by Toronto-based comic artist Jacob Blackstock and his high school friend, journalist Jesse Brown. The service was originally envisioned as a means to allow anyone to create their own comic strip without needing artistic skills. Brown explained that "it's so difficult and time-consuming to tell a story in comic book form, drawing the same characters again and again in these tiny little panels, and just the amount of craftsmanship required. And even if you can do it well, which I never could, it takes years to make a story." Brown stated that the service would be "groundwork for a whole new way to communicate", and went as far as describing the service as being a "YouTube for comics". Blackstock explained that the concept of Bitstrips was influenced by his own use of comics as a form of socialization; a student, Blackstock and his friends drew comics featuring each other and shared them during classes. He felt that Bitstrips was a "medium for self-expression", stating that "It's not just about you making the comics, but since you and your friends star in these comics, it's like you're the medium. The visual nature of comics just speaks so much louder than text." The service was publicly unveiled at South by Southwest in 2008. In 2009, the service introduced a version oriented towards the educational market, Bitstrips for Schools, which was initially piloted at a number of schools in Ontario. The service was praised by educators for being engaging to students, especially within language classes. Brown noted that students were using the service to create comics outside of class as well, stating that it was "so gratifying and shocking what people do with your tool to make their own stories in ways that you never would have anticipated. Some of them are just brilliant." In December 2012, Bitstrips launched a version for Facebook; by July 2013, Bitstrips had 10 million unique users on Facebook, having created over 50 million comics. In October 2013, Bitstrips launched a mobile app; in two months, Bitstrips became a top-downloaded app in 40 countries, and over 30 million avatars had been created with it. In November 2013, Bitstrips secured a round of funding from Horizons Ventures and Li Ka-shing. In October 2014, Bitstrips launched Bitmoji, a spin-off app that allows users to create stickers featuring Bitstrips characters in various templates. In July 2016, following unconfirmed reports earlier in the year, Snapchat Inc. announced that it had acquired Bitstrips. The company's staff continue to operate out of Toronto, but the original Bitstrips comic service was shut down in favour of focusing exclusively on Bitmoji, leaving many Bitstrips users to call for a reboot of the comic service.

    Read more →
  • Omni-Path

    Omni-Path

    Omni-Path Architecture (OPA) is a high-performance communication architecture developed by Intel. It aims for low communication latency, low power consumption and a high throughput. It directly competes with InfiniBand. Intel planned to develop technology based on this architecture for exascale computing. The current owner of Omni-Path is Cornelis Networks. == History == Production of Omni-Path products started in 2015 and delivery of these products started in the first quarter of 2016. In November 2015, adapters based on the 2-port "Wolf River" ASIC were announced, using QSFP28 connectors with channel speeds up to 100 Gbit/s. Simultaneously, switches based on the 48-port "Prairie River" ASIC were announced. First models of that series were available starting in 2015. In April 2016, implementation of the InfiniBand "verbs" interface for the Omni-Path fabric was discussed. In October 2016, IBM, Hewlett Packard Enterprise, Dell, Lenovo, Samsung, Seagate Technology, Micron Technology, Western Digital and SK Hynix announced a joint consortium called Gen-Z to develop an open specification and architecture for non-volatile storage and memory products—including Intel's 3D Xpoint technology—which might in part compete against Omni-Path. Intel offered their Omni-Path products and components via other (hardware) vendors. For example, Dell EMC offered Intel Omni-Path as Dell Networking H-series, following the naming-standard of Dell Networking in 2017. In July 2019, Intel announced it would not continue development of Omni-Path networks and canceled OPA 200 series (200-Gbps variant of Omni-Path). In September 2020, Intel announced that the Omni-Path network products and technology would be spun out into a new venture with Cornelis Networks. Intel would continue to maintain support for legacy Omni-Path products, while Cornelis Networks continues the product line, leveraging existing Intel intellectual property related to Omni-Path architecture. In 2021, Cornelis announced Omni-Path Express, which replaces PSM2-based drivers and middleware, which trace back to PathScale's PSM created in 2003, for the existing Omni-Path hardware, with a native libfabric provider.

    Read more →
  • Reverse proxy

    Reverse proxy

    In computer networks, a reverse proxy or surrogate server is a proxy server that appears to any client to be an ordinary web server, but in reality merely acts as an intermediary that forwards the client's requests to one or more ordinary web servers. Reverse proxies help increase scalability, performance, resilience, and security, but they also carry a number of risks. Companies that run web servers often set up reverse proxies to facilitate the communication between an Internet user's browser and the web servers. An important advantage of doing so is that the web servers can be hidden behind a firewall on a company-internal network, and only the reverse proxy needs to be directly exposed to the Internet. Reverse proxy servers are implemented in popular open-source web servers. Dedicated reverse proxy servers are used by some of the biggest websites on the Internet. A reverse proxy is capable of tracking IP addresses of requests that are relayed through it as well as reading and/or modifying any non-encrypted traffic. However, this implies that anyone who has compromised the server could do so as well. Reverse proxies differ from forward proxies, which are used when the client is restricted to a private, internal network and asks a forward proxy to retrieve resources from the public Internet. == Uses == Large websites and content delivery networks use reverse proxies, together with other techniques, to balance the load between internal servers. Reverse proxies can keep a cache of static content, which further reduces the load on these internal servers and the internal network. It is also common for reverse proxies to add features such as compression or TLS encryption to the communication channel between the client and the reverse proxy. Reverse proxies can inspect HTTP headers, which, for example, allows them to present a single IP address to the Internet while relaying requests to different internal servers based on the URL of the HTTP request. Reverse proxies can hide the existence and characteristics of origin servers. This can make it more difficult to determine the actual location of the origin server / website and, for instance, more challenging to initiate legal action such as takedowns or block access to the website, as the IP address of the website may not be immediately apparent. Additionally, the reverse proxy may be located in a different jurisdiction with different legal requirements, further complicating the takedown process. Application firewall features can protect against common web-based attacks, like a denial-of-service attack (DoS) or distributed denial-of-service attacks (DDoS). Without a reverse proxy, removing malware or initiating takedowns (while simultaneously dealing with the attack) on one's own site, for example, can be difficult. In the case of secure websites, a web server may not perform TLS encryption itself, but instead offload the task to a reverse proxy that may be equipped with TLS acceleration hardware. (See TLS termination proxy.) A reverse proxy can distribute the load from incoming requests to several servers, with each server supporting its own application area. In the case of reverse proxying web servers, the reverse proxy may have to rewrite the URL in each incoming request in order to match the relevant internal location of the requested resource. A reverse proxy can reduce load on its origin servers by caching static content and dynamic content, known as web acceleration. Proxy caches of this sort can often satisfy a considerable number of website requests, greatly reducing the load on the origin server(s). A reverse proxy can optimize content by compressing it in order to speed up loading times. In a technique named "spoon-feeding", a dynamically generated page can be produced in its entirety and served to the reverse proxy, which can feed the page to the client as the connection allows. The program that generates the page need not remain open, thus releasing server resources during the possibly extended time the client requires to complete the transfer. Reverse proxies can operate wherever multiple web-servers must be accessible via a single public IP address. The web servers listen on different ports in the same machine, with the same local IP address or, possibly, on different machines with different local IP addresses. The reverse proxy analyzes each incoming request and delivers it to the right server within the local area network. Reverse proxies can perform A/B testing and multivariate testing without requiring application code to handle the logic of which version is served to a client. A reverse proxy can add access authentication to a web server that does not have any authentication. == Risks == When the transit traffic is encrypted and the reverse proxy needs to filter/cache/compress or otherwise modify or improve the traffic, the proxy first must decrypt and re-encrypt communications. This requires the proxy to possess the TLS certificate and its corresponding private key, extending the number of systems that can have access to non-encrypted data and making it a more valuable target for attackers. The vast majority of external data breaches happen either when hackers succeed in abusing an existing reverse proxy that was intentionally deployed by an organization, or when hackers succeed in converting an existing Internet-facing server into a reverse proxy server. Compromised or converted systems allow external attackers to specify where they want their attacks proxied to, enabling their access to internal networks and systems. Applications that were developed for the internal use of a company are not typically hardened to public standards and are not necessarily designed to withstand all hacking attempts. When an organization allows external access to such internal applications via a reverse proxy, they might unintentionally increase their own attack surface and invite hackers. If a reverse proxy is not configured to filter attacks or it does not receive daily updates to keep its attack signature database up to date, a zero-day vulnerability can pass through unfiltered, enabling attackers to gain control of the system(s) that are behind the reverse proxy server. Giving the reverse proxy of a third party access to private keys (for caching or optimizing content) places the entire triad of confidentiality, integrity and availability in the hands of the third party who operates the proxy. A reverse proxy is a single point of failure for the back-end services it fronts: an outage caused by misconfiguration, a denial-of-service attack, or a software fault can make every fronted service unreachable to outside clients, even when the back-end services themselves remain healthy. For example, a 2020 outage at Cloudflare briefly took down major sites and services that relied on its reverse-proxy edge, including Discord.

    Read more →