AI Chatbot Addiction Reddit

AI Chatbot Addiction Reddit — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Blobotics

    Blobotics

    Blobotics is a term describing research into chemical-based computer processors based on ions rather than electrons. Andrew Adamatzky, a computer scientist at the University of the West of England, Bristol used the term in an article in New Scientist March 28, 2005 [1]. The aim is to create 'liquid logic gates' which would be 'infinitely reconfigurable and self-healing'. The process relies on the Belousov–Zhabotinsky reaction, a repeating cycle of three separate sets of reactions. Such a processor could form the basis of a robot which, using artificial sensors, interact with its surroundings in a way which mimics living creatures. The coining of the term was featured by ABC radio in Australia [2].

    Read more →
  • Data governance

    Data governance

    Data governance is a term used on both a macro and a micro level. The former is a political concept and forms part of international relations and Internet governance; the latter is a data management concept and forms part of corporate/organizational data governance. Data governance involves delegating authority over data and exercising that authority through decision-making processes. It plays a role in enhancing the value of data assets. == Macro level == Data governance at the macro level involves regulating cross-border data flows among countries, which is more precisely termed international data governance. This field was first formed in the early 2000s, and consists of "norms, principles and rules governing various types of data." There have been several international groups established by research organizations that aim to grant access to their data. These groups that enable an exchange of data are, as a result, exposed to domestic and international legal interpretations that ultimately decide how data is used. However, as of 2023, there are no international laws or agreements specifically focused on data protection. == Data governance (Data Management) == Data governance is the set of principles, policies, and processes that guide the effective and responsible use of data within an organization. It creates a framework for decision making, accountability, and oversight across the data lifecycle, from creation and storage to sharing and disposal. Data governance is closely linked with data management, which provides the practical methods to carry out governance objectives. These methods include data quality assurance, metadata management, master data management, security controls, and compliance monitoring. Together, governance and management aim to maximize the value of data as a strategic asset, reduce risks from misuse or inaccuracy, and ensure compliance with regulatory, ethical, and business requirements. The importance of this discipline has grown with the rise of big data, cloud computing, and artificial intelligence, where consistent standards and stewardship are essential for privacy protection, interoperability, and informed decision making. == Data governance drivers == While data governance initiatives can be driven by a desire to improve data quality, they are often driven by C-level leaders responding to external regulations. In a recent report conducted by the CIO WaterCooler community, 54% stated the key driver was efficiencies in processes; 39% - regulatory requirements; and only 7% customer service. Examples of these regulations include Sarbanes–Oxley Act, Basel I, Basel II, HIPAA, GDPR, cGMP, and a number of data privacy regulations. To achieve compliance with these regulations, business processes and controls require formal management processes to govern the data subject to these regulations. Successful programs identify drivers that are meaningful to both supervisory and executive leadership. Common themes among the external regulations center on the need to manage risk. The risks can be financial misstatement, inadvertent release of sensitive data, or poor data quality for key decisions. Methods to manage these risks vary from industry to industry. Examples of commonly referenced best practices and guidelines include COBIT, ISO/IEC 38500, and others. The proliferation of regulations and standards creates challenges for data governance professionals, particularly when multiple regulations overlap the data being managed. Organizations often launch data governance initiatives to address these challenges. == Data governance initiatives (Dimensions) == Data governance initiatives improve the quality of data by assigning a team responsible for data's accuracy, completeness, consistency, timeliness, validity, and uniqueness. This team usually consists of executive leadership, project management, line-of-business managers, and data stewards. The team usually employs a methodology for tracking and improving enterprise data, such as Six Sigma, and tools for data mapping, profiling, cleansing, and monitoring data. Data governance initiatives may be aimed at achieving a number of objectives including offering better visibility to internal and external customers (such as supply chain management), compliance with regulatory law, improving operations after rapid company growth or corporate mergers, or to aid the efficiency of enterprise knowledge workers by reducing confusion and error and increasing their scope of knowledge. Many data governance initiatives are also inspired by past attempts to fix information quality at the departmental level, which can lead to incongruent and redundant data quality processes. Most large companies have many applications and databases that can not easily share information. Therefore, knowledge workers within large organizations may not have access to the data they need to best do their jobs. When they do have access to the data, the data quality may be poor. By setting up a data governance practice or corporate data authority (individual or area responsible for determining how to proceed, in the best interest of the business, when a data issue arises), these problems can be mitigated. == Implementation == Implementation of a data governance initiative may vary in scope as well as origin. Sometimes, an executive mandate will arise to initiate an enterprise-wide effort. Sometimes the mandate will be to create a pilot project or projects, limited in scope and objectives, aimed at either resolving existing issues or demonstrating value. Sometimes, an initiative originates from lower down in the organization's hierarchy and will be deployed in a limited scope to demonstrate value to potential sponsors higher up in the organization. The initial scope of an implementation can vary greatly as well, from review of a one-off IT system to a cross-organization initiative. == Data governance tools == Leaders of successful data governance programs declared at the Data Governance Conference in Orlando, FL, in December 2006, that data governance is about 80 to 95 percent communication. That stated, it is a given that many of the objectives of a data governance program must be accomplished with appropriate tools. Many vendors are now positioning their products as data governance tools. Due to the different focus areas of various data governance initiatives, a given tool may or may not be appropriate. Additionally, many tools that are not marketed as governance tools address governance needs and demands.

    Read more →
  • Kerckhoffs's principle

    Kerckhoffs's principle

    Kerckhoffs's principle (also called Kerckhoffs's desideratum, assumption, axiom, doctrine or law) of cryptography was stated by the Dutch cryptographer Auguste Kerckhoffs in the 19th century. The principle holds that a cryptosystem should be secure, even if everything about the system, except the key, is public knowledge. This concept is widely embraced by cryptographers, in contrast to security through obscurity, which is not. Kerckhoffs's principle was phrased by the American mathematician Claude Shannon as "the enemy knows the system", i.e., "one ought to design systems under the assumption that the enemy will immediately gain full familiarity with them". In that form, it is called Shannon's maxim. Another formulation by American researcher and professor Steven M. Bellovin is: In other words—design your system assuming that your opponents know it in detail. (A former official at NSA's National Computer Security Center told me that the standard assumption there was that serial number 1 of any new device was delivered to the Kremlin.) == Origins == The invention of telegraphy radically changed military communications and increased the number of messages that needed to be protected from the enemy dramatically, leading to the development of field ciphers which had to be easy to use without large confidential codebooks prone to capture on the battlefield. It was this environment which led to the development of Kerckhoffs's requirements. Auguste Kerckhoffs was a professor of German language at Ecole des Hautes Etudes Commerciales (HEC) in Paris. In early 1883, Kerckhoffs's article, La Cryptographie Militaire, was published in two parts in the Journal of Military Science, in which he stated six design rules for military ciphers. Translated from French, they are: The system must be practically, if not mathematically, indecipherable; It should not require secrecy, and it should not be a problem if it falls into enemy hands; It must be possible to communicate and remember the key without using written notes, and correspondents must be able to change or modify it at will; It must be applicable to telegraph communications; It must be portable, and should not require several persons to handle or operate; Lastly, given the circumstances in which it is to be used, the system must be easy to use and should not be stressful to use or require its users to know and comply with a long list of rules. Some are no longer relevant given the ability of computers to perform complex encryption. The second rule, now known as Kerckhoffs's principle, is still critically important. == Explanation of the principle == Kerckhoffs viewed cryptography as a rival to, and a better alternative than, steganographic encoding, which was common in the nineteenth century for hiding the meaning of military messages. One problem with encoding schemes is that they rely on humanly-held secrets such as "dictionaries" which disclose for example, the secret meaning of words. Steganographic-like dictionaries, once revealed, permanently compromise a corresponding encoding system. Another problem is that the risk of exposure increases as the number of users holding the secrets increases. Nineteenth century cryptography, in contrast, used simple tables which provided for the transposition of alphanumeric characters, generally given row-column intersections which could be modified by keys which were generally short, numeric, and could be committed to human memory. The system was considered "indecipherable" because tables and keys do not convey meaning by themselves. Secret messages can be compromised only if a matching set of table, key, and message falls into enemy hands in a relevant time frame. Kerckhoffs viewed tactical messages as only having a few hours of relevance. Systems are not necessarily compromised, because their components (i.e. alphanumeric character tables and keys) can be easily changed. === Advantage of secret keys === Using secure cryptography is supposed to replace the difficult problem of keeping messages secure with a much more manageable one, keeping relatively small keys secure. A system that requires long-term secrecy for something as large and complex as the whole design of a cryptographic system obviously cannot achieve that goal. It only replaces one hard problem with another. However, if a system is secure even when the enemy knows everything except the key, then all that is needed is to manage keeping the keys secret. There are a large number of ways the internal details of a widely used system could be discovered. The most obvious is that someone could bribe, blackmail, or otherwise threaten staff or customers into explaining the system. In war, for example, one side will probably capture some equipment and people from the other side. Each side will also use spies to gather information. If a method involves software, someone could do memory dumps or run the software under the control of a debugger in order to understand the method. If hardware is being used, someone could buy or steal some of the hardware and build whatever programs or gadgets needed to test it. Hardware can also be dismantled so that the chip details can be examined under the microscope. === Maintaining security === A generalization some make from Kerckhoffs's principle is: "The fewer and simpler the secrets that one must keep to ensure system security, the easier it is to maintain system security." Bruce Schneier ties it in with a belief that all security systems must be designed to fail as gracefully as possible: Kerckhoffs's principle applies beyond codes and ciphers to security systems in general: every secret creates a potential failure point. Secrecy, in other words, is a prime cause of brittleness—and therefore something likely to make a system prone to catastrophic collapse. Conversely, openness provides ductility. Any security system depends crucially on keeping some things secret. However, Kerckhoffs's principle points out that the things kept secret ought to be those least costly to change if inadvertently disclosed. For example, a cryptographic algorithm may be implemented by hardware and software that is widely distributed among users. If security depends on keeping that secret, then disclosure leads to major logistic difficulties in developing, testing, and distributing implementations of a new algorithm – it is "brittle". On the other hand, if keeping the algorithm secret is not important, but only the keys used with the algorithm must be secret, then disclosure of the keys simply requires the simpler, less costly process of generating and distributing new keys. == Applications == In accordance with Kerckhoffs's principle, the majority of civilian cryptography makes use of publicly known algorithms. By contrast, ciphers used to protect classified government or military information are often kept secret (see Type 1 encryption). However, it should not be assumed that government/military ciphers must be kept secret to maintain security. It is possible that they are intended to be as cryptographically sound as public algorithms, and the decision to keep them secret is in keeping with a layered security posture. == Security through obscurity == It is moderately common for companies to keep the inner workings of a system secret. Some argue this "security by obscurity" makes the product safer and less vulnerable to attack. A counter-argument is that keeping the innards secret may improve security in the short term, but in the long run, only systems that have been published and analyzed should be trusted. Steven Bellovin and Randy Bush commented: Security Through Obscurity Considered Dangerous Hiding security vulnerabilities in algorithms, software, and/or hardware decreases the likelihood they will be repaired and increases the likelihood that they can and will be exploited. Discouraging or outlawing discussion of weaknesses and vulnerabilities is extremely dangerous and deleterious to the security of computer systems, the network, and its citizens. Open Discussion Encourages Better Security The long history of cryptography and cryptoanalysis has shown time and time again that open discussion and analysis of algorithms exposes weaknesses not thought of by the original authors, and thereby leads to better and more secure algorithms. As Kerckhoffs noted about cipher systems in 1883 [Kerc83], "Il faut qu'il n'exige pas le secret, et qu'il puisse sans inconvénient tomber entre les mains de l'ennemi." (Roughly, "the system must not require secrecy and must be able to be stolen by the enemy without causing trouble.")

    Read more →
  • Key (cryptography)

    Key (cryptography)

    A key in cryptography is a piece of information, usually a string of numbers or letters that are stored in a file, which, when processed through a cryptographic algorithm, can encode or decode cryptographic data. Based on the used method, the key can be different sizes and varieties, but in all cases, the strength of the encryption relies on the security of the key being maintained. A key's security strength is dependent on its algorithm, the size of the key, the generation of the key, and the process of key exchange. == Scope == The key is what is used to encrypt data from plaintext to ciphertext. There are different methods for utilizing keys and encryption. === Symmetric cryptography === Symmetric cryptography refers to the practice of the same key being used for both encryption and decryption. === Asymmetric cryptography === Asymmetric cryptography has separate keys for encrypting and decrypting. These keys are known as the public and private keys, respectively. == Purpose == Since the key protects the confidentiality and integrity of the system, it is important to be kept secret from unauthorized parties. With public key cryptography, only the private key must be kept secret, but with symmetric cryptography, it is important to maintain the confidentiality of the key. Kerckhoff's principle states that the entire security of the cryptographic system relies on the secrecy of the key. == Key sizes == Key size is the number of bits in the key defined by the algorithm. This size defines the upper bound of the cryptographic algorithm's security. The larger the key size, the longer it will take before the key is compromised by a brute force attack. Since perfect secrecy is not feasible for key algorithms, researches are now more focused on computational security. In the past, keys were required to be a minimum of 40 bits in length, however, as technology advanced, these keys were being broken quicker and quicker. As a response, restrictions on symmetric keys were enhanced to be greater in size. Currently, 2048 bit RSA is commonly used, which is sufficient for current systems. However, current RSA key sizes would all be cracked quickly with a powerful quantum computer. "The keys used in public key cryptography have some mathematical structure. For example, public keys used in the RSA system are the product of two prime numbers. Thus public key systems require longer key lengths than symmetric systems for an equivalent level of security. 3072 bits is the suggested key length for systems based on factoring and integer discrete logarithms which aim to have security equivalent to a 128 bit symmetric cipher." == Key generation == To prevent a key from being guessed, keys need to be generated randomly and contain sufficient entropy. The problem of how to safely generate random keys is difficult and has been addressed in many ways by various cryptographic systems. A key can directly be generated by using the output of a Random Bit Generator (RBG), a system that generates a sequence of unpredictable and unbiased bits. A RBG can be used to directly produce either a symmetric key or the random output for an asymmetric key pair generation. Alternatively, a key can also be indirectly created during a key-agreement transaction, from another key or from a password. Some operating systems include tools for "collecting" entropy from the timing of unpredictable operations such as disk drive head movements. For the production of small amounts of keying material, ordinary dice provide a good source of high-quality randomness. == Establishment scheme == The security of a key is dependent on how a key is exchanged between parties. Establishing a secured communication channel is necessary so that outsiders cannot obtain the key. A key establishment scheme (or key exchange) is used to transfer an encryption key among entities. Key agreement and key transport are the two types of a key exchange scheme that are used to be remotely exchanged between entities . In a key agreement scheme, a secret key, which is used between the sender and the receiver to encrypt and decrypt information, is set up to be sent indirectly. All parties exchange information (the shared secret) that permits each party to derive the secret key material. In a key transport scheme, encrypted keying material that is chosen by the sender is transported to the receiver. Either symmetric key or asymmetric key techniques can be used in both schemes. The Diffie–Hellman key exchange and Rivest-Shamir-Adleman (RSA) are the most two widely used key exchange algorithms. In 1976, Whitfield Diffie and Martin Hellman constructed the Diffie–Hellman algorithm, which was the first public key algorithm. The Diffie–Hellman key exchange protocol allows key exchange over an insecure channel by electronically generating a shared key between two parties. On the other hand, RSA is a form of the asymmetric key system which consists of three steps: key generation, encryption, and decryption. Key confirmation delivers an assurance between the key confirmation recipient and provider that the shared keying materials are correct and established. The National Institute of Standards and Technology recommends key confirmation to be integrated into a key establishment scheme to validate its implementations. == Management == Key management concerns the generation, establishment, storage, usage and replacement of cryptographic keys. A key management system (KMS) typically includes three steps of establishing, storing and using keys. The base of security for the generation, storage, distribution, use and destruction of keys depends on successful key management protocols. == Key vs password == A password is a memorized series of characters including letters, digits, and other special symbols that are used to verify identity. It is often produced by a human user or a password management software to protect personal and sensitive information or generate cryptographic keys. Passwords are often created to be memorized by users and may contain non-random information such as dictionary words. On the other hand, a key can help strengthen password protection by implementing a cryptographic algorithm which is difficult to guess or replace the password altogether. A key is generated based on random or pseudo-random data and can often be unreadable to humans. A password is less safe than a cryptographic key due to its low entropy, randomness, and human-readable properties. However, the password may be the only secret data that is accessible to the cryptographic algorithm for information security in some applications such as securing information in storage devices. Thus, a deterministic algorithm called a key derivation function (KDF) uses a password to generate the secure cryptographic keying material to compensate for the password's weakness. Various methods such as adding a salt or key stretching may be used in the generation.

    Read more →
  • History of machine translation

    History of machine translation

    Machine translation is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one natural language to another. In the 1950s, machine translation became a reality in research, although references to the subject can be found as early as the 17th century. The Georgetown experiment, which involved successful fully automatic translation of more than sixty Russian sentences into English in 1954, was one of the earliest recorded projects. Researchers of the Georgetown experiment asserted their belief that machine translation would be a solved problem within a few years. In the Soviet Union, similar experiments were performed shortly after. Consequently, the success of the experiment ushered in an era of significant funding for machine translation research in the United States. The achieved progress was much slower than expected; in 1966, the ALPAC report found that ten years of research had not fulfilled the expectations of the Georgetown experiment and resulted in dramatically reduced funding. Interest grew in statistical models for machine translation, which became more common and also less expensive in the 1980s as available computational power increased. Although there exists no autonomous system of "fully automatic high quality translation of unrestricted text," there are many programs now available that are capable of providing useful output within strict constraints. Several of these programs are available online, such as Google Translate and the SYSTRAN system that powers AltaVista's BabelFish (which was replaced by Microsoft Bing translator in May 2012). == The beginning == The origins of machine translation can be traced back to the work of Al-Kindi, a 9th-century Arabic cryptographer who developed techniques for systemic language translation, including cryptanalysis, frequency analysis, and probability and statistics, which are used in modern machine translation. The idea of machine translation later appeared in the 17th century. In 1629, René Descartes proposed a universal language, with equivalent ideas in different tongues sharing one symbol. In the mid-1930s the first patents for "translating machines" were applied for by Georges Artsrouni, for an automatic bilingual dictionary using punched tape. Russian Peter Troyanskii submitted a more detailed proposal that included both the bilingual dictionary and a method for dealing with grammatical roles between languages, based on the grammatical system of Esperanto. This system was separated into three stages: stage one consisted of a native-speaking editor in the source language to organize the words into their logical forms and to exercise the syntactic functions; stage two required the machine to "translate" these forms into the target language; and stage three required a native-speaking editor in the target language to normalize this output. Troyanskii's proposal remained unknown until the late 1950s, by which time computers were well-known and utilized. == The early years == The first set of proposals for computer based machine translation was presented in 1949 by Warren Weaver, a researcher at the Rockefeller Foundation, "Translation memorandum". These proposals were based on information theory, successes in code breaking during the Second World War, and theories about the universal principles underlying natural language. A few years after Weaver submitted his proposals, research began in earnest at many universities in the United States. On 7 January 1954 the Georgetown–IBM experiment was held in New York at the head office of IBM. This was the first public demonstration of a machine translation system. The demonstration was widely reported in the newspapers and garnered public interest. The system itself, however, was no more than a "toy" system. It had only 250 words and translated 49 carefully selected Russian sentences into English – mainly in the field of chemistry. Nevertheless, it encouraged the idea that machine translation was imminent and stimulated the financing of the research, not only in the US but worldwide. Early systems used large bilingual dictionaries and hand-coded rules for fixing the word order in the final output which was eventually considered too restrictive in linguistic developments at the time. For example, generative linguistics and transformational grammar were exploited to improve the quality of translations. During this period operational systems were installed. The United States Air Force used a system produced by IBM and Washington University in St. Louis, while the Atomic Energy Commission and Euratom, in Italy, used a system developed at Georgetown University. While the quality of the output was poor it met many of the customers' needs, particularly in terms of speed. At the end of the 1950s, Yehoshua Bar-Hillel was asked by the US government to look into machine translation, to assess the possibility of fully automatic high-quality translation by machines. Bar-Hillel described the problem of semantic ambiguity or double-meaning, as illustrated in the following sentence: Little John was looking for his toy box. Finally he found it. The box was in the pen. The word pen may have two meanings: the first meaning, something used to write in ink with; the second meaning, a container of some kind. To a human, the meaning is obvious, but Bar-Hillel claimed that without a "universal encyclopedia" a machine would never be able to deal with this problem. At the time, this type of semantic ambiguity could only be solved by writing source texts for machine translation in a controlled language that uses a vocabulary in which each word has exactly one meaning. == The 1960s, the ALPAC report and the seventies == Research in the 1960s in both the Soviet Union and the United States concentrated mainly on the Russian–English language pair. The objects of translation were chiefly scientific and technical documents, such as articles from scientific journals. The rough translations produced were sufficient to get a basic understanding of the articles. If an article discussed a subject deemed to be confidential, it was sent to a human translator for a complete translation; if not, it was discarded. A great blow came to machine-translation research in 1966 with the publication of the ALPAC report. The report was commissioned by the US government and delivered by ALPAC, the Automatic Language Processing Advisory Committee, a group of seven scientists convened by the US government in 1964. The US government was concerned that there was a lack of progress being made despite significant expenditure. The report concluded that machine translation was more expensive, less accurate and slower than human translation, and that despite the expenditures, machine translation was not likely to reach the quality of a human translator in the near future. The report recommended, however, that tools be developed to aid translators – automatic dictionaries, for example – and that some research in computational linguistics should continue to be supported. The publication of the report had a profound impact on research into machine translation in the United States, and to a lesser extent the Soviet Union and United Kingdom. Research, at least in the US, was almost completely abandoned for over a decade. In Canada, France and Germany, however, research continued. In the US the main exceptions were the founders of SYSTRAN (Peter Toma) and Logos (Bernard Scott), who established their companies in 1968 and 1970 respectively and served the US Department of Defense. In 1970, the SYSTRAN system was installed for the United States Air Force, and subsequently by the Commission of the European Communities in 1976. The METEO System, developed at the Université de Montréal, was installed in Canada in 1977 to translate weather forecasts from English to French, and was translating close to 80,000 words per day or 30 million words per year until it was replaced by a competitor's system on 30 September 2001. While research in the 1960s concentrated on limited language pairs and input, demand in the 1970s was for low-cost systems that could translate a range of technical and commercial documents. This demand was spurred by the increase of globalisation and the demand for translation in Canada, Europe, and Japan. == The 1980s and early 1990s == By the 1980s, both the diversity and the number of installed systems for machine translation had increased. A number of systems relying on mainframe technology were in use, such as SYSTRAN, Logos, Ariane-G5, and Metal. As a result of the improved availability of microcomputers, there was a market for lower-end machine translation systems. Many companies took advantage of this in Europe, Japan, and the USA. Systems were also brought onto the market in China, Eastern Europe, Korea, and the Soviet Union. During the 1980s there was a lot of activity in MT in Japan especially. With the fifth-generation co

    Read more →
  • Social media use in the financial services sector

    Social media use in the financial services sector

    Social media in the financial services sector refers to the use of social media by the financial services sector to promote and distribute financial services. Social media is used in various aspects of the financial industry including customer service, marketing, and product development. It has enabled financial institutions to extend their reach through direct and real-time communication with customers, fostering more personal connections. It also allows individuals to talk to other individuals creating lending and trading via social groups as well as developing new financial services by fintech startup companies. In terms of marketing, social media is utilized by both traditional financial companies as well as disruptive fintech companies such as peer-to-peer lending (P2P) companies. The financial industry has used information technology since its inception in the 1960s and social media fits in with this ongoing development. Larger, traditional financial firms have integrated social media into their marketing strategies. Companies in the financial sector are subject to strict regulations that include how they use social media. In the United States, the Financial Industry Regulatory Authority (FINRA) is a key regulator that sets rules how financial firms can interact with consumers. This includes ensuring that social media posts follow financial advertising rules, such as being fair and balanced and not providing misleading information, and that financial advice is not provided by unqualified personnel, such as influencers. == History == In 2003, at the beginning of social media development, MySpace was founded as a "social networking service." It allowed people to create a profile, connect with other people, and post videos, pictures, and songs. As MySpace grew in popularity, it attracted interest from companies wishing to promote their brands on the social platform. They were joined by Facebook and in 2010 by Instagram. Financial service firms were initially slow to adapt to promotion via social media but soon joined other big firms after they saw the success other industries had in engaging with younger people. == Uses == === Branding === While companies are able to connect with more people remotely through providing online financial services, their branding strategy has shifted from customized to standardized. Prior to the outbreak of technology, most banks used customized branding where they targeted only customers in their regions. Businesses can now use technology to operate beyond their geographic location and maintain a consistent image across multiple countries with standardized branding. By being able to extend a consistent brand reputation across a wider geographic location, financial services companies can take advantage of economies of scale in advertising cost, lower administrative complexity, lower entry into new markets, and improved cross-border learning within the company. === Customer engagement === Online banking reduced face-to-face interaction between customers and their banks. Most banking transactions can now be conducted online or through mobile devices, rather than at a local branch with a teller. Social media provides a channel for firms to maintain personal contact with customers, replicating some of the interaction that was previously available at local branches. For example, a bank's Facebook page may feature an employee profile describing their job duties, which serves to present a more human face for larger institutions. === Lending === Social media is a core marketing channel for online peer-to-peer lending as well as small business lenders. Since these companies operate exclusively online, it makes sense for them to market online through social media channels. They are able to grow and find new lenders and buyers by utilizing social networks. === Trading === Social trading is an alternative way of analyzing financial data by looking at what other traders are doing and comparing, copying and discussing their techniques and strategies. Prior to the advent of social trading, investors and traders were relying on fundamental or technical analysis to form their investment decisions. Using social trading investors and traders could integrate into their investment decision-process social indicators from trading data-feeds of other traders. Investors also use platform like Reddit, Signal messaging or WeChat to create social communities to discuss investments and finance. In some cases they use this to join together using meme stocks to move financial markets, such as the 2021 GameStop short squeeze incident. They can also use social groups to launch and promote new products such as cryptocurrencies. Investing application like WeBull incorporate a forum style messaging system on each stock that is available for trading. Financial brokers such as Fidelity Investments, Interactive Brokers, and E-Trade have moved to incorporate community features in their investment apps. == Regulations == The use of social media by investors and financial services professionals for business purposes is subject to regulatory oversight, in the United States this is done primarily by the Financial Industry Regulatory Authority (FINRA). FINRA's rules, designed to protect investors from misleading information in all communications and this also applies to social media communications. This includes ensuring that social media posts follow financial advertising rules, such as being fair and balanced and not providing misleading information, and that advice is not provided by unqualified personnel, such as influencers and bank staff acting in a personal capacity. Financial firms have to maintain books and records of all interaction with customers and this includes social media. == New products and services == Social media has created entirely new products for the financial services sector, revolutionizing products and developing new industries through the merging of social technology and financial services. Fintech startups use social media to promote products to get them established. Several developing nations have used social media to leapfrog traditional financial technology; for example, WeChat Pay, which developed from the Chinese WeChat social media platform, became a major payment system in China within a few years. In 2015, according to consulting firm Accenture, 390 million people in China had registered to use mobile banking. This figure is more than the population of the United States. In the United States, the fintech company Venmo combines technology and financial services on a social platform. Other financial technology companies that have used social media to develop or promote financial products include: Lending Club – One of the first peer-to-peer lending businesses OnDeck Capital – A US online-only lending business Funding Circle – A UK-based online lending company Wise – A global online money transfers company Kabbage – A US online unsecured loan company later acquired by American Express Avant – A US online unsecured loan company Zopa – A UK online neobank providing peer-to-peer lending == Risks == === Reputational damage === Due to the real-time nature of social media, financial services companies can be impacted by potential reputational issues. Any negative experience by customers can easily be shared online and could become a viral phenomenon, those comments could likely have a detrimental effect on the company’s stock price and reputation. On the other hand, any positive experience a customer has can also be shared online. However, positive experiences are much less likely to become viral. === Scams === The nature of social media makes it easy to target individuals without being seen by the wider community, this allows scammers to target individuals. Example include romance scams such as the pig butchering scam where an individual is tricked to transfer funds or assets to the scammer over social media making it hard for law enforcement to track them or recover funds. === Customer privacy === Customer privacy is important for the financial services industry. It is critical that customer information such as a bank account numbers and other personal information is kept private. However, this information can be leaked if for example, a customer is unhappy with a bank’s service, they may tweet at the bank expressing their frustrations and include their name and account number.

    Read more →
  • Multiple encryption

    Multiple encryption

    Multiple encryption is the process of encrypting an already encrypted message one or more times, either using the same or a different algorithm. It is also known as cascade encryption, cascade ciphering, cipher stacking, multiple encryption, and superencipherment. Superencryption refers to the outer-level encryption of a multiple encryption. Some cryptographers, like Matthew Green of Johns Hopkins University, say multiple encryption addresses a problem that mostly doesn't exist: Modern ciphers rarely get broken... You’re far more likely to get hit by malware or an implementation bug than you are to suffer a catastrophic attack on Advanced Encryption Standard (AES). However, from the previous quote an argument for multiple encryption can be made, namely poor implementation. Using two different cryptomodules and keying processes from two different vendors requires both vendors' wares to be compromised for security to fail completely. == Independent keys == Picking any two ciphers, if the key used is the same for both, the second cipher could possibly undo the first cipher, partly or entirely. This is true of ciphers where the decryption process is exactly the same as the encryption process (a reciprocal cipher) – the second cipher would completely undo the first. If an attacker were to recover the key through cryptanalysis of the first encryption layer, the attacker could possibly decrypt all the remaining layers, assuming the same key is used for all layers. To prevent that risk, one can use keys that are statistically independent for each layer (e.g. independent RNGs). Ideally each key should have separate and different generation, sharing, and management processes. == Independent Initialization Vectors == For en/decryption processes that require sharing an Initialization Vector (IV) / nonce these are typically, openly shared or made known to the recipient (and everyone else). Its good security policy never to provide the same data in both plaintext and ciphertext when using the same key and IV. Therefore, its recommended (although at this moment without specific evidence) to use separate IVs for each layer of encryption. == Importance of the first layer == With the exception of the one-time pad, no cipher has been theoretically proven to be unbreakable. Furthermore, some recurring properties may be found in the ciphertexts generated by the first cipher. Since those ciphertexts are the plaintexts used by the second cipher, the second cipher may be rendered vulnerable to attacks based on known plaintext properties (see references below). This is the case when the first layer is a program P that always adds the same string S of characters at the beginning (or end) of all ciphertexts (commonly known as a magic number). When found in a file, the string S allows an operating system to know that the program P has to be launched in order to decrypt the file. This string should be removed before adding a second layer. To prevent this kind of attack, one can use the method provided by Bruce Schneier: Generate a random pad R of the same size as the plaintext. Encrypt R using the first cipher and key. XOR the plaintext with the pad, then encrypt the result using the second cipher and a different (!) key. Concatenate both ciphertexts in order to build the final ciphertext. A cryptanalyst must break both ciphers to get any information. This will, however, have the drawback of making the ciphertext twice as long as the original plaintext. Note, however, that a weak first cipher may merely make a second cipher that is vulnerable to a chosen plaintext attack also vulnerable to a known plaintext attack. However, a block cipher must not be vulnerable to a chosen plaintext attack to be considered secure. Therefore, the second cipher described above is not secure under that definition, either. Consequently, both ciphers still need to be broken. The attack illustrates why strong assumptions are made about secure block ciphers and ciphers that are even partially broken should never be used. == The Rule of Two == The Rule of Two is a data security principle from the NSA's Commercial Solutions for Classified Program (CSfC). It specifies two completely independent layers of cryptography to protect data. For example, data could be protected by both hardware encryption at its lowest level and software encryption at the application layer. It could mean using two FIPS-validated software cryptomodules from different vendors to en/decrypt data. The importance of vendor and/or model diversity between the layers of components centers around removing the possibility that the manufacturers or models will share a vulnerability. This way if one components is compromised there is still an entire layer of encryption protecting the information at rest or in transit. The CSfC Program offers solutions to achieve diversity in two ways. "The first is to implement each layer using components produced by different manufacturers. The second is to use components from the same manufacturer, where that manufacturer has provided NSA with sufficient evidence that the implementations of the two components are independent of one another." The principle is practiced in the NSA's secure mobile phone called Fishbowl. The phones use two layers of encryption protocols, IPsec and Secure Real-time Transport Protocol (SRTP), to protect voice communications. The Samsung Galaxy S9 Tactical Edition is also an approved CSfC Component.

    Read more →
  • Outfit of the day

    Outfit of the day

    Outfit of the day (commonly abbreviated OOTD) is a phrase used online by users sharing what outfits (or "fits") they wear on a particular day or occasion. The video or post often mentions where each article of clothing, shoes, jewelry, and other accessories is from. OOTD posts are typically found on social media websites, such as Tumblr, Instagram, and Pinterest, and OOTD videos on YouTube and TikTok. Motives for sharing OOTD content vary, from encouraging viewers to buy a certain product, showing off personal style, or giving outfit inspiration. == History == "Outfit of the Day" videos started as early as 2010 but gained popularity in 2019. By 2016, the hashtag "OOTD" on Instagram had over 80 million posts. OOTD videos have become popular with the average internet user, as they express one's fashion sense and style to their followers. == Use in marketing == Brands will use famous influencers to promote their products using the "outfit of the day" tactic in hopes that users will buy the product to emulate the influencer. This tactic has increased sales for many brands. Creators of OOTD content can also profit, often through brand deals or affiliate links. Vogue has a recurring segment on YouTube that shows "Every outfit (fill in celebrity name here) wears in a week." == Variants == A variant is "outfit(s) of the week" (OOTW), where a user will share multiple outfits to be worn over the course of several days or a week. OOTDs are often seen in "Get ready with me" (GRWM) videos, where a user films their morning routine. In these videos, the filmers talk about their plans for the day, what makeup products they are using to get ready, and the "Outfit of the day" they are wearing. == Criticism == Some fashion writers have suggested that the proliferation of OOTD content encourages people to buy new clothing rather than to wear already owned items. Some stylists have also proposed that OOTD content encourages users to follow trends rather than explore and find their own style.

    Read more →
  • Google Tasks

    Google Tasks

    Google Tasks is a task management application developed by Google and included with Google Workspace. Included initially as a feature in Gmail and Google Calendar, Google Tasks launched as a core product with a standalone app in 2018. It is available for Android and iOS, as well as in the right-hand side panel on Google Workspace apps on the web and in Google Calendar. == History and development == Google Tasks began as an integration within other apps in G Suite (now Google Workspace), allowing to-do items to be created in Calendar and Gmail. Upon graduating to a core service on June 28, 2018, Google Tasks launched as a dedicated mobile app in which tasks can be sorted into lists, managed, and completed. Google Tasks launched the ability to create tasks from Google Chat messages in 2022.

    Read more →
  • Social media measurement

    Social media measurement

    Social media measurement, also called social media controlling, is the management practice of evaluating successful social media communications of brands, companies, or other organizations. Key performance indicators may be measured by extracting information from social media channels, such as blogs, wikis, micro-blogs such as Twitter, social networking sites, or video/photo sharing websites, forums from time to time. It is also used by companies to gauge current trends in the industry. The process first gathers data from different websites and then performs analysis based on different metrics like time spent on the page, click through rate, content share, comments, text analytics to identify positive or negative emotions about the brand. Some other social media metrics include share of voice, owned mentions, and earned mentions. The social media measurement process starts with defining a goal that needs to be achieved and defining the expected outcome of the process. The expected outcome varies per the goal and is usually measured by a variety of metrics. This is followed by defining possible social strategies to be used to achieve the goal. Then the next step is designing strategies to be used and setting up configuration tools that ease the process of collecting the data. In the next step, strategies and tools are deployed in real-time. This step involves conducting Quality Assurance tests of the methods deployed to collect the data. And in the final step, data collected from the system is analyzed and if the need arises, it is refined on the run time to enhance the methodologies used. The last step ensures that the result obtained is more aligned with the goal defined in the first step. == Data Acquisition == Acquiring data from social media is in demand of an exploring the user participation and population with the purpose of retrieving and collecting so many kinds of data(ex: comments, downloads etc.). There are several prevalent techniques to acquire data such as Network traffic analysis, Ad-hoc application and Crawling Network Traffic Analysis - Network traffic analysis is the process of capturing network traffic and observing it closely to determine what is happening in the network. It is primarily done to improve the performance, security and other general management of the network. However concerned about the potential tort of privacy on the Internet, network traffic analysis is always restricted by the government. Furthermore, high-speed links are not adaptable to traffic analysis because of the possible overload problem according to the packet sniffing mechanism Ad-hoc Application - Ad-hoc application is a kind of application that provides services and games to social network users by developing the APIs offered by social network companies (Facebook Developer Platform). The infrastructure of Ad-hoc application allows the user to interact with the interface layer instead of the application servers. The API provides a path for application to access information after the user login. Moreover, the size of the data set collected vary with the popularity of the social media platform i.e. social media platforms having high number of users will have more data than platforms having less user base. Scraping is a process in which the APIs collect online data from social media. The data collected from Scraping is in raw format. However, having access to these types of data is a bit difficult because of its commercial value. Crawling - Crawling is a process in which a web crawler creates indexes of all the words in a web-page, stores them, then follows all the hyperlinks and indexes on that page and again stores them. It is the most popular technique for data acquisition and is also well known for its easy operation based on prevalent Object-Orientated Programming Language (Java or Python etc.). And most important, social network companies (YouTube, Flicker, Facebook, Instagram, etc.) are friendly to crawling techniques by providing public APIs == Applications == === For branding === Monitoring social media allows researchers to find insights into a brand's overall visibility on social media, to measure the impact of campaigns, to identify opportunities for engagement, to assess competitor activity and share of voice, and to detect impending crises. It can also provide valuable information about emerging trends and what consumers and clients think about specific topics, brands or products. This is the work of a cross-section of groups that include market researchers, PR staff, marketing teams, social-engagement, and community staff, agencies and sales teams. Several different providers have developed tools to facilitate the monitoring of a variety of social media channels - from blogging to internet video to internet forums. This allows companies to track what consumers say about their brands and actions. Companies can then react to these conversations and interact with consumers through social media platforms. === In government === Apart from commercial applications, social media monitoring has become a pervasive technique applied by public organizations and governments. Monitoring is a tradition within the public sector, and social-media monitoring provides a real-time approach to detecting and responding to social developments. Governments have come to realize the need for strategies to cope with surprises from the rapid expansion of public issues. Sobkowicz introduced a framework with three blocks of social-media opinion tracking, simulating and forecasting. It includes: real-time detection of emotions, topics and opinions information-flow modelling and agent-based simulation modeling of opinion networks Bekkers introduced the application of social media monitoring in the Netherlands. Public organizations in the Netherlands (such as the Tax Agency and the Education Ministry) have started to use social media monitoring to obtain better insights into the sentiments of target groups. On the one hand, the public sector will be enabled to provide timely and efficient answers to the public by using social media monitoring techniques, but on the other hand, they also have to deal with concerns about ethical issues such as transparency and privacy. == Quantifying social media == Social media management software (SMMS) is an application program or software that facilitates an organization's ability to successfully engage in social media across different communication channels. SMMS is used to monitor inbound and outbound conversations, support customer interaction, audit or document social marketing initiatives and evaluate the usefulness of a social media presence. It can be difficult to measure all social media conversations. Due to privacy settings and other issues, not all social media conversations can be found and reported by monitoring tools. However, whilst social media monitoring cannot give absolute figures, it can be extremely useful for identifying trends and for benchmarking, in addition to the uses mentioned above. These findings can, in turn, influence and shape future business decisions. In order to access social media data (posts, Tweets, and meta-data) and to analyze and monitor social media, many companies use software technologies built for business. These range from in-platform analytics dashboards to dedicated third-party platforms, which offer more advanced capabilities including cross-platform audience intelligence, sentiment analysis, and trend detection at scale. == Location-based == Most social media networks allow users to add a location to their posts (reference all of our feeds). The location can be classified as either 'at-the-location' or 'about-the-location'. "'At-the-location' services can be defined as services where location-based content is created at the geographic location. 'About-the-location' services can be defined as services which are referring to a particular location but the content is not necessarily created in this particular physical place." The added information available from geotagged (link to Geotagging article) posts means that they can be displayed on a map. This means that a location can be used as the start of a social media search rather than a keyword or hashtag. This has major implications for disaster relief, event monitoring, safety and security professionals since a large portion of their job is related to tracking and monitoring specific locations. == Technologies used == Various monitoring platforms use different technologies for social media monitoring and measurement. These technology providers may connect to the API provided by social platforms that are created for 3rd party developers to develop their own applications and services that access data. Facebook's Graph API is one such API that social media monitoring solution products would connect to pull data from. Some social media monitoring and analytics companies use calls to data providers each time an end-user d

    Read more →
  • Social profiling

    Social profiling

    Social profiling is the process of constructing a social media user's profile using their social data. In general, profiling refers to the data science process of generating a person's profile with computerized algorithms and technology. There are various platforms for sharing this information with the proliferation of growing popular social networks, including but not limited to LinkedIn, Google+, Facebook and Twitter. == Social profile and social data == A person's social data refers to the personal data that they generate either online or offline (for more information, see social data revolution). A large amount of these data, including one's language, location and interest, is shared through social media and social network. Users join multiple social media platforms and their profiles across these platforms can be linked using different methods to obtain their interests, locations, content, and friend list. Altogether, this information can be used to construct a person's social profile. Meeting the user's satisfaction level for information collection is becoming more challenging. This is because of too much "noise" generated, which affects the process of information collection due to explosively increasing online data. Social profiling is an emerging approach to overcome the challenges faced in meeting user's demands by introducing the concept of personalized search while keeping in consideration user profiles generated using social network data. A study reviews and classifies research inferring users social profile attributes from social media data as individual and group profiling. The existing techniques along with utilized data sources, the limitations, and challenges were highlighted. The prominent approaches adopted include machine learning, ontology, and fuzzy logic. Social media data from Twitter and Facebook have been used by most of the studies to infer the social attributes of users. The literature showed that user social attributes, including age, gender, home location, wellness, emotion, opinion, relation, influence are still need to be explored. === Personalized meta-search engines === The ever-increasing online content has resulted in the lack of proficiency of centralized search engine's results. It can no longer satisfy user's demand for information. A possible solution that would increase coverage of search results would be meta-search engines, an approach that collects information from numerous centralized search engines. A new problem thus emerges, that is too much data and too much noise is generated in the collection process. Therefore, a new technique called personalized meta-search engines was developed. It makes use of a user's profile (largely social profile) to filter the search results. A user's profile can be a combination of a number of things, including but not limited to, "a user's manual selected interests, user's search history", and personal social network data. == Social media profiling == According to Samuel D. Warren II and Louis Brandeis (1890), disclosure of private information and the misuse of it can hurt people's feelings and cause considerable damage in people's lives. Social networks provide people access to intimate online interactions; therefore, information access control, information transactions, privacy issues, connections and relationships on social media have become important research fields and are subjects of concern to the public. Ricard Fogues and other co-authors state that "any privacy mechanism has at its base an access control", that dictate "how permissions are given, what elements can be private, how access rules are defined, and so on". Current access control for social media accounts tend to still be very simplistic: there is very limited diversity in the category of relationships on for social network accounts. User's relationships to others are, on most platforms, only categorized as "friend" or "non-friend" and people may leak important information to "friends" inside their social circle but not necessarily users to they consciously want to share the information to. The below section is concerned with social media profiling and what profiling information on social media accounts can achieve. === Privacy leaks === A lot of information is voluntarily shared on online social networks, such as photos and updates on life activities (new job, hobbies, etc.). People rest assured that different social network accounts on different platforms will not be linked as long as they do not grant permission to these links. However, according to Diane Gan, information gathered online enables "target subjects to be identified on other social networking sites such as Foursquare, Instagram, LinkedIn, Facebook and Google+, where more personal information was leaked". The majority of social networking platforms use the "opt out approach" for their features. If users wish to protect their privacy, it is user's own responsibility to check and change the privacy settings as a number of them are set to default option. A major social network platforms have developed geo-tag functions and are in popular usage. This is concerning because 39% of users have experienced profiling hacking; 78% burglars have used major social media networks and Google Street-view to select their victims; and an astonishing 54% of burglars attempted to break into empty houses when people posted their status updates and geo-locations. === Facebook === Formation and maintenance of social media accounts and their relationships with other accounts are associated with various social outcomes. In 2015, for many firms, customer relationship management is essential and is partially done through Facebook. Before the emergence and prevalence of social media, customer identification was primarily based upon information that a firm could directly acquire: for example, it may be through a customer's purchasing process or voluntary act of completing a survey/loyalty program. However, the rise of social media has greatly reduced the approach of building a customer's profile/model based on available data. Marketers now increasingly seek customer information through Facebook; this may include a variety of information users disclose to all users or partial users on Facebook: name, gender, date of birth, e-mail address, sexual orientation, marital status, interests, hobbies, favorite sports team(s), favorite athlete(s), or favorite music, and more importantly, Facebook connections. However, due to the privacy policy design, acquiring true information on Facebook is no trivial task. Often, Facebook users either refuse to disclose true information (sometimes using pseudonyms) or setting information to be only visible to friends, Facebook users who "LIKE" your page are also hard to identify. To do online profiling of users and cluster users, marketers and companies can and will access the following kinds of data: gender, the IP address and city of each user through the Facebook Insight page, who "LIKED" a certain user, a page list of all the pages that a person "LIKED" (transaction data), other people that a user follow (even if it exceeds the first 500, which we usually can not see) and all the publicly shared data. === Twitter === First launched on the Internet in March 2006, Twitter is a platform on which users can connect and communicate with any other user in just 280 characters. Like Facebook, Twitter is also a crucial tunnel for users to leak important information, often unconsciously, but able to be accessed and collected by others. According to Rachel Nuwer, in a sample of 10.8 million tweets by more than 5,000 users, their posted and publicly shared information are enough to reveal a user's income range. A postdoctoral researcher from the University of Pennsylvania, Daniel Preoţiuc-Pietro and his colleagues were able to categorize 90% of users into corresponding income groups. Their existing collected data, after being fed into a machine-learning model, generated reliable predictions on the characteristics of each income group. The mobile app called Streamd.in displays live tweets on Google Maps by using geo-location details attached to the tweet, and traces the user's movement in the real world. === Profiling photos on social network === The advent and universality of social media networks have boosted the role of images and visual information dissemination. Many types of visual information on social media transmit messages from the author, location information and other personal information. For example, a user may post a photo of themselves in which landmarks are visible, which can enable other users to determine where they are. In a study done by Cristina Segalin, Dong Seon Cheng and Marco Cristani, they found that profiling user posts' photos can reveal personal traits such as personality and mood. In the study, convolutional neural networks (CNNs) is introduced. It builds on the main characteristics of computational

    Read more →
  • Ciphertext

    Ciphertext

    In cryptography, ciphertext or cyphertext is the result of encryption performed on plaintext using an algorithm, called a cipher. Ciphertext is also known as encrypted or encoded information because it contains a form of the original plaintext that is unreadable by a human or computer without the proper cipher to decrypt it. This process prevents the loss of sensitive information via hacking. Decryption, the inverse of encryption, is the process of turning ciphertext into readable plaintext. Ciphertext is not to be confused with codetext, because the latter is a result of a code, not a cipher. == Conceptual underpinnings == Let m {\displaystyle m\!} be the plaintext message that Alice wants to secretly transmit to Bob and let E k {\displaystyle E_{k}\!} be the encryption cipher, where k {\displaystyle _{k}\!} is a cryptographic key. Alice must first transform the plaintext into ciphertext, c {\displaystyle c\!} , in order to securely send the message to Bob, as follows: c = E k ( m ) . {\displaystyle c=E_{k}(m).\!} In a symmetric-key system, Bob knows Alice's encryption key. Once the message is encrypted, Alice can safely transmit it to Bob (assuming no one else knows the key). In order to read Alice's message, Bob must decrypt the ciphertext using E k − 1 {\displaystyle {E_{k}}^{-1}\!} which is known as the decryption cipher, D k : {\displaystyle D_{k}:\!} D k ( c ) = D k ( E k ( m ) ) = m . {\displaystyle D_{k}(c)=D_{k}(E_{k}(m))=m.\!} Alternatively, in a non-symmetric key system, everyone, not just Alice and Bob, knows the encryption key; but the decryption key cannot be inferred from the encryption key. Only Bob knows the decryption key D k , {\displaystyle D_{k},} and decryption proceeds as D k ( c ) = m . {\displaystyle D_{k}(c)=m.} == Types of ciphers == The history of cryptography began thousands of years ago. Cryptography uses a variety of different types of encryption. Earlier algorithms were performed by hand and are substantially different from modern algorithms, which are generally executed by a machine. === Historical ciphers === Historical pen and paper ciphers used in the past are sometimes known as classical ciphers. They include: Substitution cipher: the units of plaintext are replaced with ciphertext (e.g., Caesar cipher and one-time pad) Polyalphabetic substitution cipher: a substitution cipher using multiple substitution alphabets (e.g., Vigenère cipher and Enigma machine) Polygraphic substitution cipher: the unit of substitution is a sequence of two or more letters rather than just one (e.g., Playfair cipher) Transposition cipher: the ciphertext is a permutation of the plaintext (e.g., rail fence cipher) Historical ciphers are not generally used as a standalone encryption technique because they are quite easy to crack. Many of the classical ciphers, with the exception of the one-time pad, can be cracked using brute force. === Modern ciphers === Modern ciphers are more secure than classical ciphers and are designed to withstand a wide range of attacks. An attacker should not be able to find the key used in a modern cipher, even if they know any specifics about the plaintext and its corresponding ciphertext. Modern encryption methods can be divided into the following categories: Private-key cryptography (symmetric key algorithm): one shared key is used for encryption and decryption Public-key cryptography (asymmetric key algorithm): two different keys are used for encryption and decryption In a symmetric key algorithm (e.g., DES, AES), the sender and receiver have a shared key established in advance: the sender uses the shared key to perform encryption; the receiver uses the shared key to perform decryption. Symmetric key algorithms can either be block ciphers or stream ciphers. Block ciphers operate on fixed-length groups of bits, called blocks, with an unvarying transformation. Stream ciphers encrypt plaintext digits one at a time on a continuous stream of data, with the transformation of successive digits varying during the encryption process. In an asymmetric key algorithm (e.g., RSA), there are two different keys: a public key and a private key. The public key is published, thereby allowing any sender to perform encryption. The private key is kept secret by the receiver, thereby allowing only the receiver to correctly perform decryption. == Cryptanalysis == Cryptanalysis (also referred to as codebreaking or cracking the code) is the study of applying various methodologies to obtain the meaning of encrypted information, without having access to the cipher required to correctly decrypt the information. This typically involves gaining an understanding of the system design and determining the cipher. Cryptanalysts can follow one or more attack models to crack a cipher, depending upon what information is available and the type of cipher being analyzed. Ciphertext is generally the most easily obtained part of a cryptosystem and therefore is an important part of cryptanalysis. === Attack models === Ciphertext-only: the cryptanalyst has access only to a collection of ciphertexts or code texts. This is the weakest attack model because the cryptanalyst has limited information. Modern ciphers rarely fail under this attack. Known-plaintext: the attacker has a set of ciphertexts to which they know the corresponding plaintext Chosen-plaintext attack: the attacker can obtain the ciphertexts corresponding to an arbitrary set of plaintexts of their own choosing Batch chosen-plaintext attack: where the cryptanalyst chooses all plaintexts before any of them are encrypted. This is often the meaning of an unqualified use of "chosen-plaintext attack". Adaptive chosen-plaintext attack: where the cryptanalyst makes a series of interactive queries, choosing subsequent plaintexts based on the information from the previous encryptions. Chosen-ciphertext attack: the attacker can obtain the plaintexts corresponding to an arbitrary set of ciphertexts of their own choosing Adaptive chosen-ciphertext attack Indifferent chosen-ciphertext attack Related-key attack: similar to a chosen-plaintext attack, except the attacker can obtain ciphertexts encrypted under two different keys. The keys are unknown, but the relationship between them is known (e.g., two keys that differ in the one bit). == Famous ciphertexts == The Babington Plot ciphers The Shugborough inscription The Zimmermann Telegram The Magic Words are Squeamish Ossifrage The cryptogram in "The Gold-Bug" Beale ciphers Kryptos Zodiac Killer ciphers

    Read more →
  • Database index

    Database index

    A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure. Indexes are used to quickly locate data without having to search every row in a database table every time said table is accessed. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records. An index is a copy of selected columns of data, from a table, that is designed to enable very efficient search. An index normally includes a "key" or direct link to the original row of data from which it was copied, to allow the complete row to be retrieved efficiently. Some databases extend the power of indexing by letting developers create indexes on column values that have been transformed by functions or expressions. For example, an index could be created on upper(last_name), which would only store the upper-case versions of the last_name field in the index. Another option sometimes supported is the use of partial index, where index entries are created only for those records that satisfy some conditional expression. A further aspect of flexibility is to permit indexing on user-defined functions, as well as expressions formed from an assortment of built-in functions. == Usage == === Support for fast lookup === Most database software includes indexing technology that enables sub-linear time lookup to improve performance, as linear search is inefficient for large databases. Suppose a database contains N data items and one must be retrieved based on the value of one of the fields. A simple implementation retrieves and examines each item according to the test. If there is only one matching item, this can stop when it finds that single item, but if there are multiple matches, it must test everything. This means that the number of operations in the average case is O(N) or linear time. Since databases may contain many objects, and since lookup is a common operation, it is often desirable to improve performance. An index is any data structure that improves the performance of lookup. There are many different data structures used for this purpose. There are complex design trade-offs involving lookup performance, index size, and index-update performance. Many index designs exhibit logarithmic (O(log(N))) lookup performance and in some applications it is possible to achieve flat (O(1)) performance. === Policing the database constraints === Indexes are used to police database constraints, such as UNIQUE, EXCLUSION, PRIMARY KEY and FOREIGN KEY. An index may be declared as UNIQUE, which creates an implicit constraint on the underlying table. Database systems usually implicitly create an index on a set of columns declared PRIMARY KEY, and some are capable of using an already-existing index to police this constraint. Many database systems require that both referencing and referenced sets of columns in a FOREIGN KEY constraint are indexed, thus improving performance of inserts, updates and deletes to the tables participating in the constraint. Some database systems support an EXCLUSION constraint that ensures that, for a newly inserted or updated record, a certain predicate holds for no other record. This can be used to implement a UNIQUE constraint (with equality predicate) or more complex constraints, like ensuring that no overlapping time ranges or no intersecting geometry objects would be stored in the table. An index supporting fast searching for records satisfying the predicate is required to police such a constraint. == Index architecture and indexing methods == === Non-clustered === The data is present in arbitrary order, but the logical ordering is specified by the index. The data rows may be spread throughout the table regardless of the value of the indexed column or expression. The non-clustered index tree contains the index keys in sorted order, with the leaf level of the index containing the pointer to the record (page and the row number in the data page in page-organized engines; row offset in file-organized engines). In a non-clustered index, The physical order of the rows is not the same as the index order. The indexed columns are typically non-primary key columns used in JOIN, WHERE, and ORDER BY clauses. There can be more than one non-clustered index on a database table. === Clustered === Clustering alters the data block into a certain distinct order to match the index, resulting in the row data being stored in order. Therefore, only one clustered index can be created on a given database table. Clustered indexes can greatly increase overall speed of retrieval, but usually only where the data is accessed sequentially in the same or reverse order of the clustered index, or when a range of items is selected. Since the physical records are in this sort order on disk, the next row item in the sequence is immediately before or after the last one, and so fewer data block reads are required. The primary feature of a clustered index is therefore the ordering of the physical data rows in accordance with the index blocks that point to them. Some databases separate the data and index blocks into separate files, others put two completely different data blocks within the same physical file(s). === Cluster === When multiple databases and multiple tables are joined, it is called a cluster (not to be confused with clustered index described previously). The records for the tables sharing the value of a cluster key shall be stored together in the same or nearby data blocks. This may improve the joins of these tables on the cluster key, since the matching records are stored together and less I/O is required to locate them. The cluster configuration defines the data layout in the tables that are parts of the cluster. A cluster can be keyed with a B-tree index or a hash table. The data block where the table record is stored is defined by the value of the cluster key. == Column order == The order that the index definition defines the columns in is important. It is possible to retrieve a set of row identifiers using only the first indexed column. However, it is not possible or efficient (on most databases) to retrieve the set of row identifiers using only the second or greater indexed column. For example, in a phone book organized by city first, then by last name, and then by first name, in a particular city, one can easily extract the list of all phone numbers. However, it would be very tedious to find all the phone numbers for a particular last name. One would have to look within each city's section for the entries with that last name. Some databases can do this, others just won't use the index. In the phone book example with a composite index created on the columns (city, last_name, first_name), if we search by giving exact values for all the three fields, search time is minimal—but if we provide the values for city and first_name only, the search uses only the city field to retrieve all matched records. Then a sequential lookup checks the matching with first_name. So, to improve the performance, one must ensure that the index is created on the order of search columns. == Applications and limitations == Indexes are useful for many applications but come with some limitations. Consider the following SQL statement: SELECT first_name FROM people WHERE last_name = 'Smith';. To process this statement without an index the database software must look at the last_name column on every row in the table (this is known as a full table scan). With an index the database simply follows the index data structure (typically a B-tree) until the Smith entry has been found; this is much less computationally expensive than a full table scan. Consider this SQL statement: SELECT email_address FROM customers WHERE email_address LIKE '%@wikipedia.org';. This query would yield an email address for every customer whose email address ends with "@wikipedia.org", but even if the email_address column has been indexed the database must perform a full index scan. This is because the index is built with the assumption that words go from left to right. With a wildcard at the beginning of the search-term, the database software is unable to use the underlying index data structure (in other words, the WHERE-clause is not sargable). This problem can be solved through the addition of another index created on reverse(email_address) and a SQL query like this: SELECT email_address FROM customers WHERE reverse(email_address) LIKE reverse('%@wikipedia.org');. This puts the wild-card at the right-most part of the query (now gro.aidepikiw@%), which the index on reverse(email_address) can satisfy. When the wildcard characters are used on both sides of the search word as %wikipedia.org%, the index available on this field is not used. Rather only a sequential search is performed, which takes ⁠ O ( N ) {\displaystyle

    Read more →
  • Content-oriented workflow models

    Content-oriented workflow models

    In data management, a content-oriented workflow model seeks to articulate workflow progression by the presence of content units (like data-records/objects/documents). Most content-oriented workflow approaches provide a life-cycle model for content units, such that workflow progression can be qualified by conditions on the state of the units. Most approaches are research and work in progress and the content models and life-cycle models are more or less formalized. The term content-oriented workflows is an umbrella term for several scientific workflow approaches, namely "data-driven", "resource-driven", "artifact-centric", "object-aware", and "document-oriented". Thus, the meaning of "content" ranges from simple data attributes to self-contained documents; the term "content-oriented workflows" appeared at first in as an umbrella term. Such a general term, independent from a specific approach, is necessary to contrast the content-oriented modelling principle with traditional activity-oriented workflow models (like Petri nets or BPMN) where a workflow is driven by a control flow and where the content production perspective is neglected or even missing. The term "content" was chosen to subsume the different levels in granularity of the content units in the respective workflow models; it was also chosen to make associations with content management. Both terms "artifact-centric" and "data-driven" would also be good candidates for an umbrella term, but each is closely related to a specific approach of a single working group. The "artifact-centric" group itself (i.e. IBM Research) has generalized the characteristics of their approach and has used "information-centric" as an umbrella term in. Yet, the term information is too unspecific in the context of computer science, thus, "content-orientated workflows" is considered as good compromise. == Workflow Model Approaches == === Data-driven === The data-driven process structures provides a sophisticated workflow model being specialized on hierarchical write-and-review-processes. The approach provides interleaved synchronization of sub-processes and extends activity diagrams. Unfortunately, the COREPRO prototype implementation is not publicly available. Research on the project had been ceased. The general idea has been continued by Reichert in form of the #Object-aware approach. Synonyms data-driven process structures / data-driven modeling and coordination Protagonists Dr. Dominic Müller (University of Twente), Joachim Herbst (DaimlerChrysler Research), and Manfred Reichert (at this time Assoc. Prof. at Univ. of Twente, currently Prof. at Ulm Univ.) Organization(s) University of Twente, DaimlerChrysler Period 2005 - 2007 Selected publications Implementation COREPRO === Resource-driven === The resource-driven workflow system is an early approach that considered workflows from a content-oriented perspective and emphasizes on the missing support for plain document-driven processes by traditional activity-oriented workflow engines. The resource-driven approach demonstrated the application of database triggers for handling workflow events. Still the system implementation is centralized and the workflow schema is statically defined. The project appeared in 2005 but many aspects are considered future work by the authors. Research did not continue on the project. Wang completed his PhD thesis in 2009, yet, his thesis does not mention the resource-driven approach to workflow modelling but is about discrete event simulation. Synonyms Resource-based Workflows / Document-Driven Workflow Systems Protagonists Jianrui Wang and Prof. Akhil Kumar Organization Pennsylvania State University Period 2005 - today Selected publications Implementation N/A === Artifact-centric === The artifact-centric approach provides a framework for content-oriented workflows. In this model, the enterprise application landscape includes distributed business services, while the workflow engine is centralized. Process enactment is integrated with database management system infrastructure, and the project is funded by IBM. Synonyms artifact-centric business process models / artifact-based business process (ACP) / artifact-centric workflows Protagonists Richard Hull and Dr. Kamal Bhattacharya as well as Cagdas E. Gerede and Jianwen Su Organization IBM (T.J. Watson Research Center, NY) Period 2007 - today Selected publications Implementation ArtiFact === Object-aware === The object-aware approach manages a set of object types and generates forms for creating object instances. The form completion flow is controlled by transitions between object configurations each describing a progressing set of mandatory attributes. Each object configuration is named by an object state. The data production flow is user-shifting and it is discrete by defining a sequence of object states. The discussion is currently limited to a centralized system, without any workflows across different organizations. However, the approach is of great relevance to many domains like concurrent engineering. Finally, the object-aware approach and its PHILharmonicFlows system are going to provide general-purpose workflow systems for generic enactment of data production processes. Synonyms object-aware process management / datenorientiertes Prozess-Management-System Protagonists Vera Künzle and Prof. Manfred Reichert Organization Ulm University Period 2009 - today Selected publications Implementation PHILharmonicFlows === Distributed Document-oriented === Distributed document-oriented process management (dDPM) enables distributed case handling in heterogeneous system environments and it is based on document-oriented integration. The workflow model reflects the paper-based working practice in inter-institutional healthcare scenarios. It targets distributed knowledge-driven ad hoc workflows, wherein distributed information systems are required to coordinate work with initially unknown sets of actors and activities. The distributed workflow engine supports process planning & process history as well as participant management and process template creation with import/export. The workflow engine embeds a functional fusion of 1) group-based instant messaging 2) with a shared work list editor 3) with version control. The software implementation of dDPM is α-Flow which is available as open source. dDPM and α-Flow provide a content-oriented approach to schema-less workflows. The complete distributed case handling application is provided in form of a single active Document ("α-Doc"). The α-Doc is a case file (as information carrier) with an embedded workflow engine (in form of active properties). Inviting process participants is equivalent to providing them with a copy of an α-Doc, copying it like an ordinary desktop file. All α-Docs that belong to the same case can synchronize each other, based on the participant management, electronic postboxes, store-and-forward messaging, and an offline-capable synchronization protocol. Synonyms distributed document-oriented process management (dDPM), distributed case handling via active documents Protagonists Christoph P. Neumann and Prof. Richard Lenz Organization Friedrich-Alexander-Universität Erlangen-Nürnberg Period 2009 - 2012 Selected Publications and a PhD thesis Implementation α-Flow (open source) == Related Concepts == === Content Management === The bandwidth of Content management systems (CMS) reaches from Web content management systems (WCMS) and Document management system (DMS) to Enterprise Content Management (ECM). Mature DMS products support document production workflows in a basic form, primarily focusing on review cycle workflows concerning a single document. === Groupware and Computer-Supported Cooperative Work === Groupware focuses on messaging (like E-Mail, Chat, and Instant Messaging), shared calendars (e.g. Lotus Notes, Microsoft Outlook with Exchange Server), and conferencing (e.g. Skype). Groupware overlaps with Computer-supported cooperative work (CSCW), that originated from shared multimedia editors (for live drawing/sketching) and synchronous multi-user applications like desktop sharing. The extensive conceptual claim of CSWC must be put into perspective by its actual solution scope, that is available as the CSCW Matrix. === Case Handling === The case handling paradigm stems from Prof. van der Aalst and gained momentum in 2005. The core features are: (a) provide all information available, i.e. present the case as a whole rather than showing bits and pieces, (b) decide about activities on the basis of the information available rather than the activities already executed, (c) separate work distribution from authorization and allow for additional types of roles, not just the execute role, and (d) allow workers to view and add/modify data before or after the corresponding activities have been executed. In healthcare, the flow of a patient between healthcare professionals is considered as a workflow - with activities that inc

    Read more →
  • Completeness (cryptography)

    Completeness (cryptography)

    In cryptography, a boolean function is said to be complete if the value of each output bit depends on all input bits. This is a desirable property to have in an encryption cipher, so that if one bit of the input (plaintext) is changed, every bit of the output (ciphertext) has an average of 50% probability of changing. The easiest way to show why this is good is the following: consider that if we changed our 8-byte plaintext's last byte, it would only have any effect on the 8th byte of the ciphertext. This would mean that if the attacker guessed 256 different plaintext-ciphertext pairs, he would always know the last byte of every 8byte sequence we send (effectively 12.5% of all our data). Finding out 256 plaintext-ciphertext pairs is not hard at all in the internet world, given that standard protocols are used, and standard protocols have standard headers and commands (e.g. "get", "put", "mail from:", etc.) which the attacker can safely guess. On the other hand, if our cipher has this property (and is generally secure in other ways, too), the attacker would need to collect 264 (~1020) plaintext-ciphertext pairs to crack the cipher in this way.

    Read more →