AI For Psychology Students

AI For Psychology Students — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Closest point method

    Closest point method

    The closest point method (CPM) is an embedding method for solving partial differential equations on surfaces. The closest point method uses standard numerical approaches such as finite differences, finite element or spectral methods in order to solve the embedding partial differential equation (PDE) which is equal to the original PDE on the surface. The solution is computed in a band surrounding the surface in order to be computationally efficient. In order to extend the data off the surface, the closest point method uses a closest point representation. This representation extends function values to be constant along directions normal to the surface. == Definitions == Closest Point function: Given a surface S , c p ( x ) {\displaystyle {\mathcal {S}},cp(\mathbf {x} )} refers to a (possibly non-unique) point belonging to S {\displaystyle {\mathcal {S}}} , which is closest to x {\displaystyle \mathbf {x} } [SE]. Closest point extension: Let S {\displaystyle {\mathcal {S}}} , be a smooth surface in R d {\displaystyle \mathbb {R} ^{d}} . The closest point extension of a function u : S → R {\displaystyle u:{\mathcal {S}}\rightarrow \mathbb {R} } , to a neighborhood Ω {\displaystyle \Omega } of S {\displaystyle {\mathcal {S}}} , is the function v : Ω → R {\displaystyle v:\Omega \rightarrow \mathbb {R} } , defined by v ( x ) = u ( c p ( x ) ) {\displaystyle v(\mathbf {x} )=u(cp(\mathbf {x} ))} . == Closest point method == Initialization consists of these steps [EW]: If it is not already given, a closest point representation of the surface is constructed. A computational domain is chosen. Typically this is a band around the surface. Replace surface gradients by standard gradients in R 3 {\displaystyle \mathbb {R} ^{3}} . Solution is initialized by extending the initial surface data on to the computational domain using the closest point function. After initialization, alternate between the following two steps: Using the closest point function, extend the solution off the surface to the computational domain. Compute the solution to the embedding PDE on a Cartesian mesh in the computational domain for one time step. == Banding == The surface PDE is extended into R 3 {\displaystyle \mathbb {R} ^{3}} however it is only necessary to solve this new PDE near the surface. Hence, we solve the PDE in a band surrounding the surface for efficient computational purposes. Ω c x : ‖ x − c p ( x ) ‖ 2 ≤ λ {\displaystyle \Omega _{c}{x:\|x-cp(x)\|_{2}\leq \lambda }} where λ {\displaystyle \lambda } is the bandwidth. == Example: Heat equation on a circle == Using initial profile u S ( θ , t ) = sin ⁡ ( θ ) {\displaystyle u_{S}(\theta ,t)=\sin(\theta )} leads to the solution u S ( θ , t ) = exp ⁡ ( − t ) sin ⁡ ( θ ) {\displaystyle u_{S}(\theta ,t)=\exp(-t)\sin(\theta )} for the heat equation. Forward Euler time-stepping is used with relation Δ t = 0.1 Δ x 2 {\displaystyle \Delta t=0.1\Delta x^{2}} and degree-four interpolation polynomials for the interpolations. Second-order centered differences are used for the spatial discretization. The CPM results in the expected second order error in the solution u {\displaystyle u} . == Applications == The closest point method can be applied to various PDEs on surfaces. Reaction–diffusion problems on point clouds [RD], eigenvalue problems [EV], and level set equations [LS] are a few examples.

    Read more →
  • Bitmap index

    Bitmap index

    A bitmap index is a special kind of database index that uses bitmaps. Bitmap indexes have traditionally been considered to work well for low-cardinality columns, which have a modest number of distinct values, either absolutely, or relative to the number of records that contain the data. The extreme case of low cardinality is Boolean data (e.g., does a resident in a city have internet access?), which has two values, True and False. Bitmap indexes use bit arrays (commonly called bitmaps) and answer queries by performing bitwise logical operations on these bitmaps. Bitmap indexes have a significant space and performance advantage over other structures for query of such data. Their drawback is they are less efficient than the traditional B-tree indexes for columns whose data is frequently updated: consequently, they are more often employed in read-only systems that are specialized for fast query - e.g., data warehouses, and generally unsuitable for online transaction processing applications. Some researchers argue that bitmap indexes are also useful for moderate or even high-cardinality data (e.g., unique-valued data) which is accessed in a read-only manner, and queries access multiple bitmap-indexed columns using the AND, OR or XOR operators extensively. Bitmap indexes are also useful in data warehousing applications for joining a large fact table to smaller dimension tables such as those arranged in a star schema. == Example == Continuing the internet access example, a bitmap index may be logically viewed as follows: On the left, Identifier refers to the unique number assigned to each resident, HasInternet is the data to be indexed, the content of the bitmap index is shown as two columns under the heading bitmaps. Each column in the left illustration under the Bitmaps header is a bitmap in the bitmap index. In this case, there are two such bitmaps, one for "has internet" Yes and one for "has internet" No. It is easy to see that each bit in bitmap Y shows whether a particular row refers to a person who has internet access. This is the simplest form of bitmap index. Most columns will have more distinct values. For example, the sales amount is likely to have a much larger number of distinct values. Variations on the bitmap index can effectively index this data as well. We briefly review three such variations. Note: Many of the references cited here are reviewed at (John Wu (2007)). For those who might be interested in experimenting with some of the ideas mentioned here, many of them are implemented in open source software such as FastBit, the Lemur Bitmap Index C++ Library, the Roaring Bitmap Java library and the Apache Hive Data Warehouse system. == Compression == For historical reasons, bitmap compression and inverted list compression were developed as separate lines of research, and only later were recognized as solving essentially the same problem. Software can compress each bitmap in a bitmap index to save space. There has been considerable amount of work on this subject. Though there are exceptions such as Roaring bitmaps, Bitmap compression algorithms typically employ run-length encoding, such as the Byte-aligned Bitmap Code, the Word-Aligned Hybrid code, the Partitioned Word-Aligned Hybrid (PWAH) compression, the Position List Word Aligned Hybrid, the Compressed Adaptive Index (COMPAX), Enhanced Word-Aligned Hybrid (EWAH) and the COmpressed 'N' Composable Integer SEt (CONCISE). These compression methods require very little effort to compress and decompress. More importantly, bitmaps compressed with BBC, WAH, COMPAX, PLWAH, EWAH and CONCISE can directly participate in bitwise operations without decompression. This gives them considerable advantages over generic compression techniques such as LZ77. BBC compression and its derivatives are used in a commercial database management system. BBC is effective in both reducing index sizes and maintaining query performance. BBC encodes the bitmaps in bytes, while WAH encodes in words, better matching current CPUs. "On both synthetic data and real application data, the new word aligned schemes use only 50% more space, but perform logical operations on compressed data 12 times faster than BBC." PLWAH bitmaps were reported to take 50% of the storage space consumed by WAH bitmaps and offer up to 20% faster performance on logical operations. Similar considerations can be done for CONCISE and Enhanced Word-Aligned Hybrid. The performance of schemes such as BBC, WAH, PLWAH, EWAH, COMPAX and CONCISE is dependent on the order of the rows. A simple lexicographical sort can divide the index size by 9 and make indexes several times faster. The larger the table, the more important it is to sort the rows. Reshuffling techniques have also been proposed to achieve the same results of sorting when indexing streaming data. == Encoding == Basic bitmap indexes use one bitmap for each distinct value. It is possible to reduce the number of bitmaps used by using a different encoding method. For example, it is possible to encode C distinct values using log(C) bitmaps with binary encoding. This reduces the number of bitmaps, further saving space, but to answer any query, most of the bitmaps have to be accessed. This makes it potentially not as effective as scanning a vertical projection of the base data, also known as a materialized view or projection index. Finding the optimal encoding method that balances (arbitrary) query performance, index size and index maintenance remains a challenge. Without considering compression, Chan and Ioannidis analyzed a class of multi-component encoding methods and came to the conclusion that two-component encoding sits at the kink of the performance vs. index size curve and therefore represents the best trade-off between index size and query performance. == Binning == For high-cardinality columns, it is useful to bin the values, where each bin covers multiple values and build the bitmaps to represent the values in each bin. This approach reduces the number of bitmaps used regardless of encoding method. However, binned indexes can only answer some queries without examining the base data. For example, if a bin covers the range from 0.1 to 0.2, then when the user asks for all values less than 0.15, all rows that fall in the bin are possible hits and have to be checked to verify whether they are actually less than 0.15. The process of checking the base data is known as the candidate check. In most cases, the time used by the candidate check is significantly longer than the time needed to work with the bitmap index. Therefore, binned indexes exhibit irregular performance. They can be very fast for some queries, but much slower if the query does not exactly match a bin. == History == The concept of bitmap index was first introduced by Professor Israel Spiegler and Rafi Maayan in their research "Storage and Retrieval Considerations of Binary Data Bases", published in 1985. The first commercial database product to implement a bitmap index was Computer Corporation of America's Model 204. Patrick O'Neil published a paper about this implementation in 1987. This implementation is a hybrid between the basic bitmap index (without compression) and the list of Row Identifiers (RID-list). Overall, the index is organized as a B+tree. When the column cardinality is low, each leaf node of the B-tree would contain long list of RIDs. In this case, it requires less space to represent the RID-lists as bitmaps. Since each bitmap represents one distinct value, this is the basic bitmap index. As the column cardinality increases, each bitmap becomes sparse and it may take more disk space to store the bitmaps than to store the same content as RID-lists. In this case, it switches to use the RID-lists, which makes it a B+tree index. == In-memory bitmaps == One of the strongest reasons for using bitmap indexes is that the intermediate results produced from them are also bitmaps and can be efficiently reused in further operations to answer more complex queries. Many programming languages support this as a bit array data structure. For example, Java has the BitSet class and .NET have the BitArray class. Some database systems that do not offer persistent bitmap indexes use bitmaps internally to speed up query processing. For example, PostgreSQL versions 8.1 and later implement a "bitmap index scan" optimization to speed up arbitrarily complex logical operations between available indexes on a single table. For tables with many columns, the total number of distinct indexes to satisfy all possible queries (with equality filtering conditions on either of the fields) grows very fast, being defined by this formula: C n [ n 2 ] ≡ n ! ( n − [ n 2 ] ) ! [ n 2 ] ! {\displaystyle \mathbf {C} _{n}^{\left[{\frac {n}{2}}\right]}\equiv {\frac {n!}{\left(n-\left[{\frac {n}{2}}\right]\right)!\left[{\frac {n}{2}}\right]!}}} . A bitmap index scan combines expressions on different indexes, thus requiring only one index per column t

    Read more →
  • Cambridge Analytica

    Cambridge Analytica

    Cambridge Analytica Ltd. (CA), previously known as SCL USA, was a British political consulting firm that came to prominence through the Facebook–Cambridge Analytica data scandal. It was founded in 2013, as a subsidiary of the private intelligence company and self-described "global election management agency" SCL Group by long-time SCL executives Nigel Oakes, Alexander Nix and Alexander Oakes, with Nix as CEO. Cambridge Analytica was hired by a variety of political actors, including the Trinidadian government in 2010 and the 2016 presidential campaigns of Ted Cruz and Donald Trump. The firm maintained offices in London, New York City, and Washington, D.C. The company closed operations in 2018 due to backlash from the scandal, although firms related to both Cambridge Analytica and its parent firm SCL still exist. == History == Cambridge Analytica was founded in 2013 as a subsidiary of the private intelligence company SCL Group, which describes itself as providing "data, analytics and strategy to governments and military organisations worldwide". The company was part of "an international web of companies" headed by the London-based SCL Group. Cambridge Analytica (SCL USA) was incorporated in January 2013 with its registered office being in Westferry Circus, London and consisting of just one staff member, director and CEO Alexander Nix (also appointed in January 2015). Nix was also the director of nine similar companies sharing the same registered offices in London, including Firecrest technologies, Emerdata and six SCL Group companies including "SCL elections limited". Nigel Oakes, known as the former boyfriend of Lady Helen Windsor, had founded the predecessor SCL Group in the 1990s, and in 2005 Oakes established SCL Group together with his brother Alexander Oakes and Alexander Nix; SCL Group was the parent company of Cambridge Analytica. Former Conservative minister and MP Sir Geoffrey Pattie was the founding chairman of SCL; Lord Ivar Mountbatten also joined Oakes as a director of the company. As a result of the Facebook–Cambridge Analytica data scandal, Nix was removed as CEO and replaced by Julian Wheatland before the company closed. Several of the company's executives were Old Etonians. The company's owners included several of the Conservative Party's largest donors such as billionaire Vincent Tchenguiz, former British Conservative minister Jonathan Marland, Baron Marland and the family of American hedge fund manager Robert Mercer. The company combined misappropriation of digital assets, data mining, data brokerage, and data analysis with strategic communication during electoral processes. While its parent SCL had focused on influencing elections in developing countries since the 1990s, Cambridge Analytica focused more on the western world, including the United Kingdom and the United States; CEO Alexander Nix has said CA was involved in 44 U.S. political races in 2014. In 2015, CA performed data analysis services for Ted Cruz's presidential campaign. In 2016, CA worked for Donald Trump's presidential campaign as well as for Leave.EU (one of the organisations campaigning in the United Kingdom's referendum on European Union membership). CA's role in those campaigns has been controversial and is the subject of ongoing inquiries in both countries. Political scientists question CA's claims about the effectiveness of its methods of targeting voters. == Data scandal == In March 2018, media outlets broke news of Cambridge Analytica's business practices. The New York Times and The Observer reported that the company had acquired and used personal data about Facebook users from an external researcher who had told Facebook he was collecting it for academic purposes. Shortly afterwards, Channel 4 News aired undercover investigative videos showing Nix boasting about using prostitutes, bribery sting operations, and honey traps to discredit politicians on whom it had conducted opposition research, and saying that the company "ran all of (Donald Trump's) digital campaign". In response to the media reports, the Information Commissioner's Office (ICO) of the UK pursued a warrant to search the company's servers. Facebook banned Cambridge Analytica from advertising on its platform, saying that it had been deceived. On 23 March 2018, the British High Court granted the ICO a warrant to search Cambridge Analytica's London offices. As a result, Nix was suspended as CEO, and replaced by Julian Wheatland. The personal data of up to 87 million Facebook users were acquired via the 270,000 Facebook users who used a Facebook app created by Aleksandr Kogan called "This Is Your Digital Life". This was a personality profiling app and asked simple personality questions similar to other Facebook quizzes. Kogan was a scientist and psychologist, also being an employed lecturer for the University of Cambridge from 2012 to 2018. Alexander Nix claimed they had close to five thousand data points on each person who participated. They also gathered information through other data brokers ending with them acquiring millions of data points from American citizens. Kogan's app exploited a feature of Facebook's Graph API (version 1.0), which permitted any third-party app to access not only the app user's data, but also the full profile data of all of that user's Facebook friends, without those friends' knowledge or consent. This platform-wide design was available to all developers and was used by tens of thousands of apps; Facebook CEO Mark Zuckerberg later told the House Energy and Commerce Committee that the company was auditing "tens of thousands" of apps that had had access to large amounts of user data. Because the average Facebook user at the time had approximately 300 friends, the 270,000 users who installed Kogan's app yielded data on up to 87 million people. Facebook deprecated the friends-data API in April 2014 and shut it down entirely in April 2015, but data already collected by apps remained in developers' possession. Kogan passed this data to Cambridge Analytica, breaching Facebook's terms of service. On 1 May 2018, Cambridge Analytica and its parent company SCL filed for insolvency proceedings and closed operations. Alexander Tayler, a former director for Cambridge Analytica, was appointed director of Emerdata on 28 March 2018. Rebekah Mercer, Jennifer Mercer, Alexander Nix and Johnson Chun Shun Ko, who has links to American businessman Erik Prince, are in leadership positions at Emerdata. The Russo brothers are producing an upcoming film on Cambridge Analytica. In 2019 the Federal Trade Commission filed an administrative complaint against Cambridge Analytica for misuse of data. In 2020, the British Information Commissioner's Office closed a three-year inquiry into the company, concluded that Cambridge Analytica was "not involved" in the 2016 Brexit referendum and found no additional evidence for Russia's alleged interference during the campaign. US sensitive polling and election data, however, were passed to Russian Intelligence via a Cambridge Analytica contractor Sam Patten, Trump campaign manager Paul Manafort, and Russian agent Konstantin Kilimnik, who was indicted during the affair. Publicly, parent company SCL Group called itself a "global election management agency", Politico reported it was known for involvement "in military disinformation campaigns to social media branding and voter targeting". SCL gained work on a large number of campaigns for the US and UK governments' war on terror advancing their model of behavioral conflict during the 2000s. SCL's involvement in the political world has been primarily in the developing world where it has been used by the military and politicians to study and manipulate public opinion and political will. Slate writer Sharon Weinberger compared one of SCL's hypothetical test scenarios to fomenting a coup. Among the investors in Cambridge Analytica were some of the Conservative Party's largest donors such as billionaire Vincent Tchenguiz, former Conservative minister Jonathan Marland, Baron Marland, Roger Gabb, the family of American hedge fund manager Robert Mercer, and Steve Bannon. A minimum of 15 million dollars has been invested into the company by Mercer, according to The New York Times. Bannon's stake in the company was estimated at 1 to 5 million dollars, but he divested his holdings in April 2017 as required by his role as White House Chief Strategist. In March 2018, Jennifer Mercer and Rebekah Mercer became directors of Emerdata limited. In March 2018 it became public by Christopher Wylie, that Cambridge Analytica's first activities were founded on a data set, which its parent company SCL bought 2014 from a company named Global Science Research founded by Aleksandr Kogan and his team present across the world who worked as a psychologist at Cambridge. During Boris Johnson's tenure as foreign secretary, the Foreign Office sought advice from Cambridge Analytica and Boris Johnson had a meeting with Alexander N

    Read more →
  • List of cryptosystems

    List of cryptosystems

    A cryptosystem is a set of cryptographic algorithms that map ciphertexts and plaintexts to each other. == Private-key cryptosystems == Private-key cryptosystems use the same key for encryption and decryption. Caesar cipher Substitution cipher Enigma machine Data Encryption Standard Twofish Serpent Camellia Salsa20 ChaCha20 Blowfish CAST5 Kuznyechik RC4 3DES Skipjack Safer IDEA Advanced Encryption Standard, also known as AES and Rijndael. == Public-key cryptosystems == Public-key cryptosystems use a public key for encryption and a private key for decryption. Diffie–Hellman key exchange RSA encryption Rabin cryptosystem Schnorr signature ElGamal encryption Elliptic-curve cryptography Lattice-based cryptography McEliece cryptosystem Multivariate cryptography Isogeny-based cryptography

    Read more →
  • Shaded Picture System

    Shaded Picture System

    The Shaded Picture System was a 3D raster computer display processor introduced by Evans & Sutherland in October 1973. The Shaded Picture System was the first general-purpose, commercially available raster computer graphics display processor capable of real-time, shaded 3D graphics. It could only display black and white graphics at a resolution of 256 by 256. It was extremely expensive, and very few units were ever sold. == History == The principles of shaded, hidden-line true 3D graphics were pioneered at the University of Utah in 1967. However, this algorithm was slow and would take several minutes to produce an image. In 1970, Gary Watkins developed a FORTRAN simulator of a faster algorithm that would theoretically generate shaded 3D images in real-time, "if implemented in suitable hardware". The simulator itself was still not capable of real-time shaded 3D image rendering. Evans & Sutherland developed a functional prototype of this "suitable hardware", which was later sold as the Shaded Picture System in 1973. About a year earlier in 1972, Evans & Sutherland sold the first and only CT1 to Case Western Reserve University. The CT1, or Continuous Tone 1, was a specialized image generator, not meant as a marketable or mass-produced product. At the time, the CT1, along with G.E./NASA's upgraded Electronic Scene Generator from 1971, would have been the only real-time raster graphics systems sold to customers comparable to the Shaded Picture System, although both the CT1 and Electronic Scene Generator were intentionally produced as one-off products and specialized for the needs of their customers. The Shaded Picture System, in contrast, was intentionally marketed.In early 1975, Evans & Sutherland demonstrated a random-access video frame buffer using relatively low-cost semiconductor memory, which was much more capable than the Shaded Picture System. When interfaced with a (non-shaded) E&S Picture System, the frame buffer had a resolution of 512 by 512 in grayscale and partial color capabilities. By the end of 1975, this frame buffer was commercially available.

    Read more →
  • Sentiment analysis

    Sentiment analysis

    Sentiment analysis (also known as opinion mining) is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine. With the rise of deep language models, such as RoBERTa, more difficult data domains can be analyzed, e.g., news texts where authors typically express their opinion/sentiment less explicitly. == Types == A basic task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature/aspect level—whether the expressed opinion in a document, a sentence or an entity feature/aspect is positive, negative, or neutral. Advanced, "beyond polarity" sentiment classification looks, for instance, at emotional states such as enjoyment, anger, disgust, sadness, fear, and surprise. Precursors to sentimental analysis include the General Inquirer, which provided hints toward quantifying patterns in text and, separately, psychological research that examined a person's psychological state based on analysis of their verbal behavior. Subsequently, the method described in a patent by Volcani and Fogel, looked specifically at sentiment and identified individual words and phrases in text with respect to different emotional scales. A current system based on their work, called EffectCheck, presents synonyms that can be used to increase or decrease the level of evoked emotion in each scale. Many other subsequent efforts were less sophisticated, using a mere polar view of sentiment, from positive to negative, such as work by Turney, and Pang who applied different methods for detecting the polarity of product reviews and movie reviews respectively. This work is at the document level. One can also classify a document's polarity on a multi-way scale, which was attempted by Pang and Snyder among others: Pang and Lee expanded the basic task of classifying a movie review as either positive or negative to predict star ratings on either a 3- or a 4-star scale, while Snyder performed an in-depth analysis of restaurant reviews, predicting ratings for various aspects of the given restaurant, such as the food and atmosphere (on a five-star scale). First steps to bringing together various approaches—learning, lexical, knowledge-based, etc.—were taken in the 2004 AAAI Spring Symposium where linguists, computer scientists, and other interested researchers first aligned interests and proposed shared tasks and benchmark data sets for the systematic computational research on affect, appeal, subjectivity, and sentiment in text. Even though in most statistical classification methods, the neutral class is ignored under the assumption that neutral texts lie near the boundary of the binary classifier, several researchers suggest that, as in every polarity problem, three categories must be identified. Moreover, it can be proven that specific classifiers such as the Max Entropy and SVMs can benefit from the introduction of a neutral class and improve the overall accuracy of the classification. There are in principle two ways for operating with a neutral class. Either, the algorithm proceeds by first identifying the neutral language, filtering it out and then assessing the rest in terms of positive and negative sentiments, or it builds a three-way classification in one step. This second approach often involves estimating a probability distribution over all categories (e.g. naive Bayes classifiers as implemented by the NLTK). Whether and how to use a neutral class depends on the nature of the data: if the data is clearly clustered into neutral, negative and positive language, it makes sense to filter the neutral language out and focus on the polarity between positive and negative sentiments. If, in contrast, the data are mostly neutral with small deviations towards positive and negative affect, this strategy would make it harder to clearly distinguish between the two poles. A different method for determining sentiment is the use of a scaling system whereby words commonly associated with having a negative, neutral, or positive sentiment are given an associated number on a −10 to +10 scale (most negative up to most positive) or simply from 0 to a positive upper limit such as +4. This makes it possible to adjust the sentiment of a given term relative to its environment (usually on the level of the sentence). When a piece of unstructured text is analyzed using natural language processing, each concept in the specified environment is given a score based on the way sentiment words relate to the concept and its associated score. This allows movement to a more sophisticated understanding of sentiment, because it is now possible to adjust the sentiment value of a concept relative to modifications that may surround it. Words, for example, that intensify, relax or negate the sentiment expressed by the concept can affect its score. Alternatively, texts can be given a positive and negative sentiment strength score if the goal is to determine the sentiment in a text rather than the overall polarity and strength of the text. There are various other types of sentiment analysis, such as aspect-based sentiment analysis, grading sentiment analysis (positive, negative, neutral), multilingual sentiment analysis and detection of emotions. === Subjectivity/objectivity identification === This task is commonly defined as classifying a given text (usually a sentence) into one of two classes: objective or subjective. This problem can sometimes be more difficult than polarity classification. The subjectivity of words and phrases may depend on their context and an objective document may contain subjective sentences (e.g., a news article quoting people's opinions). Moreover, as mentioned by Su, results are largely dependent on the definition of subjectivity used when annotating texts. However, Pang showed that removing objective sentences from a document before classifying its polarity helped improve performance. Subjective and objective identification, emerging subtasks of sentiment analysis to use syntactic, semantic features, and machine learning knowledge to identify if a sentence or document contains facts or opinions. Awareness of recognizing factual and opinions is not recent, having possibly first presented by Carbonell at Yale University in 1979. The term objective refers to the incident carrying factual information. Example of an objective sentence: 'To be elected president of the United States, a candidate must be at least thirty-five years of age.' The term subjective describes the incident contains non-factual information in various forms, such as personal opinions, judgment, and predictions, also known as 'private states'. In the example down below, it reflects a private states 'We Americans'. Moreover, the target entity commented by the opinions can take several forms from tangible product to intangible topic matters stated in Liu (2010). Furthermore, three types of attitudes were observed by Liu (2010), 1) positive opinions, 2) neutral opinions, and 3) negative opinions. Example of a subjective sentence: 'We Americans need to elect a president who is mature and who is able to make wise decisions.' This analysis is a classification problem. Each class's collections of words or phrase indicators are defined for to locate desirable patterns on unannotated text. For subjective expression, a different word list has been created. Lists of subjective indicators in words or phrases have been developed by multiple researchers in the linguist and natural language processing field states in Riloff et al. (2003). A dictionary of extraction rules has to be created for measuring given expressions. Over the years, in subjective detection, the features extraction progression from curating features by hand to automated features learning. At the moment, automated learning methods can further separate into supervised and unsupervised machine learning. Patterns extraction with machine learning process annotated and unannotated text have been explored extensively by academic researchers. However, researchers recognized several challenges in developing fixed sets of rules for expressions respectably. Much of the challenges in rule development stems from the nature of textual information. Six challenges have been recognized by several researchers: 1) metaphorical expressions, 2) discrepancies in writings, 3) context-sensitive, 4) represented words with fewer usages, 5) time-sensitive, and 6) ever-growing volume. Metaphorical expressions. The text contains metaphoric expression may impact on the performance on the extraction. Besides, metaphors take in different forms, which may have been contribu

    Read more →
  • CARE Principles for Indigenous Data Governance

    CARE Principles for Indigenous Data Governance

    The CARE Principles for Indigenous Data Governance are a set of principles intended to guide open data projects in engaging Indigenous Peoples rights and interests. CARE was created in 2019 by the International Indigenous Data Sovereignty Interest Group, a group that is a part of the Research Data Alliance. It outlines collective rights related to open data in the context of the United Nations Declaration on the Rights of Indigenous Peoples and Indigenous data sovereignty. CARE is an acronym which stands for Collective Benefit, Authority to Control, Responsibility, Ethics. The CARE Principles are 'people and purpose-oriented, reflecting the crucial role of data in advancing Indigenous innovation and self-determination', and intended as a complement to the data-oriented perspective of other standards such as FAIR data (findable, accessible, interoperable, reusable). The CARE principles have been embedded into the Beta version of Standardised Data on Initiatives (STARDIT). CARE principles were the basis of a submission to the UN's Global Digital Compact.

    Read more →
  • Influencer

    Influencer

    An influencer is an individual who has the capacity to shape the attitudes, behavior, or decisions of others through authority, knowledge, position, or the nature of the relationship with the audience. The term is used in various fields such as media, business, politics, religion, and communication, referring to influencers such as social media influencers, podcasters, public speakers, religious influencers, writers, and newsletter writers etc who have dedicated followings in various areas. One writer defines influencers as "a range of third parties who exercise influence over the organization and its potential customers." Another writer defines an influencer as a "third party who significantly shapes the customer's purchasing decision but may never be accountable for it." According to another writer, influencers are "well-connected, create an impact, have active minds, and are trendsetters". Just because a person has many followers does not necessarily mean they have much influence over those people. In contemporary usage, the term frequently refers to a social media influencer, (also known as an online influencer or simply influencer) a person who builds a grassroots online presence through engaging content such as photos, videos, and updates. This is done by using direct audience interaction to establish authenticity, expertise, and appeal, and by standing apart from traditional celebrities by growing their platform through social media rather than pre-existing fame. The modern referent of the term is commonly a paid role in which a business entity pays for the social media influence-for-hire activity to promote its products and services, known as influencer marketing. A 1% increase in spending on influencer marketing can lead to a 0.5% increase in audience engagement. As such, an influencer effectively acts as a modern salesperson or a marketer. Types of influencers include fashion influencer, travel influencer, and virtual influencer, and they involve content creators and streamers. Some influencers are associated primarily with specific social media apps such as TikTok, Instagram, or Pinterest; many influencers are also considered internet celebrities. As of 2023, Instagram is the social media platform businesses spend the most advertising money towards marketing with influencers. However, influencers can have an impact on any social media network. == History == === Origins === The word influencer in its general sense of a person or thing that exerts influence, is attested in historical sources at least since the 17th century. The Oxford English Dictionary (OED) gives 1664 as the earliest example of usage and cites a sentence from Henry More's A Modest Enquiry into the Mystery of Iniquity: "The head and influencer of the whole Church". The origins of online influencing can be traced back to the emergence of digital blogs and platforms in the early 2000s. Nevertheless, recent studies demonstrate that Instagram, an application with more than one billion users, harbors the majority of the influencer demographic. These individuals are sometimes referred to as "Instagrammers" or "Instafamous". A crucial aspect of influencing is their association with sponsors. The 2015 debut of Vamp, a company that links influencers with sponsorships, transformed the landscape of influencing. There is much debate about whether social media influencers can be considered celebrities, as their path to fame is often less traditional and arguably easier. Melody Nouri addressed the differences between the two types in her article "The Power of Influence: Traditional Celebrities vs Social Media Influencer". Nouri asserts that social media platforms have a greater negative impact on young, impressionable audiences in comparison with traditional media such as magazines, billboards, advertisements, and tabloids featuring celebrities. Online, it is thought to be simpler to manipulate an image and lifestyle in such a way that viewers are more susceptible to believing it. One theory considers the former American First Lady Eleanor Roosevelt (1884–1962) to be the "original media influencer." While she achieved celebrity in her role as First Lady, she built a global personal brand as a wise, informative, trustworthy American woman. Her voice was her own, unrestricted by political advisors and powerful men, and with it, Roosevelt exerted unprecedented social and cultural influence in radio, print, public speaking, film, and television until she died. In one notable example, it may have been Roosevelt's television support of John F. Kennedy which nudged his "hairline victory" during the 1960 Presidential campaign. In another example, David Ogilvy paid Roosevelt more than a quarter of a million dollars in today's currency to make a TV commercial for Good Luck margarine (1959), in which Roosevelt also managed to mention world hunger. As a content creator, she wrote My Day, a popular daily newspaper column that ran nationwide for twenty-six years. Like a social media post, My Day covered all aspects of her life, and in it Roosevelt often recommended movies, books, and products that she admired. Roosevelt also had a hand in designing all three of her public affairs television shows. Unlike contemporary influencers, she was less motivated by a pay-to-play situation than by a desire to educate and inspire; but she did use her influence to benefit the entertainment industry careers of her children, and she welcomed the revenue that her influence bought, most of which was donated to charity. === 2000s === The early 2000s showed corporate endeavors to leverage the internet for influence, with some companies participating in forums for promotions or providing bloggers with complimentary products in return for favorable reviews. A few of these practices were viewed as unethical for taking advantage of the labor of young individuals without providing remuneration. In 2004, The Blogstar Network was established by Ted Murphy of MindComet. Bloggers were encouraged to join an email list and receive remunerated offers from corporations in exchange for creating specific posts. For instance, bloggers were compensated for writing reviews of fast-food meals on their blogs. Blogstar is widely regarded as the first influencer marketing network. Murphy succeeded Blogstar with PayPerPost, which was introduced in 2006. This platform compensated significant posters on prominent forums and social media platforms for every post made about a corporate product. Payment rates were determined by the influencer's status. Though very popular, PayPerPost, received a great deal of criticism as these influencers were not required to disclose their involvement with PayPerPost as traditional journalism would have. With the success of PayPerPost, the public became aware that there was a drive for corporate interests to influence what some people were posting to these sites. The platform also incentivized other firms to establish comparable programs. Despite concerns, marketing networks with influencers continued to grow throughout the 2000s and into the 2010s. The influencer marketing industry was worth as much as $8 billion in 2019, according to estimates from Business Insider Intelligence, which are based on Mediakix data. Evan Asano, the Former CEO and founder of the agency Mediakix, previously spoke with Business Insider and said he believed influencer marketing on Instagram would continue to grow despite likes being hidden. === 2010s === By the 2010s, the term "influencer" described digital content creators with a large following, distinctive brand persona, and a patterned relationship with commercial sponsors. By this period, influencer marketing had become a widely researched field globally, with systematic reviews drawing on hundreds of studies that documented the growing role of authenticity, audience engagement, and parasocial relationships in shaping how consumers responded to influencer content across different markets. During this period, influencer culture also developed through distinct channels outside Western markets. In South Korea, the global spread of Korean pop culture, also called K-Pop, through platforms such as YouTube, Facebook, and Twitter gave rise to what scholars have called 'Hallyu 2.0' or the 'New Korean Wave', where fans throughout Southeast Asia, North America, Latin America, and Europe shared, subtitled, and redistributed Korean music and film content on a large scale. This helped Korean entertainers to build substantial followings internationally. Consumers often mistakenly view celebrities as reliable, leading to trust and confidence in the products being promoted. A 2001 study from Rutgers University discovered that individuals were using "internet forums as influential sources of consumer information." The study proposes that consumers preferred internet forums and social media when making purchasing decisions over conventional advertising and print sources. An in

    Read more →
  • Azure Maps

    Azure Maps

    Azure Maps is a suite of cloud-based, location-based services provided by Microsoft as part of the company's Azure platform. The platform provides geospatial and location-based services via REST APIs and software development kits (SDKs). The service is typically used to integrate maps or geospatial data into applications. Azure Maps differs from Microsoft's other enterprise mapping service, Bing Maps, in its pricing model, focus on privacy, and its level of integration into the broader Azure cloud ecosystem. == History == Azure Maps was first introduced in public preview mode under the name "Azure Location Based Services" in 2017, primarily as an enterprise solution. The services was intended to add mapping and location-based functionality onto the existing Azure cloud services suite, seen as a critical part of Microsoft's broader Internet-of-Things (IoT) strategy. The preview version included APIs which could be used to develop location aware apps for use cases such as logistics and mobility. In 2018, the software was renamed "Azure Maps," and became generally available to the public, and a number of new functions were added, including route calculation, travel time calculation, and incorporation of real-time traffic data and incident information. Azure Maps was integrated with Azure IoT Central in 2018, which added tracking, monitoring, and geofencing capabilities. A set of mobility APIs on were added in 2019, with applications such as use in public transport apps and shared bicycle fleet management. “Azure Maps Creator,” which converts private facility floor plans into indoor map data, was also introduced in 2019. Some commentators linked these services to Microsoft's broader development of augmented reality products. In 2020, Azure Maps Visual for Power BI was released, integrating location-based features and mapping capabilities into Microsoft's business intelligence software. An elevation API (which was later retired), geolocation services, and an iOS and Android software development kit were introduced in 2021. In 2022, support for historical weather, air quality, and tropical storm data was made generally available and custom styling for indoor maps was also introduced. In 2023, Azure Maps was certified as HIPAA compliant in a move to target healthcare and health insurance companies. == Functionality == === Geocoding === Geocoding is one of the core functionalities of Azure Maps, converting addresses or place names into geographic coordinates. Batch geocoding is used to process large amounts of address data, a function used for route optimization and spatial analysis. === Reverse geocoding === Reverse geocoding derives human-readable information from geographic coordinates like longitude and latitude, used in navigation and by geographic information systems. === Routing === Azure Maps uses map data and routing algorithms to calculate the shortest or fastest routes between locations based on factors like vehicle size and type, traffic conditions, and distance. Routing also supports multi-modal routing, which include multiple modes of transport in a single trip, including cycling, walking, and ferries. This functionality is used for location-based searches and route optimization in applications like fleet management, proximity marketing, and emergency services as well as logistics and delivery, urban planning, ride sharing apps, and outdoor activities. === Map visualization === The platform supports map visualizations that can be modified to reflect real-time data (including from IoT sensors) as well as historical data patterns. Visualizations include heat maps, street maps, satellite imagery and other custom data layers. Maps are rendered using raster or vector tiles which reduce the load of displaying large data sets or complex maps. This can be used in various applications in areas like transportation, smart cities, retail and marketing, public health, and environmental monitoring. For example, it can be used for tracking the spread of diseases or measuring the impact of changing climatic patterns. === Geofencing and spatial analytics === Azure Maps supports polygonal geofencing, which enables the definition of custom geographic boundaries. Geofenced areas can be monitored in real-time for events of interest. For example, an application could send an alert when equipment or persons enter or leave a defined area. Tools for analyzing historical geofencing data are also available via the APIs for optimization purposes. == Industry usage == Azure Maps' geofencing function has seen usage in the construction industry, designating hazardous areas for safety purposes and sending alerts if anyone enters the area. Private facility maps are used by construction companies for monitoring large construction sites to increase productivity and prevent accidents or damage. In emergency management, New Zealand based company Beca has used Azure Maps to provide analysis on the impact of earthquakes to users, including information on the severity and location of an earthquake and the impact on affected properties. Alaska's Department of Transportation uses Azure Maps as part of an information system providing weather-related warnings and analytics to road crews. Airmap, an airspace management platform for drones, uses Azure Maps. Azure Maps has also been used in conjunction with Azure Monitor for risk monitoring by an insurance company. Other companies that use or have used Azure Maps include BMW, Banco Santander, Jvion, MV Transportation, C.H. Robertson, Wise Skulls, Tata Consultancy Services, Providence Health and Services, Gas Brasiliano Distribuidora S.A., Shell plc, Persistent Systems, Phase 2 Dining and Entertainment, Symbio, HID, Globant, and Insight Enterprises. == Partnerships == Azure Maps and TomTom have been partners since 2016, and TomTom provides location data to Azure Maps and can process data from Azure Maps for mapping purposes. In 2021, Azure Maps partnered with AccuWeather to make climatic data available via its APIs, making weather data along all parts of calculated routes available for mobility and logistics purposes. Microsoft has partnered with Esri, the developer of ArcGIS, and there is cross-compatibility between Azure and ArcGIS so that data from Azure Maps can be integrated into ArcGIS and vice versa. Azure Maps partnered with Moovit in 2019, a startup providing software that interfaces with public transport data. Moovit's database on global public transit networks, including information on which stations and facilities are wheelchair accessible, was linked to Azure Maps. This service was noted for its use increasing accessibility to public transport for the visually impaired by means of voice activated route planning assistance. NORAD has used some Azure Maps functions for their NORAD Tracks Santa website during Christmas holidays. == Components == === REST APIs === Various APIs cover the major functionalities across Azure Maps: Data registry API Geolocation API Render API Route API Search API Spatial API Time zone API Traffic API Weather API === SDKs === Azure Maps SDKs uses MapLibre-style specifications and open source MapLibre GL-based libraries as a rendering engine. The Web SDK is used for developing web apps with maps and location-based data and functionality. It includes a map control module as well as modules with drawing tools. It also supports Azure Maps Creator and various spatial data formats. The platform also includes a set of REST SDKs for developers integrating Azure Maps REST APIs into Python, C#, Java or JavaScript applications. Azure Maps also includes Android and iOS SDKs used for developing applications for Android and Apple devices. === Azure Maps Creator === Azure Maps Creator is a tool for generating custom maps for locations like large office complexes, construction sites, or university campuses. These maps can then be integrated into applications and used with other Azure Maps functions for purposes such as wayfinding and maintenance and security in building automation contexts. === Azure Maps Visual for Power BI === Azure Maps is integrated with Microsoft Power BI, a graphical tool for producing data visualizations. Since July 2020, Power BI can be used in conjunction with Azure Maps for developing map-based data visualizations. This functionality entered general availability in May 2023.

    Read more →
  • Data deduplication

    Data deduplication

    In computing, data deduplication is a technique for eliminating duplicate copies of repeating data. Successful implementation of the technique can improve storage utilization, which may in turn lower capital expenditure by reducing the overall amount of storage media required to meet storage capacity needs. It can also be applied to network data transfers to reduce the number of bytes that must be sent. The deduplication process requires comparison of data 'chunks' (also known as 'byte patterns') which are unique, contiguous blocks of data. These chunks are identified and stored during a process of analysis, and compared to other chunks within existing data. Whenever a match occurs, the redundant chunk is replaced with a small reference that points to the stored chunk. Given that the same byte pattern may occur dozens, hundreds, or even thousands of times (the match frequency is dependent on the chunk size), the amount of data that must be stored or transferred can be greatly reduced. A related technique is single-instance (data) storage, which replaces multiple copies of content at the whole-file level with a single shared copy. While possible to combine this with other forms of data compression and deduplication, it is distinct from newer approaches to data deduplication (which can operate at the segment or sub-block level). Deduplication is different from data compression algorithms, such as LZ77 and LZ78. Whereas compression algorithms identify redundant data inside individual files and encodes this redundant data more efficiently, the intent of deduplication is to inspect large volumes of data and identify large sections – such as entire files or large sections of files – that are identical, and replace them with a shared copy. == Functioning principle == For example, a typical email system might contain 100 instances of the same 1 MB (megabyte) file attachment. Each time the email platform is backed up, all 100 instances of the attachment are saved, requiring 100 MB storage space. With data deduplication, only one instance of the attachment is actually stored; the subsequent instances are referenced back to the saved copy for deduplication ratio of roughly 100 to 1. Deduplication is often paired with data compression for additional storage saving: Deduplication is first used to eliminate large chunks of repetitive data, and compression is then used to efficiently encode each of the stored chunks. In computer code, deduplication is done by, for example, storing information in variables so that they don't have to be written out individually but can be changed all at once at a central referenced location. Examples are CSS classes and named references in MediaWiki. == Benefits == Storage-based data deduplication reduces the amount of storage needed for a given set of files. It is most effective in applications where many copies of very similar or even identical data are stored on a single disk. In the case of data backups, which routinely are performed to protect against data loss, most data in a given backup remain unchanged from the previous backup. Common backup systems try to exploit this by omitting (or hard linking) files that haven't changed or storing differences between files. Neither approach captures all redundancies, however. Hard-linking does not help with large files that have only changed in small ways, such as an email database; differences only find redundancies in adjacent versions of a single file (consider a section that was deleted and later added in again, or a logo image included in many documents). In-line network data deduplication is used to reduce the number of bytes that must be transferred between endpoints, which can reduce the amount of bandwidth required. See WAN optimization for more information. Virtual servers and virtual desktops benefit from deduplication because it allows nominally separate system files for each virtual machine to be coalesced into a single storage space. At the same time, if a given virtual machine customizes a file, deduplication will not change the files on the other virtual machines—something that alternatives like hard links or shared disks do not offer. Backing up or making duplicate copies of virtual environments is similarly improved. == Classification == === Post-process versus in-line deduplication === Deduplication may occur "in-line", as data is flowing, or "post-process" after it has been written. With post-process deduplication, new data is first stored on the storage device and then a process at a later time will analyze the data looking for duplication. The benefit is that there is no need to wait for the hash calculations and lookup to be completed before storing the data, thereby ensuring that store performance is not degraded. Implementations offering policy-based operation can give users the ability to defer optimization on "active" files, or to process files based on type and location. One potential drawback is that duplicate data may be unnecessarily stored for a short time, which can be problematic if the system is nearing full capacity. Alternatively, deduplication hash calculations can be done in-line: synchronized as data enters the target device. If the storage system identifies a block which it has already stored, only a reference to the existing block is stored, rather than the whole new block. The advantage of in-line deduplication over post-process deduplication is that it requires less storage and network traffic, since duplicate data is never stored or transferred. On the negative side, hash calculations may be computationally expensive, thereby reducing the storage throughput. However, certain vendors with in-line deduplication have demonstrated equipment which performs in-line deduplication at high rates. Post-process and in-line deduplication methods are often heavily debated. === Data formats === The SNIA Dictionary identifies two methods: Content-agnostic data deduplication – a data deduplication method that does not require awareness of specific application data formats. Content-aware data deduplication – a data deduplication method that leverages knowledge of specific application data formats. === Source versus target deduplication === Another way to classify data deduplication methods is according to where they occur. Deduplication occurring close to where data is created, is referred to as "source deduplication". When it occurs near where the data is stored, it is called "target deduplication". Source deduplication ensures that data on the data source is deduplicated. This generally takes place directly within a file system. The file system will periodically scan new files creating hashes and compare them to hashes of existing files. When files with same hashes are found then the file copy is removed and the new file points to the old file. Unlike hard links however, duplicated files are considered to be separate entities and if one of the duplicated files is later modified, then using a system called copy-on-write a copy of that changed file or block is created. The deduplication process is transparent to the users and backup applications. Backing up a deduplicated file system will often cause duplication to occur resulting in the backups being bigger than the source data. Source deduplication can be declared explicitly for copying operations, as no calculation is needed to know that the copied data is in need of deduplication. This leads to a new form of link on file systems, called a reference-counted link, or reflink, in some systems (e.g. Linux), or a cloned file on macOS, where one or more inodes (file information entries) are made to share some or all of their data. It is named analogously to hard links, which work at the inode level, and symbolic links, which work at the filename level.The individual entries have a copy-on-write behavior that is non-aliasing, i.e. changing one copy afterwards will not affect other copies. Microsoft's ReFS also supports this operation. Target deduplication is the process of removing duplicates when the data was not generated at that location. Example of this would be a server connected to a SAN/NAS, The SAN/NAS would be a target for the server (target deduplication). The server is not aware of any deduplication, the server is also the point of data generation. A second example would be backup. Generally this will be a backup store such as a data repository or a virtual tape library. === Deduplication methods === One of the most common forms of data deduplication implementations works by comparing chunks of data to detect duplicates. For that to happen, each chunk of data is assigned an identification, calculated by the software, typically using cryptographic hash functions. In many implementations, the assumption is made that if the identification is identical, the data is identical, even though this cannot be true in all cases due to the pigeonhole principle; other implementations do not as

    Read more →
  • IBM remote batch terminals

    IBM remote batch terminals

    The IBM 2780 and the IBM 3780 are devices developed by IBM for performing remote job entry (RJE) and other batch functions over telephone lines; they communicate with the mainframe via Binary Synchronous Communications (BSC or Bisync) and replaced older terminals using synchronous transmit-receive (STR). In addition, IBM has developed workstation programs for the 1130, 360/20, 2922, System/360 other than 360/20, System/370 and System/3. == 2780 Data Transmission Terminal == The 2780 Data Transmission Terminal first shipped in 1967. It consists of: A line printer similar to the IBM 1443 that can print up to 240 lines per minute (lpm), or 300 lpm using an extremely restricted character set. A card reader/punch unit, similar to an IBM 1442, that can read up to 400 cards per minute (cpm) and can punch up to 355 cpm. A line buffer that stores data received or to be transmitted over the communications line. A binary synchronous adapter which controls the flow of data over the communications line. The 2780 is capable of local (offline) card to print operation. It comes in four models: Model 1: Can read punched cards and transmit the data to a remote host computer, and can receive and print data sent by the host. Model 2: Same as Model 1 but adds the ability to punch card data received from the host. Model 3: Can only print data received from the host, but not send data to it. Model 4: Can read and punch card data, but has no printing capabilities. The 2780 uses a dedicated communication line at speeds of 1200, 2000, 2400 or 4800 bits per second. It is a half duplex device, although full duplex lines can be used with some increase in throughput. It can communicate in Transcode (a 6-bit code), 8-bit EBCDIC, or 7-bit ASCII. == 2770 Data Communication System == The 2770, announced in 1969, "was said to surpass all other IBM terminals in the variety of available input-output devices." The 2770 was developed by the IBM General Products Division (GPD) in Rochester, MN. It comes standard with a desktop terminal with keyboard. The printer and other devices (any two in any combination) can be attached to the 2772 Multi-Purpose Control unit. Possible devices include: 50 Magnetic Data Inscriber 545 Card Punch Model 3 (non-printing) or Model 4 (printing) 1017 Paper Tape Reader 1018 Paper Tape Punch 1053 Printer Model 1 1255 Magnetic Character Reader Models 1, 2 or 3 2203 Printer Model A1 or A2 2213 Printer Model 1 or 2 2265 Display Station Model 2 2502 Card Reader Model A1 or A2 5496 Data Recorder == 3780 Data Communications Terminal == In May 1972, IBM announced the IBM 3780, an enhanced version of the 2780. The 3780 was developed by IBM's Data Processing Division (DPD). There is one model, with an optional card punch. The 3780 drops Transcode support and incorporates several performance enhancements. It supports compression of blank fields in data using run-length encoding. It provides the ability to interleave data between devices, introduces double buffering, and adds support for the Wait-before-transmit ACKnowledgement (WACK) and Temporary Text Delay (TTD) Binary Synchronous control characters. The integrated punched card unit can read cards at 600 cards per minute. The integrated printer is rated at 300, 350 or 425 lines per minute based on characters set (63, 52 or 39 characters). The 3781 Card Punch is an optional feature. It punches 160 columns per second, or 91 cards per minute if all 80 columns are punched. The IBM 2780 and 3780 were later emulated on various types of equipment, including eventually the personal computer. A notable early emulation was the DN60, by Digital Equipment Corporation in the late 1970s. == 3770 Data Communications System == In 1974 IBM Data Processing Division (DPD) offered a successor to the 3780, called the 3770 Data Communications System, supporting SDLC, BSC, BSC Multi-leaving and SNA, depending on the configuration. The 3770 is a family of desk console style terminals that offers a variety of keyboard and printer combinations as well as I/O equipment attachment and communications features. The terminals come built into a desk and include the following models: 3771 Communication Terminal (optional card reader, optional card punch, wire matrix printer) Models 1 (40 cps printer), 2 (80 cps printer), and 3 (120 cps printer). 3773 Communication Terminal (diskette, wire matrix printer) Models 1 (40 cps printer), 2 (80 cps printer), and 3 (120 cps printer). Each model has a P version which adds some programming features. 3774 Communication Terminal (optional card reader, optional card punch, optional belt printer, wire matrix printer) Models 1 (80 cps printer), and 2 (120 cps printer). Each model has a P version which adds some programming features, a 480-character display and a non-removable diskette. 3775 Communication Terminal (optional card reader, optional card punch, optional diskette, belt printer) Model 1 (120 lpm printer). The model P1 adds some programming features, a 480-character display and a non-removable diskette. 3776 Communication Terminal (optional card reader, optional card punch, optional diskette, belt printer) Models 1 (300 lpm printer) and 2 (400 lpm printer). Models 3 and 4 are similar to models 1 and 2. 3777 Communication Terminal (optional card reader, optional diskette, train printer) Model 1 (up to 1000 lpm printer depending on character set). Model 2 adds an optional card punch, model 3 adds an optional magnetic tape drive and model 4 replaces the train printer with a slower model called the IBM 3262. The model 4 also allows a second, optional, 3262. The following I/O devices can be attached to a 3770 terminal: IBM 2502 Card Reader: Models A1 (up to 150 card per minute), A2 (up to 300 cards per minute) or A3 (up to 400 cards per minute) IBM 3203 Printer Model 3: 1000 LPM using 48 character set IBM 3501 Card Reader: Up to 50 cards per minute desktop unit IBM 3521 Card Punch: Up to 50 cards per minute IBM 3782 Card Attachment unit, which allows the 2502 or 3521 to be attached to any terminal except the 3777 IBM 3784 Line Printer, can be attached to a 3774 as a second printer. Up to 155 LPM with 48 characters set print belt. == Workstation programs == IBM distributes workstation programs with systems software including OS/360 Attached Support Processor (ASP) Houston Automatic Spooling Priority (HASP and HASP II) Operating System/Virtual Storage 1 (OS/VS1) Operating System/Virtual Storage 2 (OS/VS2 MVS) Release 2 through 3.8 MVS versions from MVS/SP Version 1 through z/OS Priority Output Writers, Execution processors and input Readers (POWER) Remote Spooling Communications Subsystem (RSCS) Except for the RJE workstation programs in OS/360, these programs use a variation of BSC known as Multi-leaving. In addition, IBM provides separately ordered workstation programs using BSC. Systems Network Architecture (SNA) and TCP/IP. Workstation programs are available from IBM and third-party vendors to support all of these protocols: 2770/3770 2780/3780 Multileaving Network Job Entry (NJE) OS/360 RJE SNA TCP/IP

    Read more →
  • Classora

    Classora

    Classora is a knowledge base for the Internet oriented to data analysis. From a practical point of view, Classora is a digital repository that stores structured information and allows it to be displayed in multiple formats: analytically, graphically, geographically (through maps); as well as carry out OLAP analysis. The information contained in Classora comes from public sources and is uploaded into the system through bots and ETL processes. The Knowledge Base has a commercial API for semantic enhancement, and an open web through which any user can access to part of the information collected (it also allows users to complete data and share opinions). Internally, Classora is organized into Knowledge Units and Reports. A «Knowledge Unit» is any element of the World about which information may be stored and presented in the form of a data sheet (a person, a company, a country, etc.) A «Report» is a group of Knowledge Units: a ranking of companies, a sport classification table, a survey about people, etc. In fact, one of the technical capabilities of Classora is that it allows the comparison of reports and knowledge units gathered from different sources, thereby generating an added value for the media in which this information is published: digital media, interactive TV, etc. == Key definitions == === Knowledge unit === The units of knowledge (also known as entries) in Classora are data sheets that have a certain semantic equivalence with the articles on the Wikipedia: they store information about any element of the world, be it a film, a country, a company or an animal. However, they differ from Wikipedia in that Classora stores structured information, enriched with a metadata layer; and therefore it is able to automatically interpret the meaning of each unit of knowledge. === Data report === A report is a group of units of knowledge in which the repetition of elements is not allowed. This definition includes any list, poll, ranking, etc.; and, in general, any consultation that involves more than one unit of knowledge. Classora excels at the reports management due to its visualization capabilities, being able to display data in the form of tables, graphs and maps. Types of reports: Sports scores: Sports competitions results sanctioned by the competent institution. Rankings and lists: All types of interesting and curious lists, whether they have an implicit order or not. Polls: Units of knowledge that are ranked according to users’ votes. Queries to the Knowledge Base: Questions from users using CQL. Networks of connections: automatically calculated from the reports and the taxonomy of each Knowledge Unit. === Organizational taxonomy === An organizational taxonomy (also referred to as entry type) is a data sheet that brings together the common attributes of a set of units of knowledge. For instance, the organizational taxonomy F1 Driver displays attributes such as date of debut, team, etc.; and the organizational taxonomy Football Club presents attributes such as city, stadium, etc. In Classora, taxonomies are hierarchically organized, so that they inherit attributes from their parent taxonomies. For instance, F1 Driver is a subsidiary taxonomy of Sportsperson, which is a subsidiary taxonomy of Person, which in turn is a subsidiary taxonomy of Organism. The simplest type of entry in Classora is Classora Object. All the other taxonomies are its subsidiaries and inherit its attributes. In fact, the only attribute Classora Object possesses is name (all units of knowledge are required to have one name at least). == Architecture of Classora == === Data Extraction Module === The Data Extraction Module consists of a set of robots coordinated by software that also manages the potential incidents. Most of the information available in Classora is automatically uploaded through those robots, which connect to the main online public sources to gather all types of data. There are three categories of robots: Extraction robots: responsible for the massive uploading of reports from official public sources (FIFA, CIA, IMF, Eurostat...). They are used for either absolute or incremental data uploading. Data scanner robots: responsible for looking for and updating the data of a unit of knowledge. They use specific sources to perform this task: Wikipedia, IMDB, World Bank, etc. Content aggregators: they don’t connect to external sources. Instead, they generate new information using Classora’s internal database. === Participatory Module === In Classora’s Open Website, Internet users may participate providing their knowledge as they would on the Wikipedia. There are different ways to participate: adding or correcting data in the Knowledge Base, voting in surveys (participatory rankings) and creating new Knowledge Units and Data Reports. === Connectivity Module === The Knowledge Base is designed to be embedded in multi-platform, multi-channel systems, thus enabling its integration into mobile devices, tablets, interactive TV, etc. This integration may be carried out through specific plugins (for navigators or other devices) or an API REST that provides content in XML or JSON formats. The API is divided into three blocks of operations. The first one is the block of general utility tools (ranging from autosuggest components about geographical hierarchies to operations to obtain the list of today’s celebrity birthdays, using CQL). The second one is the block of operations for widget generation (graphs, maps, rankings) using information from the knowledge base. Finally, there is a block of operations designed for the publication of free-source content. == Project statistics == As of April 2012, 2,000,000 Knowledge Units, 15,000 Reports, around 10,000 Maps and several million potential Comparative Analyses had been added to Classora. According to the site of web metrics Alexa, Classora Open Website is ranked at 100,557 globally and at 2,880 in the Spanish traffic ranking. Users spend an average of 9 ½ minutes in Classora.

    Read more →
  • Microsoft Teams

    Microsoft Teams

    Microsoft Teams is a team collaboration platform developed by Microsoft as part of the Microsoft 365 suite. It offers features such as workspace chat, video conferencing, file storage, and integration with both Microsoft and third-party applications and services. Teams gradually replaced earlier Microsoft messaging and collaboration platforms, including Skype for Business, Skype, Flip, and Microsoft Classroom. The platform saw significant growth during the COVID-19 pandemic, alongside competitors such as Zoom, Slack, and Google Meet, as organizations shifted to remote work and virtual meetings. As of January 2023, Microsoft reported approximately 280 million monthly active users. == History == On August 29, 2007, Microsoft acquired Parlano, the developer of the persistent group chat tool MindAlign. Years later, on March 4, 2016, Microsoft considered acquiring Slack for $8 billion. However, the proposal was reportedly opposed by Bill Gates, who advocated for focusing on enhancing Skype for Business instead. Lu Qi, then executive vice president of Applications and Services, had led the initiative to pursue the Slack acquisition. Following Lu's departure later that year, Microsoft announced Microsoft Teams on November 2, 2016, at an event in New York City, positioning it as a direct competitor to Slack. Teams launched worldwide on March 14, 2017. The service was initially led by corporate vice president Brian MacDonald. In response to the launch, Slack published a full-page advertisement in The New York Times welcoming the competition and outlining its product philosophy. Although Slack was used by 28 companies in the Fortune 100, The Verge wrote that executives would question paying for the service if Teams provides a similar function in their company's existing Office 365 subscription. However, ZDNET noted that the platforms initially served different markets, as Teams did not support external users, making it less appealing to small businesses and freelancers, a limitation Microsoft later addressed. In response to Teams' announcement, Slack deepened in-product integration with Google services. In May 2017, Microsoft announced that Teams would replace Microsoft Classroom in Office 365 Education. A free version of Teams was released on July 12, 2018, offering most core features at no cost, albeit with limits on users and storage. In January 2019, Microsoft introduced updates targeting "Firstline Workers" to improve Teams’ performance across shared or limited-access devices. In September 2019, Microsoft announced the retirement of Skype for Business in favor of Teams, which took effect on July 31, 2021. In early 2020, Microsoft introduced a push-to-talk "Walkie Talkie" feature aimed at firstline workers using smartphones and tablets over Wi-Fi or cellular networks. The COVID-19 pandemic significantly boosted usage of Teams. On March 19, 2020, Microsoft reported 44 million daily active users. In April, the platform logged 4.1 billion meeting minutes in a single day. A public preview of Microsoft Teams for Linux was released in December 2019, but the Linux client was discontinued in 2022. In July 2020, Microsoft shut down its video game livestreaming platform Mixer, and announced that some of its technologies would be repurposed for use in Teams. On February 28, 2025, Microsoft announced that Skype would be fully retired on May 5, 2025, with users given options to export their data or transition to Microsoft Teams. In October 2025, together with other Microsoft 365 suite apps, Teams had its logo updated. == Usage == == Underlying software == Microsoft Teams, as part of the Microsoft 365 suite, utilizes SharePoint and Exchange Online. Each Team, Shared Channel, and Private Channel has its own Microsoft 365 Group and SharePoint Site used for file storage. Messages are stored in Cosmos DB and are journaled to Exchange Online mailboxes. Private messages, including messages in Private Channels, are journaled to the sender and recipients' mailboxes. Public Channel messages are journaled to their corresponding Team's group mailbox, whereas, messages from Shared Channels are journaled to their own mailboxes. Contacts and voicemail are stored in Exchange Online. Microsoft Teams client is a web-based desktop app, originally developed on top of the Electron framework which combines the Chromium rendering engine and the Node.js JavaScript platform. Version 2.0 client was rebuilt using the Evergreen version of Microsoft Edge WebView2 in place of Electron. == Features == === Chats === Teams allows users to communicate in two-way persistent chats with one or multiple participants. Participants can message using text, emojis, stickers and gifs, as well as sharing links and files. In August 2022, the chat feature was updated for "chat with yourself"; allowing for the organization of files, notes, comments, images, and videos within a private chat tab. === Teams === Teams allows communities, groups, or teams to contribute in a shared workspace where messages and digital content on a specific topic are shared. Team members can join through an invitation sent by a team administrator or owner or sharing of a specific URL. Teams for Education allows admins and teachers to set up groups for classes, professional learning communities (PLCs), staff members, and everyone. === Channels === Channels allow team members to communicate without the use of email or group SMS (texting). Users can reply to posts with text, images, GIFs, and image macros. Direct messages send private messages to designated users rather than the entire channel. Connectors can be used within a channel to submit information contacted through a third-party service. Connectors include Mailchimp, Facebook Pages, Twitter, Power BI and Bing News. === Group conversations === Ad-hoc groups can be created to share instant messaging, audio calls (VoIP), and video calls inside the client software. === Telephone replacement === A feature on one of the higher cost licencing tiers allows connectivity to the public switched telephone network (PSTN) telephone system. This allows users to use Teams as if it were a telephone, making and receiving calls over the PSTN, including the ability to host "conference calls" with multiple participants. === Meeting === Meetings can be scheduled with multiple participants able to share audio, video, chat and presented content with all participants. Multiple users can connect via a meeting link. Automated minutes are possible using the recording and transcript features. Teams has a plugin for Microsoft Outlook to schedule a Teams Meeting in Outlook for a specific date and time and invite others to attend. If a meeting is scheduled within a channel, users visiting the channel are able to see if a meeting is in progress. ==== Teams Live Events ==== Teams Live Events replaces Skype Meeting Broadcast for users to broadcast to 10,000 participants on Teams, Yammer, or Microsoft Stream. ==== Breakout Rooms ==== Breakout rooms split a meeting into small groups. This is often utilized for collaboration during trainings or any environment where having all participants speak at once could be disruptive or unfeasible. Breakout rooms can be set by the hosts to a certain length of time, after which all participants will automatically rejoin the main meeting room. ==== Front Row ==== Front Row adjusts the layout of the viewer's screen, placing the speaker or content in the center of the gallery with other meeting participant's video feeds reduced in size and located below the speaker. === Education === Microsoft Teams for Education allows teachers to distribute, provide feedback, and grade student assignments turned in via Teams using the Assignments tab through Office 365 for Education subscribers. Quizzes can also be assigned to students through an integration with Office Forms. === Protocols === Microsoft Teams is based on a number of Microsoft-specific protocols. Video conferences are realized over the protocol MNP24, known from the Skype consumer version. VoIP and video conference clients based on SIP and H.323 need special gateways to connect to Microsoft Teams servers. With the help of Interactive Connectivity Establishment (ICE), clients behind Network address translation routers and restrictive firewalls are also able to connect, if peer-to-peer is not possible. === Integrations === Microsoft Teams has integrations through Microsoft AppSource, its integration marketplace. In 2020, Microsoft partnered with KUDO, a cloud-based solution with language interpretation, to allow integrated language meeting controls. In June 2022, an update was released using AI to improve call audio through the elimination of background feedback loops and cancelling non-vocal audio. == Anti-trust controversy == In July 2023, the European Commission opened an anti-trust investigation into the possibility that Microsoft unfairly used its office suite market power to increase sales of Teams and hurt

    Read more →
  • Data set (IBM mainframe)

    Data set (IBM mainframe)

    In the context of IBM mainframe computers in the IBM System/360 line and its successors, a data set (IBM preferred) or dataset is a computer file having a record organization. Use of this term began with, e.g., DOS/360 and OS/360, and is still used by their successors, including the current VSE and z/OS. Documentation for these systems historically preferred this term rather than file. A data set is typically stored on a direct access storage device (DASD) or magnetic tape, however unit record devices, such as punch card readers, card punches, line printers and page printers can provide input/output (I/O) for a data set (file). Data sets are not unstructured streams of bytes, but rather are organized in various logical record and block structures determined by the DSORG (data set organization), RECFM (record format), and other parameters. These parameters are specified at the time of the data set allocation (creation), for example with Job Control Language DD statements. Within a running program they are stored in the Data Control Block (DCB) or Access Control Block (ACB), which are data structures used to access data sets using access methods. Records in a data set may be fixed, variable, or “undefined” length. == Data set organization == For OS/360, the DCB's DSORG parameter specifies how the data set is organized. It may be CQ Queued Telecommunications Access Method (QTAM) in Message Control Program (MCP) CX Communications line group DA Basic Direct Access Method (BDAM) GS Graphics device for Graphics Access Method(GAM) IS Indexed Sequential Access Method (ISAM) MQ QTAM message queue in application PO Partitioned Organization PS Physical Sequential among others. Data sets on tape may only be DSORG=PS. The choice of organization depends on how the data is to be accessed, and in particular, how it is to be updated. Programmers utilize various access methods (such as QSAM or VSAM) in programs for reading and writing data sets. Access method depends on the given data set organization. == Record format (RECFM) == Regardless of organization, the physical structure of each record is essentially the same, and is uniform throughout the data set. This is specified in the DCB RECFM parameter. RECFM=F means that the records are of fixed length, specified via the LRECL parameter. RECFM=V specifies a variable-length record. V records when stored on media are prefixed by a Record Descriptor Word (RDW) containing the integer length of the record in bytes and flag bits. With RECFM=FB and RECFM=VB, multiple logical records are grouped together into a single physical block on tape or DASD. FB and VB are fixed-blocked, and variable-blocked, respectively. RECFM=U (undefined) is also variable length, but the length of the record is determined by the length of the block rather than by a control field. The BLKSIZE parameter specifies the maximum length of the block. RECFM=FBS could be also specified, meaning fixed-blocked standard, meaning all the blocks except the last one were required to be in full BLKSIZE length. RECFM=VBS, or variable-blocked spanned, means a logical record could be spanned across two or more blocks, with flags in the RDW indicating whether a record segment is continued into the next block and/or was continued from the previous one. This mechanism eliminates the need for using any "delimiter" byte value to separate records. Thus data can be of any type, including binary integers, floating-point, or characters, without introducing a false end-of-record condition. The data set is an abstraction of a collection of records, in contrast to files as unstructured streams of bytes. == Partitioned data set == A partitioned data set (PDS) is a data set containing multiple members, each of which holds a separate sub-data set, similar to a directory in other types of file systems. This type of data set is often used to hold load modules (old format bound executable programs), source program libraries (especially Assembler macro definitions), ISPF screen definitions, and Job Control Language. A PDS may be compared to a Zip file or COM Structured Storage. A Partitioned Data Set can only be allocated on a single volume and have a maximum size of 65,535 tracks. Besides members, a PDS contains also a directory. Each member can be accessed indirectly via the directory structure. Once a member is located, the data stored in that member are handled in the same manner as a PS (sequential) data set. Whenever a member is deleted, the space it occupied is unusable for storing other data. Likewise, if a member is re-written, it is stored in a new spot at the back of the PDS and leaves wasted “dead” space in the middle. The only way to recover “dead” space is to perform file compression. Compression, which is done using the IEBCOPY utility, moves all members to the front of the data space and leaves free usable space at the back. (Note that in modern parlance, this kind of operation might be called defragmentation or garbage collection; data compression nowadays refers to a different, more complicated concept.) PDS files can only reside on DASD, not on magnetic tape, in order to use the directory structure to access individual members. Partitioned data sets are most often used for storing multiple job control language files, utility control statements, and executable modules. An improvement of this scheme is a Partitioned Data Set Extended (PDSE or PDS/E, sometimes just libraries) introduced with DFSMSdfp for MVS/XA and MVS/ESA systems. A PDS/E library can store program objects or other types of members, but not both. BPAM cannot process a PDS/E containing program objects. PDS/E structure is similar to PDS and is used to store the same types of data. However, PDS/E files have a better directory structure which does not require pre-allocation of directory blocks when the PDS/E is defined (and therefore does not run out of directory blocks if not enough were specified). Also, PDS/E automatically stores members in such a way that compression operation is not needed to reclaim "dead" space. PDS/E files can only reside on DASD in order to use the directory structure to access individual members. == Generation Data Group == A Generation Data Group (GDG) is a group of non-VSAM data sets that are successive generations of historically-related data stored on an IBM mainframe (running OS/360 and its successors or DOS/360 and its successors). A GDG is usually cataloged. An individual member of the GDG collection is called a "Generation Data Set." The latter may be identified by an absolute number, ACCTG.OURGDG(1234), or a relative number: (-1) for the previous generation, (0) for the current one, and (+1) the next generation. A GDG specifies how many generations of a data set are to be kept and at what age a generation will be deleted. Whenever a new generation is created, the system checks whether one or more obsolete generations are to be deleted. The purpose of GDGs is to automate archival, using the command language JCL, the data set name given is generic. When DSN appears, the GDG data set appears along with the history number, where (0) is the most recent version (-1), (-2), ... are previous generations (+1) a new generation (see DD) Another use of GDGs is to be able to address all generations simultaneously within a JCL script without having to know the number of currently available generations. To do this, you have to omit the parentheses and the generation number in the JCL when specifying the dataset. === GDG JCL & features === Generation Data Groups are defined using either the BLDG statement of the IEHPROGM utility or the DEFINE GENERATIONGROUP statement of the newer IDCAMS utility, which allows setting various parameters. LIMIT(10) would limit the number of generations limit to 10. SCRATCH FOR (91) would retain each member, up to the limited#generations, at least 91 days. IDCAMS can also delete (and optionally uncatalog) a GDG. ==== Example ==== Creation of a standard GDG for five safety scopes, each at least 35 days old: Delete a standard GDG:

    Read more →
  • Social media use in hiring

    Social media use in hiring

    Social media use in hiring refers to the examination by employers of job applicants' (public) social media profiles as part of the hiring assessment. For example, the vast majority of Fortune 500 companies use social media as a tool to screen prospective employees and as a tool for talent acquisition. This practice raises ethical questions. Employers and recruiters note that they have access only to information that applicants choose to make public. Many Western-European countries restrict employer's use of social media in the workplace. States including Arkansas, California, Colorado, Illinois, Maryland, Michigan, Nevada, New Jersey, New Mexico, Utah, Washington, and Wisconsin protect applicants and employees from surrendering usernames and passwords for social media accounts. Use of social media has caused significant problems for some applicants who are active on social media. A 2013 survey of 17,000 young people in six countries found that one in ten people aged 16 to 34 claimed to have been rejected for a job because of social media activity. Social media services have been reported to affect deception in resumes. While these services do not affect deception frequency, it does increase deception about interests and hobbies. == Ethical implications == This issue raises many ethical questions that some consider an employer's right and others consider discrimination. As of 2016, except in the states of California, Maryland, and Illinois, there are no laws that prohibit employers from using social media profiles as a basis of whether or not someone should be hired. Title VII also prohibits discrimination during any aspect of employment including hiring or firing, recruitment, or testing. Social media has been integrating into the workplace, and this has led to conflicts within employees and employers.[107] Particularly, Facebook has been seen as a popular platform for employers to investigate in order to learn more about potential employees. This conflict first started in Maryland when an employer requested and received an employee's Facebook username and password. State lawmakers first introduced legislation in 2012 to prohibit employers from requesting passwords to personal social accounts in order to get a job or to keep a job. This led to Canada, Germany, the U.S. Congress and 11 U.S. states to pass or propose legislation that prevents employers' access to private social accounts of employees.[108] Many Western European countries have already implemented laws that restrict the regulation of social media in the workplace. States including Arkansas, California, Colorado, Illinois, Maryland, Michigan, Nevada, New Jersey, New Mexico, Utah, Washington, and Wisconsin have passed legislation that protects potential employees and current employees from employers that demand them to give forth their username or password for a social media account. Laws that forbid employers from disciplining an employee based on activity off the job on social media sites have also been put into act in states including California, Colorado, Connecticut, North Dakota, and New York. Several states have similar laws that protect students in colleges and universities from having to grant access to their social media accounts. Eight states have passed the law that prohibits post secondary institutions from demanding social media login information from any prospective or current students and privacy legislation has been introduced or is pending in at least 36 states as of July 2013. As of May 2014, legislation has been introduced and is in the process of pending in at least 28 states and has been enacted in Maine and Wisconsin. In addition, the National Labor Relations Board has been devoting a lot of their attention to attacking employer policies regarding social media that can discipline employees who seek to speak and post freely on social media sites. Use of social media by young people has caused significant problems for some applicants who are active on social media when they try to enter the job market. A survey of 17,000 young people in six countries in 2013 found that 1 in 10 people aged 16 to 34 have been rejected for a job because of online comments they made on social media websites. A 2014 survey of recruiters found that 93% of them check candidates' social media postings. Moreover, professor Stijn Baert of Ghent University conducted a field experiment in which fictitious job candidates applied for real job vacancies in Belgium. They were identical except in one respect: their Facebook profile photos. It was found that candidates with the most wholesome photos were a lot more likely to receive invitations for job interviews than those with the more controversial photos. In addition, Facebook profile photos had a greater impact on hiring decisions when candidates were highly educated. These cases have created some privacy implications as to whether or not companies should have the right to look at employee's Facebook profiles. In March 2012, Facebook decided they might take legal action against employers for gaining access to employee's profiles through their passwords. According to Facebook Chief Privacy Officer for policy, Erin Egan, the company has worked hard to give its users the tools to control who sees their information. He also said users shouldn't be forced to share private information and communications just to get a job. According to the network's Statement of Rights and Responsibilities, sharing or soliciting a password is a violation of Facebook policy. Employees may still give their password information out to get a job, but according to Erin Egan, Facebook will continue to do their part to protect the privacy and security of their users. == Impacts == Use of social media by young people has caused significant problems for some applicants who are active on social media when they try to enter the job market. A survey of 17,000 young people in six countries in 2013 found that 1 in 10 people aged 16 to 34 have been rejected for a job because of online comments they made on social media websites. A 2014 survey of recruiters found that 93% of them check candidates' social media postings. Moreover, in 2015 professor Stijn Baert of Ghent University conducted a field experiment in which fictitious job candidates applied for real job vacancies in Belgium. They were identical except in one respect: their Facebook profile photos. It was found that candidates with the most wholesome photos were a lot more likely to receive invitations for job interviews than those with the more controversial photos. In addition, Facebook profile photos had a greater impact on hiring decisions when candidates were highly educated. These cases have created some privacy implications as to whether or not companies should have the right to look at employee's Facebook profiles. In March 2012, Facebook decided they might take legal action against employers for gaining access to employee's profiles through their passwords. According to Facebook Chief Privacy Officer for policy, Erin Egan, the company has worked hard to give its users the tools to control who sees their information. He also said users shouldn't be forced to share private information and communications just to get a job. According to the network's Statement of Rights and Responsibilities, sharing or soliciting a password is a violation of Facebook policy. Employees may still give their password information out to get a job, but according to Erin Egan, Facebook will continue to do their part to protect the privacy and security of their users. == Policy Responses == 26 US states now have laws against an employer requiring a current or potential employee to give the employer their username and password.

    Read more →