Lenna

Lenna

Lenna (or Lena) is a standard test image used in the field of digital image processing, starting in 1973. It is a picture of the Swedish model Lena Forsén, shot by photographer Dwight Hooker and cropped from the centerfold of the November 1972 issue of Playboy magazine. Lenna has attracted controversy because of its subject matter. Starting in the mid-2010s, many journals have deemed it inappropriate and discouraged its use, while others have banned it from publication outright. Forsén herself has called for it to be retired, saying "It's time I retired from tech." The spelling "Lenna" came from the model's desire to encourage the proper pronunciation of her name. "I didn't want to be called Leena [English: ]," she explained. == History == Before Lenna, the first use of a Playboy magazine image to illustrate image processing algorithms was in 1961. Lawrence G. Roberts used two cropped six-bit grayscale facsimile scanned images from Playboy's July 1960 issue featuring Playmate Teddi Smith, in his master's thesis on image dithering at Massachusetts Institute of Technology. Lenna was originally intended for high resolution color image processing study. Its history was described in the May 2001 newsletter of the IEEE Professional Communication Society, in an article by Jamie Hutchinson: Alexander Sawchuk estimates that it was in June or July of 1973 when he, then an assistant professor of electrical engineering at the University of Southern California Signal and Image Processing Institute (SIPI), along with a graduate student and the SIPI lab manager, was hurriedly searching the lab for a good image to scan for a colleague's conference paper. They got tired of their stock of usual test images, dull stuff dating back to television standards work in the early 1960s. They wanted something glossy to ensure good output dynamic range, and they wanted a human face. Just then, somebody happened to walk in with a recent issue of Playboy. The engineers tore away the top third of the centerfold so they could wrap it around the drum of their Muirhead wirephoto scanner, which they had outfitted with analog-to-digital converters (one each for the red, green, and blue channels) and a Hewlett Packard 2100 minicomputer. The Muirhead had a fixed resolution of 100 lines per inch and the engineers wanted a 512×512 image, so they limited the scan to the top 5.12 inches of the picture, effectively cropping it at the subject's shoulders. The image's reach was limited in the 1970s and 80s, which is reflected in it initially only appearing in .org domains, but in July 1991, the image featured on the cover of Optical Engineering alongside Peppers, another popular test image. This drew the attention of Playboy to the potential copyright infringement. The peak of image hits on the internet was in 1995. The scan became one of the most used images in computer history. The use of the photo in electronic imaging has been described as "clearly one of the most important events in [its] history". The image spread to over 100 different domains, particularly .com and .edu. In a 1999 issue of IEEE Transactions on Image Processing "Lena" was used in three separate articles, and the picture continued to appear in scientific journals throughout the beginning of the 21st century. Lenna is so widely accepted in the image processing community that Forsén was a guest at the 50th annual Conference of the Society for Imaging Science and Technology (IS&T) in 1997. In 2015, Lena Forsén was also guest of honor at the banquet of IEEE ICIP 2015. After delivering a speech, she chaired the best paper award ceremony. To explain why the image became a standard in the field, David C. Munson, editor-in-chief of IEEE Transactions on Image Processing, stated that it was a good test image because of its detail, flat regions, shading, and texture. He also noted that "the Lena image is a picture of an attractive woman. It is not surprising that the (mostly male) image processing research community gravitated toward an image that they found attractive." While Playboy often cracks down on illegal uses of its material and did initially send a notice to the publisher of Optical Engineering about its unauthorized use in that publication, over time it has decided to overlook the wide use of Lena. Eileen Kent, VP of new media at Playboy, said, "We decided we should exploit this, because it is a phenomenon." == Criticism == The use of the image has produced controversy because Playboy is "seen (by some) as being degrading to women". In a 1999 essay on reasons for the male predominance in computer science, applied mathematician Dianne P. O'Leary wrote: Suggestive pictures used in lectures on image processing ... convey the message that the lecturer caters to the males only. For example, it is amazing that the "Lena" pin-up image is still used as an example in courses and published as a test image in journals today. A 2012 paper on compressed sensing used a photo of the model Fabio Lanzoni as a test image to draw attention to this issue. The use of the test image at the magnet school Thomas Jefferson High School for Science and Technology in Fairfax County, Virginia, provoked a guest editorial by a senior in The Washington Post in 2015 about its detrimental impact on aspiring female students in computer science. In 2017, the Journal of Modern Optics published an editorial titled "On alternatives to Lenna" suggesting three images (Pirate, Cameraman, and Peppers) that "are reasonably close to Lenna in feature space". In 2018, the Nature Nanotechnology journal announced that they would no longer consider articles using Lenna. In the same year SPIE, the publishers of Optical Engineering, also announced that they "strongly discourage" the use of Lenna, and would no longer consider new submissions containing the image "without convincing scientific justification for its use". They noted that aside from the copyright and ethical issues, that it was also no longer useful as a standard image: "In today's age of high-resolution digital image technology, it seems difficult to argue that a 512 × 512 image produced with a 1970s-era analog scanner is the best we have to offer as an image quality test standard". Forsén stated in the 2019 documentary film Losing Lena, "I retired from modeling a long time ago. It's time I retired from tech, too... Let's commit to losing me." The Institute of Electrical and Electronics Engineers (IEEE) announced that, starting April 1, 2024, it will no longer allow use of Lenna in its publications.

Test data

Test data are sets of inputs or information used to verify the correctness, performance, and reliability of software systems. Test data encompass various types, such as positive and negative scenarios, edge cases, and realistic user scenarios, and aims to exercise different aspects of the software to uncover bugs and validate its behavior. Test data is also used in regression testing to verify that new code changes or enhancements do not introduce unintended side effects or break existing functionalities. == Background == Test data may be used to verify that a given set of inputs to a function produces an expected result. Alternatively, data can be used to challenge the program's ability to handle unusual, extreme, exceptional, or unexpected inputs. Test data can be produced in a focused or systematic manner, as is typically the case in domain testing, or through less focused approaches, such as high-volume randomized automated tests. Test data can be generated by the tester or by a program or function that assists the tester. It can be recorded for reuse or used only once. Test data may be created manually, using data generation tools (often based on randomness), or retrieved from an existing production environment. The data set may consist of synthetic (fake) data, but ideally, it should include representative (real) data. == Limitations == Due to privacy regulations such as GDPR, PCI, and the HIPAA, the use of privacy-sensitive personal data for testing is restricted. However, anonymized (and preferably subsetted) production data may be used as representative data for testing and development. Programmers may also choose to generate synthetic data as an alternative to using real or anonymized data. While synthetic data can offer significant advantages, such as enhanced privacy and flexibility, it also comes with limitations. For instance, generating synthetic data that accurately reflects real-world complexity can be challenging. There is also a risk of synthetic data not fully capturing the nuances of real data, potentially leading to gaps in test coverage. == Domain testing == Domain testing is a set of techniques focusing on test data. This includes identifying critical inputs, values at the boundaries between equivalence classes, and combinations of inputs that drive the system toward specific outputs. Domain testing helps ensure that various scenarios are effectively tested, including edge cases and unusual conditions.

SCinet

SCinet is the high-performance network built annually by volunteers in support of SC (formerly Supercomputing, the International Conference for High Performance Computing, Networking, Storage and Analysis). SCinet is the primary network for the yearly conference and is used by attendees and exhibitors to demonstrate and test high-performance computing and networking applications. == International Community == SCinet is also a hub for the international networking community. It provides a platform to share the latest research, technologies, and demonstrations for networks, network technology providers, and even software developers who are in charge of supporting HPC communities at their own institutions or organizations. == Volunteers == Nearly 200 volunteers from educational institutions, high performance computing sites, equipment vendors, research and education networks, government agencies and telecommunications carriers collaborate via technology and in-person to design, build and operate SCinet. While many of these credentialed individuals have volunteered at SCinet for years, first timers join the team each year. They include international students and participants in the National Science Foundation-funded Women in IT Networking at SC (WINS) program. The 2017 SCinet team included women and men from high performance computing institutions in the U.S. and throughout the world. == History == Originated in 1991 as an initiative within the SC conference to provide networking to attendees, SCinet has grown to become the "World's Fastest Network" during the duration of the conference. For 29 years, SCinet has provided SC attendees and the high performance computing (HPC) community with the innovative network platform necessary to internationally interconnect, transport, and display HPC research during SC. Historically, SCinet has been used as a platform to test networking technology and applications which have found their way into common use. == Research and development == In the past years, SCinet deployed conference wide networking technologies such as ATM, FDDI, HiPPi before they were deployed commercially.

Data set (IBM mainframe)

In the context of IBM mainframe computers in the IBM System/360 line and its successors, a data set (IBM preferred) or dataset is a computer file having a record organization. Use of this term began with, e.g., DOS/360 and OS/360, and is still used by their successors, including the current VSE and z/OS. Documentation for these systems historically preferred this term rather than file. A data set is typically stored on a direct access storage device (DASD) or magnetic tape, however unit record devices, such as punch card readers, card punches, line printers and page printers can provide input/output (I/O) for a data set (file). Data sets are not unstructured streams of bytes, but rather are organized in various logical record and block structures determined by the DSORG (data set organization), RECFM (record format), and other parameters. These parameters are specified at the time of the data set allocation (creation), for example with Job Control Language DD statements. Within a running program they are stored in the Data Control Block (DCB) or Access Control Block (ACB), which are data structures used to access data sets using access methods. Records in a data set may be fixed, variable, or “undefined” length. == Data set organization == For OS/360, the DCB's DSORG parameter specifies how the data set is organized. It may be CQ Queued Telecommunications Access Method (QTAM) in Message Control Program (MCP) CX Communications line group DA Basic Direct Access Method (BDAM) GS Graphics device for Graphics Access Method(GAM) IS Indexed Sequential Access Method (ISAM) MQ QTAM message queue in application PO Partitioned Organization PS Physical Sequential among others. Data sets on tape may only be DSORG=PS. The choice of organization depends on how the data is to be accessed, and in particular, how it is to be updated. Programmers utilize various access methods (such as QSAM or VSAM) in programs for reading and writing data sets. Access method depends on the given data set organization. == Record format (RECFM) == Regardless of organization, the physical structure of each record is essentially the same, and is uniform throughout the data set. This is specified in the DCB RECFM parameter. RECFM=F means that the records are of fixed length, specified via the LRECL parameter. RECFM=V specifies a variable-length record. V records when stored on media are prefixed by a Record Descriptor Word (RDW) containing the integer length of the record in bytes and flag bits. With RECFM=FB and RECFM=VB, multiple logical records are grouped together into a single physical block on tape or DASD. FB and VB are fixed-blocked, and variable-blocked, respectively. RECFM=U (undefined) is also variable length, but the length of the record is determined by the length of the block rather than by a control field. The BLKSIZE parameter specifies the maximum length of the block. RECFM=FBS could be also specified, meaning fixed-blocked standard, meaning all the blocks except the last one were required to be in full BLKSIZE length. RECFM=VBS, or variable-blocked spanned, means a logical record could be spanned across two or more blocks, with flags in the RDW indicating whether a record segment is continued into the next block and/or was continued from the previous one. This mechanism eliminates the need for using any "delimiter" byte value to separate records. Thus data can be of any type, including binary integers, floating-point, or characters, without introducing a false end-of-record condition. The data set is an abstraction of a collection of records, in contrast to files as unstructured streams of bytes. == Partitioned data set == A partitioned data set (PDS) is a data set containing multiple members, each of which holds a separate sub-data set, similar to a directory in other types of file systems. This type of data set is often used to hold load modules (old format bound executable programs), source program libraries (especially Assembler macro definitions), ISPF screen definitions, and Job Control Language. A PDS may be compared to a Zip file or COM Structured Storage. A Partitioned Data Set can only be allocated on a single volume and have a maximum size of 65,535 tracks. Besides members, a PDS contains also a directory. Each member can be accessed indirectly via the directory structure. Once a member is located, the data stored in that member are handled in the same manner as a PS (sequential) data set. Whenever a member is deleted, the space it occupied is unusable for storing other data. Likewise, if a member is re-written, it is stored in a new spot at the back of the PDS and leaves wasted “dead” space in the middle. The only way to recover “dead” space is to perform file compression. Compression, which is done using the IEBCOPY utility, moves all members to the front of the data space and leaves free usable space at the back. (Note that in modern parlance, this kind of operation might be called defragmentation or garbage collection; data compression nowadays refers to a different, more complicated concept.) PDS files can only reside on DASD, not on magnetic tape, in order to use the directory structure to access individual members. Partitioned data sets are most often used for storing multiple job control language files, utility control statements, and executable modules. An improvement of this scheme is a Partitioned Data Set Extended (PDSE or PDS/E, sometimes just libraries) introduced with DFSMSdfp for MVS/XA and MVS/ESA systems. A PDS/E library can store program objects or other types of members, but not both. BPAM cannot process a PDS/E containing program objects. PDS/E structure is similar to PDS and is used to store the same types of data. However, PDS/E files have a better directory structure which does not require pre-allocation of directory blocks when the PDS/E is defined (and therefore does not run out of directory blocks if not enough were specified). Also, PDS/E automatically stores members in such a way that compression operation is not needed to reclaim "dead" space. PDS/E files can only reside on DASD in order to use the directory structure to access individual members. == Generation Data Group == A Generation Data Group (GDG) is a group of non-VSAM data sets that are successive generations of historically-related data stored on an IBM mainframe (running OS/360 and its successors or DOS/360 and its successors). A GDG is usually cataloged. An individual member of the GDG collection is called a "Generation Data Set." The latter may be identified by an absolute number, ACCTG.OURGDG(1234), or a relative number: (-1) for the previous generation, (0) for the current one, and (+1) the next generation. A GDG specifies how many generations of a data set are to be kept and at what age a generation will be deleted. Whenever a new generation is created, the system checks whether one or more obsolete generations are to be deleted. The purpose of GDGs is to automate archival, using the command language JCL, the data set name given is generic. When DSN appears, the GDG data set appears along with the history number, where (0) is the most recent version (-1), (-2), ... are previous generations (+1) a new generation (see DD) Another use of GDGs is to be able to address all generations simultaneously within a JCL script without having to know the number of currently available generations. To do this, you have to omit the parentheses and the generation number in the JCL when specifying the dataset. === GDG JCL & features === Generation Data Groups are defined using either the BLDG statement of the IEHPROGM utility or the DEFINE GENERATIONGROUP statement of the newer IDCAMS utility, which allows setting various parameters. LIMIT(10) would limit the number of generations limit to 10. SCRATCH FOR (91) would retain each member, up to the limited#generations, at least 91 days. IDCAMS can also delete (and optionally uncatalog) a GDG. ==== Example ==== Creation of a standard GDG for five safety scopes, each at least 35 days old: Delete a standard GDG:

Peñabot

Peñabot is the nickname for automated social media accounts allegedly used by the Mexican government of Enrique Peña Nieto and the PRI political party to keep unfavorable news from reaching the Mexican public. Peñabot accusations are related to the broader issue of fake news in the 21st century. == History of disinformation in Mexican politics == The PRI political party has been reported to use fake news since before Peña Nieto. The main tactic originally was to spread such propaganda through open radio and television networks. Such tactic was effective in Mexico, because newspaper readership is low and cable TV is largely limited to the middle classes; consequently, the country's two major television networks – Televisa and TV Azteca – exert a significant influence in national politics. Televisa itself, not only owns around two-thirds of the programming on Mexico's TV channels, making it not only Mexico's largest television network, but also is the largest media network in the Spanish-speaking world. == Peñabots == Analysts have given the name Peñabots to a suspected network of automated accounts on social media used by the Mexican government to spread pro-government propaganda and to marginalize dissenting opinions in social media. The bots were first noticed in the 2012 elections when they were used to disseminate opinions in support of Enrique Peña Nieto on social networks such as Twitter and Facebook. According to Aristegui Noticias, their usage went against articles 6 and 134 of the Mexican Constitution. Those used by Peña Nieto's government cost an estimated 80 million pesos monthly, which news outlets argued only helped the government spread fake support towards the president, but did not have a benefit towards Mexican people (with whom EPN was highly unpopular). Facebook held approximately 640,321 Peñabots, while Twitter had less. As of July 2017, Oxford Internet Institute's Computational Propaganda Research Project claimed many western democracies, Mexico included, perform social media manipulation, thus saying the manipulation comes directly from the Mexican government itself. During Peña Nieto's subsequent presidency, analysts noted that Peñabots were used to overpower trending topics that critiqued government, to flood trending government critical hashtags with spam, to create fake trends by pushing alternative hashtags, and to push smear campaigns and threats against government-critical activists and journalists. Peñabots were distinguished as their pattern of activity was distinct from that of ordinary interaction on social networks. === Meadebots === On Twitter it was reported that about 94% of the followers of 2018 presidential candidate from the PRI Jose Antonio Meade were bots. When Antonio Meade presented himself as a candidate for the 2018 presidential election, his social media accounts such as "@MovimientoMEADE" (created by the PRI's official account @PRI_Nacional), obtained a huge quantity of followers in a short span of time. Some users noticed and brought it to attention, and after investigation it was reported 94% of such followers were bots (702,000 out of 747,000), and the account was eliminated from Twitter after 20 hours. The fake accounts used the hashtags #YoConMeade and #Meade18. It was further revealed was that Meade's official account on Twitter, @JoseAMeadeK had 25% bots (216,000 fake followers out of the 981,000). == Manipulation of news media in Mexico, through television == The Mexican government of Peña Nieto has been accused of using various means to keep unfavorable news from reaching the Mexican people. Many Mexicans have protested this practice as it clearly goes against the freedom of speech. The PRI has been reported to use fake news since before Peña Nieto. The main tactic has been to spread such propaganda through radio and television. This tactic is perceived as effective in Mexico, because newspaper readership is low and research on the Internet and cable TV is largely limited to the middle classes; consequently, the country's two major television networks – Televisa and TV Azteca – exert a significant influence in national politics. Televisa itself, owns around two-thirds of the programming on Mexico's TV channels, making it not only Mexico's largest television network, but also is the largest media network in the Spanish-speaking world. In June 2012, before the 2012 Mexican presidential elections, the British newspaper The Guardian published a series of allegations claiming Televisa, sold favorable coverage to top politicians in its news and entertainment shows, this scandal became known as the Televisa controversy. The documents published by 'The Guardian alleged that a secretive circle within Televisa manipulated news coverage to favor PRI presidential candidate Enrique Peña Nieto, who was poised as favorite to win. Televisa's secret circle supposedly commissioned videos to promote Peña Nieto and lash out his political rivals in 2009. The Guardian documents suggest that Televisa's secret team distributed such videos through e-mail, posting them posted them on Facebook and YouTube, some can still be seen there. Another document was a PowerPoint presentation, with a slide explicitly aimed at rival leftist candidate of the Party of the Democratic Revolution (PRD), Andrés Manuel López Obrador. Supposedly given to The Guardian by a Televisa employee. The document's authenticity was never possible to confirm– however dates, names, and events largely coincide. Televisa refused to talk the documents, and denied a relationship with the PRI or its presidential candidate, saying that they had provided equal media coverage to all parties. Televisa published an article supposedly showing discrepancies in The Guardian documents and denying accusations. Mexican citizens complained about the perceived favoritism towards Enrique Peña Nieto and the PRI, protesting through the Yo Soy 132 movement which Televisa covered in detail. However, Televisa's news media coverage is perceived to have been biased, by using a media coverage tactic Mexican citizens call cortinas de humo (smoke screens). These introduce a news scandal giving extensive coverage to distract citizens from a potential conflict-of-interest or controversy that could damage the image of the politician favored by the network. An example of a perceived smoke screen would be the news media coverage of "Caso Michoacán" and "Caso Paolette" distracting all the attention from the parallel "Yo soy 132" movement. A few years later, on the day of September 11, 2016; factual evidence of Televisa's performing media manipulation emerged, when a Televisa news anchor while live-on air reading a teleprompter, mistakenly read out loud that "try that Jaime "Ël Bronco" Rodríguez Calderón (Nuevo Leon's governor) is mentioned as little as possible". Newspaper El Universal caught it on video and published it social media. Televisa didn't mention the story and declined to comment. Lack of news coverage concerning Nuevo León's Governor Jaime Rodriguez, is perceived due to him being the first elected governor to not be part of any political party (Independent Governor), and because unlike the governors from the PRI preceding him, the independent governor "El Bronco" doesn't spend money on publicity at all, preferring to communicate all news by using social media such as Twitter and Facebook. While the incident may have proven Televisa's bias, there wasn't anything to incriminate the PRI political party or Enrique Peña Nieto, though it did further suspicion of Televisa manipulating news media. In contrast, a December 2017 article of The New York Times, reported Enrique Peña Nieto spending about 2000 million dollars on publicity, during his first 5 years as president, the largest publicity budget ever spent by a Mexican President. Additionally, 68 percent of news journalists admitted to not believe to have enough freedom of speech, and award-winning news reporter Carmen Aristegui was controversially fired shortly after revealing the Mexican White House scandals. == Violence and spying towards news journalists and civil rights activists == Far for only being receiving accusations of spreading fake news, the Mexican government of EPN (Enrique Peña Nieto) has also been accused of violence towards news journalists, and of spying on them, and also towards civil right leaders and their families. During his tenure as president, Peña Nieto has been accused of failing to protect news journalists, whose deaths are speculated to be politically triggered, by politicians attempting to prevent them from covering political scandals. The New York Times published a news report on the matter titled, "In Mexico it's easy to kill a journalist", on it mentioning how during EPN's government, Mexico became one of the worst countries on which to be a journalist. The assassination of journalist Javier Valdez on May 23, 2017, received national coverage, with multiple news journalists

Insider threat

An insider threat is a perceived threat to an organization that comes from people within the organization, such as employees, former employees, contractors or business associates, who have inside information concerning the organization's security practices, data and computer systems. The threat may involve fraud, the theft of confidential or commercially valuable information, the theft of intellectual property, or the sabotage of computer systems. == Overview == Insiders may have accounts giving them legitimate access to computer systems, with this access originally having been given to them to serve in the performance of their duties; these permissions could be abused to harm the organization. Insiders are often familiar with the organization's data and intellectual property as well as the methods that are in place to protect them. This makes it easier for the insider to circumvent any security controls of which they are aware. Physical proximity to data means that the insider does not need to hack into the organizational network through the outer perimeter by traversing firewalls; rather they are in the building already, often with direct access to the organization's internal network. Insider threats are harder to defend against than attacks from outsiders, since the insider already has legitimate access to the organization's information and assets. An insider may attempt to steal property or information for personal gain or to benefit another organization or country. The threat to the organization could also be through malicious software left running on its computer systems by former employees, a so-called logic bomb. == Research == Insider threat is an active area of research in academia and government. The CERT Coordination Center at Carnegie-Mellon University maintains the CERT Insider Threat Center, which includes a database of more than 850 cases of insider threats, including instances of fraud, theft and sabotage; the database is used for research and analysis. CERT's Insider Threat Team also maintains an informational blog to help organizations and businesses defend themselves against insider crime. The Threat Lab and Defense Personnel and Security Research Center (DOD PERSEREC) has also recently emerged as a national resource within the United States of America. The Threat Lab hosts an annual conference, the SBS Summit. They also maintain a website that contains resources from this conference. Complimenting these efforts, a companion podcast was created, Voices from the SBS Summit. In 2022, the Threat Lab created an interdisciplinary journal, Counter Insider Threat Research and Practice (CITRAP) which publishes research on insider threat detection. === Findings === In the 2022 Data Breach Investigations Report (DBIR), Verizon found that 82% of breaches involved the human element, noting that employees continue to play a leading role in cybersecurity incidents and breaches. According to the UK Information Commissioners Office, 90% of all breaches reported to them in 2019 were the result of mistakes made by end users. This was up from 61% and 87% over the previous two years. A 2018 whitepaper reported that 53% of companies surveyed had confirmed insider attacks against their organization in the previous 12 months, with 27% saying insider attacks have become more frequent. A report published in July 2012 on the insider threat in the U.S. financial sector gives some statistics on insider threat incidents: 80% of the malicious acts were committed at work during working hours; 81% of the perpetrators planned their actions beforehand; 33% of the perpetrators were described as "difficult" and 17% as being "disgruntled". The insider was identified in 74% of cases. Financial gain was a motive in 81% of cases, revenge in 23% of cases, and 27% of the people carrying out malicious acts were in financial difficulties at the time. The US Department of Defense Personnel Security Research Center published a report that describes approaches for detecting insider threats. Earlier it published ten case studies of insider attacks by information technology professionals. Cybersecurity experts believe that 38% of negligent insiders are victims of a phishing attack, whereby they receive an email that appears to come from a legitimate source such as a company. These emails normally contain malware in the form of hyperlinks. == Typologies and ontologies == Multiple classification systems and ontologies have been proposed to classify insider threats. Traditional models of insider threat identify three broad categories: Malicious insiders, which are people who take advantage of their access to inflict harm on an organization; Negligent insiders, which are people who make errors and disregard policies, which place their organizations at risk; and Infiltrators, who are external actors that obtain legitimate access credentials without authorization. == Criticisms == Insider threat research has been criticized. Critics have argued that insider threat is a poorly defined concept. Forensically investigating insider data theft is notoriously difficult, and requires novel techniques such as stochastic forensics. Data supporting insider threat is generally proprietary (i.e., encrypted data). Theoretical/conceptual models of insider threat are often based on loose interpretations of research in the behavioral and social sciences, using "deductive principles and intuitions of subject matter expert." Adopting sociotechnical approaches, researchers have also argued for the need to consider insider threat from the perspective of social systems. Jordan Schoenherr said that "surveillance requires an understanding of how sanctioning systems are framed, how employees will respond to surveillance, what workplace norms are deemed relevant, and what ‘deviance’ means, e.g., deviation for a justified organization norm or failure to conform to an organizational norm that conflicts with general social values." By treating all employees as potential insider threats, organizations might create conditions that lead to insider threats. == Sector-specific concerns == === Healthcare === The healthcare industry faces particularly acute insider threat risks due to the large number of workforce members who require access to sensitive patient records for legitimate clinical purposes. The U.S. Department of Health and Human Services has identified unauthorized access by insiders, including workforce snooping on patient records and theft of protected health information for identity fraud, as a persistent enforcement concern. The Health Insurance Portability and Accountability Act (HIPAA) Security Rule addresses insider threats through several administrative safeguards, including workforce security procedures requiring covered entities to implement policies for authorizing and supervising workforce members who work with electronic protected health information, as well as termination procedures to revoke access when employment ends (45 CFR 164.308(a)(3)). The rule also requires audit controls to record and examine information system activity (45 CFR 164.312(b)), enabling detection of unauthorized access by insiders. The December 2024 Notice of proposed rulemaking (NPRM) to overhaul the HIPAA Security Rule would strengthen insider threat defenses by mandating role-based access controls, requiring notification of relevant workforce members within 24 hours of any changes to access privileges, and requiring regular review of audit logs to detect anomalous access patterns.

Cover-coding

Cover-coding is a technique for obscuring the data that is transmitted over an insecure link, to reduce the risks of snooping. An example of cover-coding would be for the sender to perform a bitwise XOR (exclusive OR) of the original data with a password or random number which is known to both sender and receiver. The resulting cover-coded data is then transmitted from sender to the receiver, who uncovers the original data by performing a further bitwise XOR (exclusive OR) operation on the received data using the same password or random number. ISO 18000-6C (EPC Class 1 Generation 2) RFID tags protect some operations with a cover code. The reader requests a random number from the tag, and the tag responds with a new random number. The reader then encrypts future communications with this number, using bitwise XOR, to the data it sends. Cover coding is secure if the tag signal can't be intercepted and the random number is not re-used. Compared to the loud transmissions from the reader, tag backscatter is much weaker and difficult -- but not impossible -- to intercept.