AI Analytics Hub

AI Analytics Hub — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

Memory color effect

The memory color effect is the phenomenon that the canonical hue of a type of object acquired through experience (e.g. the sky, a leaf, or a strawberry) can directly modulate the appearance of the actual colors of objects. Human observers acquire memory colors through their experiences with instances of that type. For example, most human observers know that an apple typically has a reddish hue; this knowledge about the canonical color which is represented in memory constitutes a memory color. As an example of the effect, normal human trichromats, when presented with a gray banana, often perceive the gray banana as being yellow - the banana's memory color. In light of this, subjects typically adjust the color of the banana towards the color blue - the opponent color of yellow - when asked to adjust its surface to gray to cancel the subtle activation of banana's memory color. Subsequent empirical studies have also shown the memory color effect on man-made objects (e.g. smurfs, German mailboxes), the effect being especially pronounced for blue and yellow objects. To explain this, researchers have argued that because natural daylight shifts from short wavelengths of light (i.e., bluish hues) towards light of longer wavelengths (i.e., yellowish-orange hues) during the day, the memory colors for blue and yellow objects are recruited by the visual system to a higher degree to compensate for this fluctuation in illumination, thereby providing a stronger memory color effect. == Form identification == Memory color plays a role when detecting an object. In a study where participants were given objects, such as an apple, with two alternate forms for each, a crooked apple and a circular apple, researchers changed the colors of the alternate forms and asked if they could identify them. Most of the participants answered "unsure," suggesting that we use memory color when identifying an object. The research redefined memory color as a phenomenon when "a form's identity affects the phenomenal hue of that form." == Color effect on memorization == Memory color effect can be derived from the human instinct to memorize objects better. Comparing the effect of recognizing gray-scaled images and colored images, results showed that people were able to recall colored images 5% higher compared to gray-scaled images. An important factor was that higher level of contrast between the object and background color influences memory. In a specific study related to this, participants reported that colors were 5% to 10% easier to recognize compared to black and white. == Color constancy and memory color effect == Color constancy is the phenomenon where a surface to appear to be of the same color under a wide rage of illumination. A study tested two hypotheses with regards to color memory; the photoreceptor hypothesis and the surface reflectance hypothesis. The test color was surround either by various color patches forming a complex pattern or a uniform “grey” field at the same chromaticity as that of the illuminant. The test color was presented on a dark background for the control group. It was observed that complex surround results where in line with the surface-reflectance hypothesis and not the photoreceptor hypothesis, showing that the accuracy and precision of color memory are fundamentals to understanding the phenomenon of color constancy. == Significance to the evolution of trichromacy == While objects that possess canonical hues make up a small percentage of the objects which populate humans’ visual experience, the human visual system evolved in an environment populated with objects that possess canonical hues. This suggests that the memory color effect is related to the emergence of trichromacy because it has been argued that trichromacy evolved to optimize the ability to detect ripe fruits—objects that appear in canonical hues. == In perception research == In perception research, the memory color effect is cited as evidence for the opponent color theory, which states that four basic colors can be paired with its opponent color: red—green, blue—yellow. This explains why participants adjust the ripe banana color to a blueish tone to make its memory color yellow as gray. Researchers have also found empirical evidence that suggests memory color is recruited by the visual system to achieve color constancy. For example, participants had a lower percentage of color constancy when looking at a color incongruent scene, such as a purple banana, compared to a color diagnostical scene, a yellow banana. This suggests that color constancy is influenced by the color of objects that we are familiar with, which the memory color effect takes part.
Read more →
Glossary of operating systems terms

This page is a glossary of Operating systems terminology. == A == access token: In Microsoft Windows operating systems, an access token contains the security credentials for a login session and identifies the user, the user's groups, the user's privileges, and, in some cases, a particular application. == B == binary semaphore: See semaphore. booting: In computing, booting (also known as booting up) is the initial set of operations that a computer performs after electrical power is switched on or when the computer is reset. This can take tens of seconds and typically involves performing a power-on self-test, locating and initializing peripheral devices, and then finding, loading and starting the operating system. == C == cache: In computer science, a cache is a component that transparently stores data so that future requests for that data can be served faster. The data that is stored within a cache might be values that have been computed earlier or duplicates of original values that are stored elsewhere. cloud: Cloud computing operating systems are recent, and were not mentioned in Gagne's 8th Edition (2009). In contrast, by Gagne's 9th (2012), cloud o/s received 3 pages of coverage (41, 42, 716). Doeppner (2011) mentions them (p. 3), but only to prove that operating systems "are not a solved problem" and that even if the day of the dedicated PC is waning, cloud computing has created an entirely new opportunity for o/s development ala sharing, networks, memory, parallelism, etc. Gagne (2012) adds that in addition to numerous traditional o/s's at cloud warehouses, Virtual machine o/s (VMMs), Eucalyptus, Vware, vCloud Director and others are being developed specifically for cloud management with numerous traditional o/s features (security, threads, file and memory management, guis, etc.) (p. 42). Microsoft's investment in cloud aspects of o/s tend to support that argument. concurrency == D == daemon: Operating systems often start daemons at boot time and serve the function of responding to network requests, hardware activity, or other programs by performing some task. Daemons can also configure hardware (like udevd on some Linux systems), run scheduled tasks (like cron), and perform a variety of other tasks. == E == == F == == G == == H == == I == == J == == K == kernel: In computing, the kernel is a computer program that manages input/output requests from software and translates them into data processing instructions for the central processing unit and other electronic components of a computer. The kernel is a fundamental part of a modern computer's operating system. == L == lock: In computer science, a lock or mutex (from mutual exclusion) is a synchronization mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. A lock is designed to enforce a mutual exclusion concurrency control policy. == M == mutual exclusion: Mutual exclusion is to allow only one process at a time to access the same critical section (a part of code which accesses the critical resource). This helps prevent race conditions. mutex: See lock. == N == == O == == P == paging daemon: See daemon. process == Q == == R == == S == semaphore: In computer science, particularly in operating systems, a semaphore is a variable or abstract data type that is used for controlling access, by multiple processes, to a common resource in a parallel programming or a multi user environment. == T == thread: In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by an operating system scheduler. The scheduler itself is a light-weight process. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process. templating: In an o/s context, templating refers to creating a single virtual machine image as a guest operating system, then saving it as a tool for multiple running virtual machines (Gagne, 2012, p. 716). The technique is used both in virtualization and cloud computing management, and is common in large server warehouses. == U == == V == == W == == Z ==
Read more →
Mini-STX

Mini-STX (mSTX, Mini Socket Technology EXtended, originally "Intel 5x5") is a computer motherboard form factor that was released by Intel in 2015 (as "Intel 5x5"). These motherboards measure 147mm by 140mm (5.8" x 5.5"), making them larger than "4x4" NUC (102x102mm / 4.01" x 4.01" inches) and Nano-ITX (120x120mm / 4.7" x 4.7") boards, but notably smaller than the more common Mini-ITX (170x170mm / 6.7" x 6.7") boards. Unlike these standards, which use a square shape, the Mini-STX form factor is 7mm longer from front-to-rear, making it slightly rectangular. == Mini-STX design elements == The Mini-STX design suggests (but does not require) support for: Socketed processors (e.g. LGA or PGA CPUs) Onboard power regulation circuitry, enabling direct DC power input IO ports embedded on the front and rear of the motherboard (akin to NUC, but unlike typical motherboards which often use headers instead to connect built-in ports on enclosures) == Adoption by manufacturers == This motherboard form factor is still not in particularly common use with consumer-PC manufacturers, although there are a few offerings: ASRock offers both DeskMini kits (that use mini-STX boards) and standalone motherboards, Asus offer VivoMini kits (that use mini-STX boards) and standalone motherboards, Gigabyte offers a few motherboards, and industrial PC suppliers (e.g. Kontron, Iesy, ASRock Industrial) also provide some options for mini-STX equipment. == Derivatives == ASRock developed a derivative of mini-STX, dubbed micro-STX, for their 'DeskMini GTX/RX' small form-factor PCs and industrial motherboards. Micro-STX adds an MXM slot which allows the use of special PCI Express expansion cards, including graphics or machine learning accelerators, but increases the width of the board to be extended two inches, resulting in measurements of 147 x 188 mm (5.8" x 7.4")
Read more →
Digital Cinema Package

A Digital Cinema Package (DCP) is a collection of digital files used to store and convey digital cinema (DC) audio, image, and data streams. The term was popularized by Digital Cinema Initiatives, LLC in its original recommendation for packaging DC contents. However, the industry tends to apply the term to the structure more formally known as the composition. A DCP is a container format for compositions, a hierarchical file structure that represents a title version. The DCP may carry a partial composition (e.g. not a complete set of files), a single complete composition, or multiple and complete compositions. The composition consists of a Composition Playlist (in XML format) that defines the playback sequence of a set of Track Files. Track Files carry the essence (audio, image, subtitles), which is wrapped using Material eXchange Format (MXF). Track Files must contain only one essence type. Two track files at a minimum must be present in every composition (see SMPTE ST429-2 D-Cinema Packaging – DCP Constraints, or Cinepedia): a track file carrying picture essence, and a track file carrying audio essence. The composition, consisting of a Composition Playlist (CPL) and associated track files, are distributed as a Digital Cinema Package (DCP). A composition is a complete representation of a title version, while the DCP need not carry a full composition. However, as already noted, it is commonplace in the industry to discuss the title in terms of a DCP, as that is the deliverable to the cinema. The Picture Track File essence is compressed using JPEG 2000 and the Audio Track File carries a 24-bit linear PCM uncompressed multichannel WAV file. Encryption may optionally be applied to the essence of a track file to protect it from unauthorized use. The encryption used is AES 128-bit in CBC mode. In practice, there are two versions of composition in use. The original version is called Interop DCP. In 2009, a specification was published by SMPTE (SMPTE ST 429-2 Digital Cinema Packaging – DCP Constraints) for what is commonly referred to as SMPTE DCP. SMPTE DCP is similar but not backwards compatible with Interop DCP, resulting in an uphill effort to transition the industry from Interop DCP to SMPTE DCP. SMPTE DCP requires significant constraints to ensure success in the field, as shown by ISDCF. While legacy support for Interop DCP is necessary for commercial products, new productions are encouraged to be distributed in SMPTE DCP. == Technical specifications == The DCP root folder (in the storage medium) contains a number of files, some used to store the image and audio contents, and some other used to organize and manage the whole playlist. === Picture MXF files === Picture contents may be stored in one or more reels corresponding to one or more MXF files. Each reel contains pictures as MPEG-2 or JPEG 2000 essence, depending on the adopted codec. MPEG-2 is no longer compliant with the DCI specification. JPEG 2000 is the only accepted compression format. Supported frame rates are: SMPTE (JPEG 2000) 24, 25, 30, 48, 50, and 60 fps @ 2K 24, 25, and 30 fps @ 4K 24 and 48 fps @ 2K stereoscopic MXF Interop (JPEG 2000) – Deprecated 24 and 48 fps @ 2K (MXF Interop can be encoded at 25 frame/s but support is not guaranteed) 24 fps @ 4K 24 fps @ 2K stereoscopic MXF Interop (MPEG-2) – Deprecated 23.976 and 24 fps @ 1920 × 1080 Maximum frame sizes are 2048 × 1080 for 2K DC, and 4096 × 2160 for 4K DC. Common formats are: SMPTE (JPEG 2000) Flat (1998 × 1080 or 3996 × 2160), = 1.85:1 aspect ratio Scope (2048 × 858 or 4096 × 1716), ~2.39:1 aspect ratio HDTV (1920 × 1080 or 3840 × 2160), 16:9 aspect ratio (~1.78:1) (although not specifically defined in the DCI specification, this resolution is DCI compliant per section 8.4.3.2). Full (2048 × 1080 or 4096 × 2160) (~1.9:1 aspect ratio, official name by DCI is Full Container. Not widely accepted in cinemas.) MXF Interop (MPEG-2) – Deprecated Full Frame (1920 × 1080) 12 bits per component precision (36 bits total per pixel) XYZ' colorspace; the prime mark indicates gamma encoding (gamma=2.6) Maximum bit rate is 250 Mbit/s (1.3 MBytes per frame at 24 frame per second) === Sound MXF files === Sound contents are also stored in reels corresponding to picture reels in number and duration. In case of multilingual features, separate reels are required to convey different languages. Each file contains linear PCM essence. Sampling rate is 48,000 or 96,000 samples per second Sample precision of 24 bits Linear mapping (no companding) Up to 16 independent channels === Asset map file === List of all files included in the DCP, in XML format. === Composition playlist file === Defines the playback order during presentation. The order is saved in XML format in this file; each picture and sound reel is identified by its UUID. In the following example, a reel is composed by picture and sound: === Packing list file or package key list (PKL) === All files in the composition are hashed and their hash is stored here, in XML format. This file is generally used during ingestion in a digital cinema server to verify if data have been corrupted or tampered with in some way. For example, an MXF picture reel is identified by the following element: The hash value is the Base64 encoding of the SHA-1 checksum. It can be calculated with the command: openssl sha1 -binary "FILE_NAME" | openssl base64 === Volume index file === A single DCP may be stored in more than one medium (e.g., multiple hard disks). The XML file VOLINDEX is used to identify the volume order in the series. == 3D DCP == The DCP format is also used to store stereoscopic (3D) contents for 3D films. In this case, 48 frames exist for every second – 24 frames for the left eye, 24 frames for the right. Depending on the projection system used, the left eye and right eye pictures are either shown alternately (double or triple flash systems) at 48 fps or, on 4k systems, both left and right eye pictures are shown simultaneously, one above the other, at 24 fps. In triple flash systems, active shutter glasses are required whereas optical filtering such as circular polarisation is used in conjunction with passive glasses on polarized systems. Since the maximum bit rate is always 250 Mbit/s, this results in a net 125 Mbit/s for single frame, but the visual quality decrease is generally unnoticeable. == D-Box == D-Box codes for motion controlled seating (labelled as "Motion Data" in the DCP specification), if present, are stored as a monoaural WAV file on Sound Track channel 13. Motion Data tracks are unencrypted and not watermarked. == Creation == Most film producers and distributors rely on digital cinema encoding facilities to produce and quality control check a digital cinema package before release. Facilities follow strict guidelines set out in the DCI recommendations to ensure compatibility with all digital cinema equipment. For bigger studio release films, the facility will usually create a Digital Cinema Distribution Master (DCDM). A DCDM is the post-production step prior to a DCP. The frames are in XYZ TIFF format and both sound and picture are not yet wrapped into MXF files. A DCP can be encoded directly from a DCDM. A DCDM is useful for archiving purposes and also facilities can share them for international re-versioning purposes. They can easily be turned into alternative version DCPs for foreign territories. For smaller release films, the facility will usually skip the creation of a DCDM and instead encode directly from the Digital Source Master (DSM) the original film supplied to the encoding facility. A DSM can be supplied in a multitude of formats and color spaces. For this reason, the encoding facility needs to have extensive knowledge in color space handling including, on occasion, the use of 3D LUTs to carefully match the look of the finished DCP to a celluloid film print. This can be a highly involved process in which the DCP and the film print are "butterflied" (shown side by side) in a highly calibrated cinema. Less demanding DCPs are encoded from tape formats such as HDCAM SR. Quality control checks are always performed in calibrated cinemas and carefully checked for errors. QC checks are often attended by colorists, directors, sound mixers and other personnel to check for correct picture and sound reproduction in the finished DCP. == Accessibility == === Hearing impaired audio === A Hearing Impaired (HI) audio track is designed for people who are hearing-impaired to better hear dialog. Moviegoers can wear headphones which play this audio track synchronized with the film. Hearing Impaired audio is stored in the DCP on Sound Track channel 7. === Audio description === Audio description is narration for people who are blind or visually impaired. Audio description is stored in the DCP as "Visually Impaired-Native" (VI-N) audio on Sound Track channel 8. === Sign Language Video === A Sign Language Video track can be included in a DCP to allow for display of sign la
Read more →
TargetLink

TargetLink is a software for automatic code generation, based on a subset of Simulink/Stateflow models, produced by dSPACE GmbH. TargetLink requires an existing MATLAB/Simulink model to work on. TargetLink generates both ANSI-C and production code optimized for specific processors. It also supports the generation of AUTOSAR-compliant code for software components for the automotive sector. The management of all relevant information for code generation takes place in a central data container, called the Data Dictionary. Testing of the generated code is implemented in Simulink, which is also used for the specification of the underlying simulation models. TargetLink supports three simulation modes to test the generated code: Model-in-the-loop simulation (MIL): this mode allows the model design to be checked. An MIL simulation is also known as a floating-point simulation, since the variables are typically floating-point variables. Software-in-the-loop (SIL): the simulation is based on the execution of generated code, which runs on a PC system. The variables are typically plain or fixed point numbers. Processor-in-the-loop (PIL): in a PIL simulation, the generated code runs on the target hardware or on an evaluation board. So-called real-time frames are included, making it possible to transfer the simulation results as well as memory consumption and runtime information to the PC. The Motor Industry Software Reliability Association (MISRA) published official MISRA modeling guidelines for TargetLink in late 2007, which are particularly important for functional safety of safety-critical applications. In 2009, TÜV SÜD certified TargetLink for use during the development of safety-critical systems to ISO DIS 26262 and IEC 61508.
Read more →
Deplatforming

Deplatforming, also known as no-platforming, is a boycott on an individual or group by removing the platforms used to share their information or ideas. The term is commonly associated with social media. == History == === Deplatforming of invited speakers === In the United States, the banning of speakers on university campuses dates back to the 1940s. This was carried out by the policies of the universities themselves. The University of California had a policy known as the Speaker Ban, codified in university regulations under President Robert Gordon Sproul, that mostly, but not exclusively, targeted communists. One rule stated that "the University assumed the right to prevent exploitation of its prestige by unqualified persons or by those who would use it as a platform for propaganda." This rule was used in 1951 to block Max Shachtman, a socialist, from speaking at the University of California at Berkeley. In 1947, former U.S. Vice President Henry A. Wallace was banned from speaking at UCLA because of his views on U.S. Cold War policy, and in 1961, Malcolm X was prohibited from speaking at Berkeley as a religious leader. Controversial speakers invited to appear on college campuses have faced deplatforming attempts to disinvite them or to otherwise prevent them from speaking. The British National Union of Students established its No Platform policy as early as 1973. In the mid-1980s, visits by South African ambassador Glenn Babb to Canadian college campuses faced opposition from students opposed to apartheid. In the United States, recent examples include the March 2017 disruption by protestors of a public speech at Middlebury College by political scientist Charles Murray. In February 2018, students at the University of Central Oklahoma rescinded a speaking invitation to creationist Ken Ham, after pressure from an LGBT student group. In March 2018, a "small group of protesters" at Lewis & Clark Law School attempted to stop a speech by visiting lecturer Christina Hoff Sommers. In the 2019 film No Safe Spaces, Adam Carolla and Dennis Prager documented their own disinvitation along with others. As of February 2020, the Foundation for Individual Rights in Education, a speech advocacy group, documented 469 disinvitation or disruption attempts at American campuses since 2000, including both "unsuccessful disinvitation attempts" and "successful disinvitations"; the group defines the latter category as including three subcategories: formal disinvitation by the sponsor of the speaking engagement; the speaker's withdrawal "in the face of disinvitation demands"; and "heckler's vetoes" (situations when "students or faculty persistently disrupt or entirely prevent the speakers' ability to speak"). === Deplatforming in social media === Beginning in 2015, Reddit banned several communities on the site ("subreddits") for violating the site's anti-harassment policy. A 2017 study published in the journal Proceedings of the ACM on Human-Computer Interaction, examining "the causal effects of the ban on both participating users and affected communities," found that "the ban served a number of useful purposes for Reddit" and that "Users participating in the banned subreddits either left the site or (for those who remained) dramatically reduced their hate speech usage. Communities that inherited the displaced activity of these users did not suffer from an increase in hate speech." In June 2020 and January 2021, Reddit also issued bans to pro-Trump communities over violations of the website's content and harassment policies. On May 2, 2019, Facebook and the Facebook-owned platform Instagram announced a ban of "dangerous individuals and organizations" including Nation of Islam leader Louis Farrakhan, Milo Yiannopoulos, Alex Jones and his organization InfoWars, Paul Joseph Watson, Laura Loomer, and Paul Nehlen. In the wake of the 2021 storming of the US Capitol, Twitter banned then-president Donald Trump, as well as 70,000 other accounts linked to the event and the far-right movement QAnon. Some studies have found that the deplatforming of extremists reduced their audience, although other research has found that some content creators became more toxic following deplatforming and migration to alt-tech platform. ==== Twitter ==== On November 18, 2022, Elon Musk, as newly appointed CEO of Twitter, reopened previously banned Twitter accounts of high-profile users, including Kathy Griffin, Jordan Peterson, and The Babylon Bee as part of the new Twitter policy. As Musk exclaimed, "New Twitter policy is freedom of speech, but not freedom of reach". ==== Alex Jones ==== On August 6, 2018, Facebook, Apple, YouTube and Spotify removed all content by Jones and InfoWars for policy violations. YouTube removed channels associated with InfoWars, including The Alex Jones Channel. On Facebook, four pages associated with InfoWars and Alex Jones were removed over repeated policy violations. Apple removed all podcasts associated with Jones from iTunes. On August 13, 2018, Vimeo removed all of Jones's videos because of "prohibitions on discriminatory and hateful content". Facebook cited instances of dehumanizing immigrants, Muslims and transgender people, as well as glorification of violence, as examples of hate speech. After InfoWars was banned from Facebook, Jones used another of his websites, NewsWars, to circumvent the ban. Jones's accounts were also removed from Pinterest, Mailchimp and LinkedIn. As of early August 2018, Jones retained active accounts on Instagram, Google+ and Twitter. In September, Jones was permanently banned from Twitter and Periscope after berating CNN reporter Oliver Darcy. On September 7, 2018, the InfoWars app was removed from the Apple App Store for "objectionable content". He was banned from using PayPal for business transactions, having violated the company's policies by expressing "hate or discriminatory intolerance against certain communities and religions." After Elon Musk's purchase of Twitter several previously banned accounts were reinstated including Donald Trump, Andrew Tate and Ye resulting in questioning if Alex Jones will be unbanned as well. However Musk denied that Alex Jones will be unbanned criticizing Jones as a person that "would use the deaths of children for gain, politics or fame". InfoWars remained available on Roku devices in January 2019, a year after the channel's removal from multiple streaming services. Roku indicated that they do not "curate or censor based on viewpoint," and that it had policies against content that is "unlawful, incited illegal activities, or violates third-party rights," but that InfoWars was not in violation of these policies. Following a social media backlash, Roku removed InfoWars and stated "After the InfoWars channel became available, we heard from concerned parties and have determined that the channel should be removed from our platform." In March 2019, YouTube terminated the Resistance News channel due to its reuploading of live streams from InfoWars. On May 1, 2019, Jones was barred from using both Facebook and Instagram. Jones briefly moved to Dlive, but was suspended in April 2019 for violating community guidelines. In March 2020, the InfoWars app was removed from the Google Play store due to claims of Jones disseminating COVID-19 misinformation. A Google spokesperson stated that "combating misinformation on the Play Store is a top priority for the team" and apps that violate Play policy by "distributing misleading or harmful information" are removed from the store. ==== Donald Trump ==== On January 6, 2021, in a joint session of the United States Congress, the counting of the votes of the Electoral College was interrupted by a breach of the United States Capitol chambers. The rioters were supporters of President Donald Trump who hoped to delay and overturn the President's loss in the 2020 election. The event resulted in five deaths and at least 400 people being charged with crimes. The certification of the electoral votes was only completed in the early morning hours of January 7, 2021. In the wake of several Tweets by President Trump on January 7, 2021 Facebook, Instagram, YouTube, Reddit, and Twitter all deplatformed Trump to some extent. Twitter deactivated his personal account, which the company said could possibly be used to promote further violence. Trump subsequently tweeted similar messages from the President's official US Government account @POTUS, which resulted in him being permanently banned on January 8. Twitter then announced that Trump's ban from their platform would be permanent. Trump planned to rejoin on social media through the use of a new platform by May or June 2021, according to Jason Miller on a Fox News broadcast. The same week Musk announced Twitter's new freedom of speech policy, he tweeted a poll to ask whether to bring back Trump into the platform. The poll ended with 51.8% in favor of unbanning Trump's account. Twitter has since reinstated Trump's Twitter accou
Read more →
Creepy treehouse

Creepy treehouse is a social media term, or internet slang, referring to websites or technologies that are used for educational purposes but regarded by students as an invasion of privacy. == History == The term was first described in 2008 by Utah Valley University instructional-design services director Jared Stein as "institutionally controlled technology/tool that emulates or mimics pre-existing [sic] technologies or tools that may already be in use by the learners, or by learners' peer groups." This was when social media such as Facebook was starting to become mainstream and professors would try and get students to interact with them on the site for educational purposes. Some professors would require their students to use Facebook or Twitter as part of class assignments. == Usage == The term was first described as "technological innovations by faculty members that make students’ skin crawl." The term also refers to online accounts and websites that users tend to avoid, especially young people who avoid visiting the pages of educators and other adults. Author Martin Weller defines creepy treehouse as a digital space where authority figures are viewed as invading younger people's privacy. One such example is a professor giving his students an option to use a popular video game to learn about history instead of writing an essay. Students in that class chose to write the essay instead as the method was previously unmentioned and it was not an unnatural method of interaction. Another example given was Blackboard Sync, a feature that was used to connect the school website Blackboard with students' Facebook accounts. == Solutions == University of Regina professor Alec Couros suggests that instead of "forcing" student participation with their own digital platforms, professors should use methods like online forums. Jason Jones of chronicle.com suggested letting students create social media groups for the class themselves and explaining why using technologies is required and important.
Read more →
Web worker

A web worker, as defined by the World Wide Web Consortium (W3C) and the Web Hypertext Application Technology Working Group (WHATWG), is a JavaScript script executed from an HTML page that runs in the background, independently of scripts that may also have been executed from the same HTML page. Web workers are often able to utilize multi-core CPUs more effectively. The W3C and WHATWG envision web workers as long-running scripts that are not interrupted by scripts that respond to clicks or other user interactions. Keeping such workers from being interrupted by user activities should allow Web pages to remain responsive at the same time as they are running long tasks in the background. The web worker specification is part of the HTML Living Standard. == Overview == As envisioned by WHATWG, web workers are relatively heavy-weight and are not intended to be used in large numbers. They are expected to be long-lived, with a high start-up performance cost, and a high per-instance memory cost. Web workers run outside the context of an HTML document's scripts. Consequently, while they do not have access to the DOM, they can facilitate concurrent execution of JavaScript programs. == Features == Web workers interact with the main document via message passing. The following code creates a Worker that will execute the JavaScript in the given file. To send a message to the worker, the postMessage method of the worker object is used as shown below. The onmessage property uses an event handler to retrieve information from a worker. Once a worker is terminated, it goes out of scope and the variable referencing it becomes undefined; at this point a new worker has to be created if needed. == Example == The simplest use of web workers is for performing a computationally expensive task without interrupting the user interface. In this example, the main document spawns a web worker to compute prime numbers, and progressively displays the most recently found prime number. The main page is as follows: The Worker() constructor call creates a web worker and returns a worker object representing that web worker, which is used to communicate with the web worker. That object's onmessage event handler allows the code to receive messages from the web worker. The Web Worker itself is as follows: To send a message back to the page, the postMessage() method is used to post a message when a prime is found. == Support == If the browser supports web workers, a Worker property will be available on the global window object. The Worker property will be undefined if the browser does not support it. The following example code checks for web worker support on a browser Web workers are currently supported by Chrome, Opera, Edge, Internet Explorer (version 10), Mozilla Firefox, and Safari. Mobile Safari for iOS has supported web workers since iOS 5. The Android browser first supported web workers in Android 2.1, but support was removed in Android versions 2.2–4.3 before being restored in Android 4.4.
Read more →
Pixelmator Pro

Pixelmator Pro is a photo, video, and vector graphic editor developed by Apple for macOS and iPadOS as part of its Pixelmator and pro apps platforms and as a part of their Apple Creator Studio suite of applications. Pixelmator Pro relies heavily on technologies from Apple platforms such as Metal, CoreML, Core Image, AVFoundation, GCD, and SwiftUI. == Features == GPU accelerated with Metal 50+ standard image editing tools Layer-based image editor Video editing support Vector graphic support (including SVG support) AI-powered editing features such as background removal ML Super Resolution and Smart Replace Supports a variety of media formats (JPEG, RAW, Apple ProRAW, PSD, PNG, GIF, MP4, HEIF, etc) == Reception == Pixelmator Pro was generally well-received by reviewers who praised its deep use of machine learning, fully macOS-native design, and relatively affordable one-time purchase compared to subscription software such as Adobe Photoshop. Some reviewers criticized that some features are hard to find or hard to use. It was awarded Apple's Mac App of the Year in 2018. Pixelmator Pro does not have support for panorama stitching. == Acquisition by Apple == On November 1, 2024, the Pixelmator Team announced that they were to be acquired by Apple, subject to regulatory approval. Their site promises "There will be no material changes to the Pixelmator Pro, Pixelmator for iOS, and Photomator apps at this time." The acquisition was completed in February 2025. On January 13, 2026, Apple announced that a new version of Pixelmator Pro with AI features would be included in its new Apple Creator Studio subscription, the app would be brought to the iPad and the Mac app would be redesigned with Liquid Glass. == Version history == == Applescript == In 2020 Pixelmator Pro added the ability to leverage Apple's automation language 'AppleScript' to automate many tasks in version 1.8 (Lynx). This enabled simple and advanced automation activities such as image resize, crop, color adjustments, format change, moving layers around, and more advanced actions like removing background, Gaussian blur, text replacement, shadows, color replacement, etc.
Read more →
Static web page

A static web page, sometimes called a flat page or a stationary page, is a web page that is delivered to a web browser exactly as stored, in contrast to dynamic web pages which are generated by a web application. Consequently, a static web page displays the same information for all users, from all contexts, subject to modern capabilities of a web server to negotiate content-type or language of the document where such versions are available and the server is configured to do so. However, a webpage's JavaScript can introduce dynamic functionality which may make the static web page dynamic. == Overview == Static web pages are often HTML documents, stored as files in the file system and made available by the web server over HTTP (nevertheless URLs ending with ".html" are not always static). However, loose interpretations of the term could include web pages stored in a database, and could even include pages formatted using a template and served through an application server, as long as the page served is unchanging and presented essentially as stored. The content of static web pages remains stationary irrespective of the number of times it is viewed. Such web pages are suitable for the contents that rarely need to be updated, though modern web template systems are changing this. Maintaining large numbers of static pages as files can be impractical without automated tools, such as static site generators. Any personalization or interactivity has to run client-side, which is restricting. Cloud-based website builders, including Wix, Weebly, and Duda, offer no-code platforms for creating static and dynamic web pages through graphical interfaces, without requiring programming expertise. === Advantages === Provide improved security over dynamic websites (dynamic websites are at risk to web shell attacks if a vulnerability is present) Improved performance for end users compared to dynamic websites Fewer or no dependencies on systems such as databases or other application servers Cost savings from utilizing cloud storage, as opposed to a hosted environment Security configurations are easy to set up, which makes it more secure Static files can be cached by content delivery networks (CDNs) and other intermediate caches, which both reduces page load times at the user and also reduces load on the origin server. Static websites can have improved uptime, since they are still available through any available CDN exit node even when other CDN nodes or the origin webserver are temporarily offline. === Disadvantages === Dynamic functionality must be performed on the client side. After each update of a static website, some or all users may see old, stale, outdated previous versions instead of the latest version until the old version is flushed from CDNs and other caches. == Static site generators == Static site generators are applications that compile static websites - typically populating HTML templates in a predefined folder and file structure, with content supplied in a format such as Markdown or AsciiDoc. === Implementations === Jekyll (powers GitHub Pages) Middleman Hugo Next.js Astro.build Pelican Franklin
Read more →
Quality of experience

Quality of experience (QoE) is a measure of the delight or annoyance of a customer's experiences with a service (e.g., web browsing, phone call, TV broadcast). QoE focuses on the entire service experience; it is a holistic concept, similar to the field of user experience, but with its roots in telecommunication. QoE is an emerging multidisciplinary field based on social psychology, cognitive science, economics, and engineering science, focused on understanding overall human quality requirements. == Definition and concepts == In 2013, within the context of the COST Action QUALINET, QoE has been defined as:The degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and / or enjoyment of the application or service in the light of the user’s personality and current state.This definition has been adopted in 2016 by the International Telecommunication Union in Recommendation ITU-T P.10/G.100. Before, various definitions of QoE had existed in the domain, with the above-mentioned definition now finding wide acceptance in the community. QoE has historically emerged from Quality of Service (QoS), which attempts to objectively measure service parameters (such as packet loss rates or average throughput). QoS measurement is most of the time not related to a customer, but to the media or network itself. QoE however is a purely subjective measure from the user's perspective of the overall quality of the service provided, by capturing people's aesthetic and hedonic needs. QoE looks at a vendor's or purveyor's offering from the standpoint of the customer or end user, and asks, "What mix of goods, services, and support, do you think will provide you with the perception that the total product is providing you with the experience you desired and/or expected?" It then asks, "Is this what the vendor/purveyor has actually provided?" If not, "What changes need to be made to enhance your total experience?" In short, QoE provides an assessment of human expectations, feelings, perceptions, cognition and satisfaction with respect to a particular product, service or application. QoE is a blueprint of all human subjective and objective quality needs and experiences arising from the interaction of a person with technology and with business entities in a particular context. Although QoE is perceived as subjective, it is an important measure that counts for customers of a service. Being able to measure it in a controlled manner helps operators understand what may be wrong with their services and how to improve them. == QoE factors == QoE aims at taking into consideration every factor that contributes to a user's perceived quality of a system or service. This includes system, human and contextual factors. The following so-called "influence factors" have been identified and classified by Reiter et al.: Human Influence Factors Low-level processing (visual and auditory acuity, gender, age, mood, …) Higher-level processing (cognitive processes, socio-cultural and economic background, expectations, needs and goals, other personality traits…) System Influence Factors Content-related Media-related (encoding, resolution, sample rate, …) Network-related (bandwidth, delay, jitter, …) Device-related (screen resolution, display size, …) Context Influence Factors Physical context (location and space) Temporal context (time of day, frequency of use, …) Social context (inter-personal relations during experience) Economic context Task context (multitasking, interruptions, task type) Technical and information context (relationship between systems) Studies in the field of QoE have typically focused on system factors, primarily due to its origin in the QoS and network engineering domains. Through the use of dedicated test laboratories, the context is often sought to be kept constant. == QoE versus User Experience == QoE is strongly related to but different from the field of User Experience (UX), which also focuses on users' experiences with services. Historically, QoE has emerged from telecommunication research, while UX has its roots in Human–Computer Interaction. Both fields can be considered multi-disciplinary. In contrast to UX, the goal of improving QoE for users was more strongly motivated by economic needs. Wechsung and De Moor identify the following key differences between the fields: == QoE measurement == As a measure of the end-to-end performance at the service level from the user's perspective, QoE is an important metric for the design of systems and engineering processes. This is particularly relevant for video services because – due to their high traffic demands –, bad network performance may highly affect the user's experience. So, when designing systems, the expected output, i.e. the expected QoE, is often taken into account – also as a system output metric and optimization goal. To measure this level of QoE, human ratings can be used. The mean opinion score (MOS) is a widely used measure for assessing the quality of media signals. It is a limited form of QoE measurement, relating to a specific media type, in a controlled environment and without explicitly taking into account user expectations. The MOS as an indicator of experienced quality has been used for audio and speech communication, as well as for the assessment of quality of Internet video, television and other multimedia signals, and web browsing. Due to inherent limitations in measuring QoE in a single scalar value, the usefulness of the MOS is often debated. Subjective quality evaluation requires a lot of human resources, establishing it as a time-consuming process. Objective evaluation methods can provide quality results faster, but require dedicated computing resources. Since such instrumental video quality algorithms are often developed based on a limited set of subjective data, their QoE prediction accuracy may be low when compared to human ratings. QoE metrics are often measured at the end devices and can conceptually be seen as the remaining quality after the distortion introduced during the preparation of the content and the delivery through the network, until it reaches the decoder at the end device. There are several elements in the media preparation and delivery chain, and some of them may introduce distortion. This causes degradation of the content, and several elements in this chain can be considered as "QoE-relevant" for the offered services. The causes of degradation are applicable for any multimedia service, that is, not exclusive to video or speech. Typical degradations occur at the encoding system (compression degradation), transport network, access network (e.g., packet loss or packet delay), home network (e.g. WiFi performance) and end device (e.g. decoding performance). == QoE management == Several QoE-centric network management and bandwidth management solutions have been proposed, which aim to improve the QoE delivered to the end-users. When managing a network, QoE fairness may be taken into account in order to keep the users sufficiently satisfied (i.e., high QoE) in a fair manner. From a QoE perspective, network resources and multimedia services should be managed in order to guarantee specific QoE levels instead of classical QoS parameters, which are unable to reflect the actual delivered QoE. A pure QoE-centric management is challenged by the nature of the Internet itself, as the Internet protocols and architecture were not originally designed to support today's complex and high demanding multimedia services. As an example for an implementation of QoE management, network nodes can become QoE-aware by estimating the status of the multimedia service as perceived by the end-users. This information can then be used to improve the delivery of the multimedia service over the network and proactively improve the users' QoE. This can be achieved, for example, via traffic shaping. QoE management gives the service provider and network operator the capability to minimize storage and network resources by allocating only the resources that are sufficient to maintain a specific level of user satisfaction. As it may involve limiting resources for some users or services in order to increase the overall network performance and QoE, the practice of QoE management requires that net neutrality regulations are considered.
Read more →
Algorithmic radicalization

Algorithmic radicalization is the concept that recommender algorithms on popular social media sites, such as YouTube and Facebook, drive users toward progressively more extreme content over time, leading to the development of radicalized extremist political views. Algorithms meticulously record user interactions, encompassing likes, dislikes and the duration of time watching content, with the objective of generating an endless stream of media designed to sustain user engagement. The phenomenon of echo chamber channels has been demonstrated to exacerbate the polarization of consumers, primarily through the reinforcement of media preferences and the validation of one's existing beliefs. Algorithmic radicalization remains a controversial phenomenon as it is often not in the best interest of social media companies to remove echo chamber channels. To what extent recommender algorithms are actually responsible for radicalization remains disputed. Studies have found contradictory results regarding the promotion of extremist content by algorithms. == Social media echo chambers and filter bubbles == Social media platforms learn the interests and likes of the user to modify their experiences in their feed to keep them engaged and scrolling, known as a filter bubble. An echo chamber is formed when users come across beliefs that magnify or reinforce their thoughts and form a group of like-minded users in a closed system. Echo chambers spread information without any opposing beliefs and can possibly lead to confirmation bias. According to group polarization theory, an echo chamber can potentially lead users and groups towards more extreme radicalized positions. According to the National Library of Medicine, "Users online tend to prefer information adhering to their worldviews, ignore dissenting information, and form polarized groups around shared narratives. Furthermore, when polarization is high, misinformation quickly proliferates." == By site == === Facebook === Facebook's algorithm focuses on recommending content that makes the user want to interact. They rank content by prioritizing popular posts by friends, viral content, and sometimes divisive content. Each feed is personalized to the user's specific interests which can sometimes lead users towards an echo chamber of troublesome content. Users can find their list of interests the algorithm uses by going to the "Your ad Preferences" page. According to a Pew Research study, 74% of Facebook users did not know that list existed until they were directed towards that page in the study. It is also relatively common for Facebook to assign political labels to their users. In recent years, Facebook has started using artificial intelligence to change the content users see in their feed and what is recommended to them. A document known as The Facebook Files has revealed that their AI system prioritizes user engagement over everything else. The Facebook Files has also demonstrated that controlling the AI systems has proven difficult to handle. In an August 2019 internal memo leaked in 2021, Facebook has admitted that "the mechanics of our platforms are not neutral", concluding that in order to reach maximum profits, optimization for engagement is necessary. In order to increase engagement, algorithms have found that hate, misinformation, and politics are instrumental for app activity. As referenced in the memo, "The more incendiary the material, the more it keeps users engaged, the more it is boosted by the algorithm." According to a 2018 study, "false rumors spread faster and wider than true information... They found falsehoods are 70% more likely to be retweeted on Twitter than the truth, and reach their first 1,500 people six times faster. This effect is more pronounced with political news than other categories." === YouTube === YouTube has been around since 2005 and has more than 2.5 billion monthly users. YouTube discovery content systems focus on the user's personal activity (watched, favorites, likes) to direct them to recommended content. YouTube's algorithm is accountable for roughly 70% of users' recommended videos and what drives people to watch certain content. According to a 2022 study by the Mozilla Foundation, users have little power to keep unsolicited videos out of their suggested recommended content. This includes videos about hate speech, livestreams, etc. YouTube has been identified as an influential platform for spreading radicalized content. Al-Qaeda and similar extremist groups have been linked to using YouTube for recruitment videos and engaging with international media outlets. In a research study published by the American Behavioral Scientist Journal, they researched "whether it is possible to identify a set of attributes that may help explain part of the YouTube algorithm's decision-making process". The results of the study showed that YouTube's algorithm recommendations for extremism content factor into the presence of radical keywords in a video's title. In February 2023, in the case of Gonzalez v. Google, the question at hand is whether or not Google, the parent company of YouTube, is protected from lawsuits claiming that the site's algorithms aided terrorists in recommending ISIS videos to users. Section 230 is known to generally protect online platforms from civil liability for the content posted by its users. Multiple studies have found little to no evidence to suggest that YouTube's algorithms direct attention towards far-right content to those not already engaged with it. === TikTok === TikTok is a platform that recommends videos to a user's 'For You Page' (FYP), making every users' page different. With the nature of the algorithm behind the app, TikTok's FYP has been linked to showing more explicit and radical videos over time based on users' previous interactions on the app. Since TikTok's inception, the app has been scrutinized for misinformation and hate speech as those forms of media usually generate more interactions to the algorithm. Various extremist groups, including jihadist organizations, have utilized TikTok to disseminate propaganda, recruit followers, and incite violence. The platform's algorithm, which recommends content based on user engagement, can expose users to extremist content that aligns with their interests or interactions. As of 2022, TikTok's head of US Security has put out a statement that "81,518,334 videos were removed globally between April – June for violating our Community Guidelines or Terms of Service" to cut back on hate speech, harassment, and misinformation. Studies have noted instances where individuals were radicalized through content encountered on TikTok. For example, in early 2023, Austrian authorities thwarted a plot against an LGBTQ+ pride parade that involved two teenagers and a 20-year-old who were inspired by jihadist content on TikTok. The youngest suspect, 14 years old, had been exposed to videos created by Islamist influencers glorifying jihad. These videos led him to further engagement with similar content, eventually resulting in his involvement in planning an attack. Another case involved the arrest of several teenagers in Vienna, Austria, in 2024, who were planning to carry out a terrorist attack at a Taylor Swift concert. The investigation revealed that some of the suspects had been radicalized online, with TikTok being one of the platforms used to disseminate extremist content that influenced their beliefs and actions. == Self-radicalization == The U.S. Department of Justice defines 'Lone-wolf' (self) terrorism as "someone who acts alone in a terrorist attack without the help or encouragement of a government or a terrorist organization". Through social media outlets on the internet, 'Lone-wolf' terrorism has been on the rise, being linked to algorithmic radicalization. Through echo-chambers on the internet, viewpoints typically seen as radical were accepted and quickly adopted by other extremists. These viewpoints are encouraged by forums, group chats, and social media to reinforce their beliefs. == References in media == === The Social Dilemma === The Social Dilemma is a 2020 docudrama about how algorithms behind social media enables addiction, while possessing abilities to manipulate people's views, emotions, and behavior to spread conspiracy theories and disinformation. The film repeatedly uses buzz words such as 'echo chambers' and 'fake news' to prove psychological manipulation on social media, therefore leading to political manipulation. In the film, Ben falls deeper into a social media addiction as the algorithm found that his social media page has a 62.3% chance of long-term engagement. This leads into more videos on the recommended feed for Ben and he eventually becomes more immersed into propaganda and conspiracy theories, becoming more polarized with each video. == Proposed solutions == === United States: Weakening Section 230 protections === In the Communications Decency Act, Section 230 states t
Read more →
List of publications in data science

This is a list of publications in data science, generally organized by order of use in a data analysis workflow. See the list of publications in statistics for more research-based and fundamental publications; while this list is more applied, business oriented, and cross-disciplinary. General article inclusion criteria are: Papers from notable practitioners or notable professors, either with a Wikipedia page or reference to their notability Common knowledge all data professionals should know, with references validating this claim Highly cited applied statistics and machine learning publications Discussion-facilitating papers on the field of data science as a whole (for example, the Attention Is All You Need paper is arguably a landmark paper that can be added here, but it is specific to generative artificial intelligence, not for all practitioners of data) Some reasons why a particular publication might be regarded as important: Topic creator – A publication that created a new topic Breakthrough – A publication that changed scientific knowledge significantly Influence – A publication which has significantly influenced the world or has had a massive impact on the teaching of data science. When possible, a reference is used to validate the inclusion of the publication in this list. == History == Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author) Author: Leo Breiman Publication data: Online version: https://projecteuclid.org/journals/statistical-science/volume-16/issue-3/Statistical-Modeling--The-Two-Cultures-with-comments-and-a/10.1214/ss/1009213726.pdf Description: Describes two cultures of statistics, one using a parsimonious and generative stochastic model, while the other is an algorithmic model with no known mechanism for how the data is generated. Breiman argues that while statistics has traditionally favored using the stochastic model, there is value in expanding the methods that statisticians can use to study phenomenon. Importance: Influence on the philosophies of statisticians right before the increased use of machine learning and deep learning methods. In a 20-year retrospective on this article, "Breiman's words are perhaps more relevant than ever". Notable statisticians at the time wrote opinion pieces about the publication. Although overall critical of the publication, David Cox writes that the publication "contains enough truth and exposes enough weaknesses to be thought-provoking." Bradley Efron commented that this publication is a "stimulating paper". Emanuel Parzen also comments about this publication that "Breiman alerts us to systematic blunders (leading to wrong conclusions) that have been committed applying current statistical practice of data modeling". Data Scientist: The Sexiest Job of the 21st Century Author: Thomas H. Davenport and DJ Patil Publication data: Online version: hbr.org/2022/07/is-data-scientist-still-the-sexiest-job-of-the-21st-century Description: Describes the new role at companies that is coined "Data scientist", what they do, how an organization might recruit one to their organization, and how to work with one effectively. Importance: This publication has been an influence on the data community as mentioned near the time it was published in 2012 by institutions like IEEE Spectrum, but also mentioned nearly a decade later asking the same question the title poses. In a retrospective response to their own publication 10 years earlier, authors Davenport and Patil have reflected that the role of a data scientist has "become better institutionalized, the scope of the job has been redefined, the technology it relies on has made huge strides, and the importance of non-technical expertise, such as ethics and change management, has grown". 50 Years of Data Science Author: David Donoho Publication data: Online version: https://www.tandfonline.com/doi/full/10.1080/10618600.2017.1384734 Description: Retrospective discussion paper on the history and origins of data science, with a number of commentary from notable statisticians. Importance: This has been described as "the first in the field to present such a comprehensive and in-depth survey and overview", and helps to define the field that has many definitions. The Composable Data Management System Manifesto Author: Pedro Pedreira, Orri Erling, Konstantinos Karanasos, Scott Schneider, Wes McKinney, Satya R Valluri, Mohamed Zait, Jacques Nadeau Publication data: Online version: https://www.vldb.org/pvldb/vol16/p2679-pedreira.pdf Description: The vision paper advocating for a paradigm shift in how data management systems are designed using standard, composable, interoperable tools rather than siloed software tools. Importance: A paradigm shifting view on how future data science software tools should be designed for more efficient workflows, the principles of which "will be especially crucial for addressing fragmentation, improving interoperability, and promoting user-centricity as data ecosystems grow increasingly complex". == Data collection and organization == Tidy Data Author: Hadley Wickham Publication data: Online version: https://www.jstatsoft.org/article/view/v059i10/ https://vita.had.co.nz/papers/tidy-data.pdf Description: Describes a framework for data cleaning that is summarized in the quote, "each variable is a column, each observation is a row, and each type of observational unit is a table". This allows a standard data structure for which data analysis tools can be consistently built around. Importance: Cited over 1,500 times, this effort for tidy data has been described by David Donoho as having "more impact on today's practice of data analysis than many highly regarded theoretical statistics articles". In the context of data visualization, this publication is said to support "efficient exploration and prototyping because variables can be assigned different roles in the plot without modifying anything about the original dataset". Data Organization in Spreadsheets Author: Karl W. Broman and Kara H. Woo Publication data: Online version: https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1375989 Description: This article offers practical recommendations for organizing data in spreadsheets, like Microsoft Excel and Google Sheets, to reduce errors and lower the barrier for later analyses due to limitations in spreadsheets or quirks in the software. Importance: Influences teaching both data and non-data practitioners to create more analysis-friendly spreadsheets, and has been described to outline "spreadsheet best practices". == Data visualizations == Quantitative Graphics in Statistics: A Brief History Author: James R. Beniger and Dorothy L. Robyn Publication data: Online version: https://www.jstor.org/stable/2683467 Description: Outlines history and evolution of quantitative graphics in statistics, going through spatial organization (17th and 18th centuries), discrete comparison (18th and 19th centuries), continuous distribution (19th century), and multivariate distribution and correlation (late 19th and 20th centuries). Importance: Helps put into perspective for learning data practitioners the recency of graphics that are used. A later publication "Graphical Methods in Statistics" by Stephen Fienberg in 1979 writes that his publication "owes much to the work of Beniger and Robyn". == Practice == Data Science for Business Author: Foster Provost and Tom Fawcett Publication data: Online version: N/A Description: Broadly outlines principles of data science and data-analytic thinking for businesses. Importance: Cited over 3,000 times, it is "highly recommended for students" but also it is also recommended due to its "relevance to senior management leaders who want to build and lead a team of data scientists and implement data science in solving complex business problems". == Tooling == Hidden Technical Debt in Machine Learning Systems Author: D. Sculley, Gary Holy, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-François Crespo, Dan Dennison Publication data: Online version: https://proceedings.neurips.cc/paper_files/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf Description: This paper argues that it is "dangerous to think of [complex machine learning] quick wins as coming for free" and overviews risk factors to account for when implementing a machine learning system. Importance: All authors worked for Google, article is cited over 2,000 times, and helped practitioners thinking about quickly implementing a machine learning tool without understanding the long-term maintenance of the tool. A few useful things to know about machine learning Author: Pedro Domingos Publication data: Online version: https://dl.acm.org/doi/10.1145/2347736.2347755 https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf Description: The purpose of this paper is to distill inaccessible "folk knowledge" to effectively implement machine learning projects because "machin
Read more →
Link-richness

Link-richness is the quality, possessed by some websites, of having many hyperlinks. Classified advertising sites like Craigslist tend to be very link-rich, sometimes with hundreds of links on their main page. They help users find the links they are looking for by grouping links into clusters. Inadequate link richness has been described as frustrating to readers, as it reduces transparency of site content from the main page. Students new to wiki collaboration were found to need guidance in how to take full advantage of the medium's potential for creating link-rich content. Link-richness in some contexts can be distracting, as when an article is surrounded by extraneous links. Indeed, it is becoming accepted as a best practice for universities to have link-rich home pages that do not rely on user categorisation and exploration of long sequences of links and are not constrained by traditional boundaries between departments. Tools are sometimes needed to make the publishing of link-rich web sites tractable, and many people may lack the technical skills, time, or inclination to engage in hand- crafting new digital document forms. A link-rich site that is low on content is sometimes referred to as a "gateway site." Link-rich portals were popular on the Web in 2000. Yahoo! and other sites featuring categories with many links were heavily used and often required fewer than three clicks to reach the content. Web designers were creating flat sites with content positioned close to the top of pages.
Read more →
Hardware backdoor

A hardware backdoor is a backdoor implemented within the physical components of a computer system, also known as its hardware. They can be created by introducing malicious code to a component's firmware, or even during the manufacturing process of an integrated circuit. Often, they are used to undermine security in smartcards and cryptoprocessors, unless investment is made in anti-backdoor design methods. They have also been considered for car hacking. Backdoors differ from hardware Trojans as backdoors are introduced intentionally by the original designer or during the design process, whereas hardware Trojans are inserted later by an external party. == Background == The existence of hardware backdoors poses significant security risks for several reasons. They are difficult to detect and are impossible to remove using conventional methods like antivirus software. They can also bypass other security measures, such as disk encryption. Hardware trojans can be introduced during manufacturing where the end-user lacks control over the production chain. == History == In 2008, the FBI reported the discovery of approximately 3,500 counterfeit Cisco network components in the United States, some of which were introduced in military and government infrastructure. In the same year, the possibility of a backdoor SPARC CPU was demonstrated with an FPGA running Linux that supported various hidden malicious services. A few years later, in 2011, Jonathan Brossard presented "Rakshasa", a proof-of-concept hardware backdoor. This backdoor could be installed by an individual with physical access to the hardware. It utilized coreboot to re-flash the BIOS with a SeaBIOS and iPXE-based bootkit composed of legitimate, open-source tools, allowing malware to be fetched from the internet during the boot process. The following year, in 2012, Sergei Skorobogatov and Christopher Woods from the University of Cambridge Computer Laboratory reported the discovery of a backdoor in a military-grade FPGA device, which could be exploited to access and modify sensitive information. It has been said that this was proven to be a software problem and not a deliberate attempt at sabotage. This still brought to attention that equipment manufacturers should ensure that microchips operate as intended. Later that year, two mobile phones developed by the Chinese company ZTE were found to carry a root access backdoor. According to security researcher Dmitri Alperovitch, the exploit used a hard-coded password in its software. Starting in 2012, the United States stated that Huawei might have backdoors present in their products. In 2013, researchers at the University of Massachusetts devised a method of breaking a CPU's internal cryptographic mechanisms by introducing specific impurities into the crystalline structure of transistors to change Intel's random-number generator. Documents revealed from 2013 onwards during the surveillance disclosures initiated by Edward Snowden showed that the Tailored Access Operations (TAO) unit and other NSA employees intercepted servers, routers, and other network gear being shipped to organizations targeted for surveillance to install covert implant firmware onto them before delivery. These tools include custom BIOS exploits that survive the reinstallation of operating systems and USB cables with spy hardware and radio transceiver packed inside. In June 2016 it was reported that University of Michigan Department of Electrical Engineering and Computer Science had built a hardware backdoor that leveraged "analog circuits to create a hardware attack" so that after the capacitors store up enough electricity to be fully charged, it would be switched on, to give an attacker complete access to whatever system or device − such as a PC − that contains the backdoored chip. In the study that won the "best paper" award at the IEEE Symposium on Privacy and Security they also note that microscopic hardware backdoor wouldn't be caught by practically any modern method of hardware security analysis, and could be planted by a single employee of a chip factory. In October 2018 Bloomberg reported that an attack by Chinese spies reached almost 30 U.S. companies, including Amazon and Apple, by compromising America's technology supply chain. == Countermeasures == Skorobogatov has developed a technique capable of detecting malicious insertions into chips. New York University Tandon School of Engineering researchers have developed a way to corroborate a chip's operation using verifiable computing whereby "manufactured for sale" chips contain an embedded verification module that proves the chip's calculations are correct and an associated external module validates the embedded verification module. Another technique developed by researchers at University College London (UCL) relies on distributing trust between multiple identical chips from disjoint supply chains. Assuming that at least one of those chips remains honest the security of the device is preserved. Researchers at the University of Southern California Ming Hsieh Department of Electrical and Computer Engineering and the Photonic Science Division at the Paul Scherrer Institute have developed a new technique called Ptychographic X-ray laminography. This technique is the only current method that allows for verification of the chips blueprint and design without destroying or cutting the chip. It also does so in significantly less time than other current methods. Anthony F. J. Levi Professor of electrical and computer engineering at University of Southern California explains “It’s the only approach to non-destructive reverse engineering of electronic chips—[and] not just reverse engineering but assurance that chips are manufactured according to design. You can identify the foundry, aspects of the design, who did the design. It’s like a fingerprint.” This method currently is able to scan chips in 3D and zoom in on sections and can accommodate chips up to 12 millimeters by 12 millimeters easily accommodating an Apple A12 chip but not yet able to scan a full Nvidia Volta GPU. "Future versions of the laminography technique could reach a resolution of just 2 nanometers or reduce the time for a low-resolution inspection of that 300-by-300-micrometer segment to less than an hour, the researchers say."
Read more →