AI Data Bias


Application permissions

Permissions are a means of controlling and regulating software's access to specific system- and device-level functions. Permission types typically cover functions that may have privacy implications, such as access to a device's hardware features (including the camera and microphone) and to personal data (such as storage devices, contact lists, and the user's current geographical location). Permissions are typically declared in an application's manifest, and certain permissions must be specifically granted at runtime by the user, who may revoke them at any time. Permission systems are common on mobile operating systems, where the permissions needed by specific apps must be disclosed via the platform's app store.

== Mobile devices ==

On mobile operating systems for smartphones and tablets, typical types of permissions regulate:

* Access to storage and personal information, such as contacts and calendar appointments.
* Location tracking.
* Access to the device's internal camera and/or microphone.
* Access to biometric sensors, including fingerprint readers and other health sensors.
* Internet access.
* Access to communications interfaces (including their hardware identifiers and signal strength where applicable, and requests to enable them), such as Bluetooth, Wi-Fi, NFC, and others.
* Making and receiving phone calls.
* Sending and reading text messages.
* The ability to perform in-app purchases.
* The ability to "overlay" themselves within other apps.
* Installing, deleting and otherwise managing applications.
* Authentication tokens (e.g., OAuth tokens) from web services stored in system storage for sharing between apps.

Prior to Android 6.0 "Marshmallow", permissions were automatically granted to apps at runtime, and they were presented upon installation in Google Play Store. Since Marshmallow, apps must request certain permissions from the user at runtime. These permissions may also be revoked at any time via Android's settings menu.
Permissions on Android are sometimes abused by app developers to gather personal information and deliver advertising. In particular, apps that use a phone's camera flash as a flashlight (which have grown largely redundant now that later versions of Android integrate this functionality at the system level) have been known to require a large array of permissions beyond what the stated functionality needs. iOS imposes a similar requirement for permissions to be granted at runtime, with particular controls offered for enabling Bluetooth, Wi-Fi, and location tracking.

== WebPermissions ==

WebPermissions is a permission system for web browsers. When a web application needs data that is protected by a permission, it must first request it. The user then sees a prompt asking them to make a choice. The choice is remembered, but can be cleared later. Currently the following resources are controlled:

* geolocation
* desktop notifications
* service workers
* sensors
* audio capturing devices, like sound cards, and their model names and characteristics
* video capturing devices, like cameras, and their identifiers and characteristics

== Analysis ==

The permission-based access control model assigns access privileges for certain data objects to applications. This is a derivative of the discretionary access control model. Access permissions are usually granted in the context of a specific user on a specific device. Permissions are granted permanently, with few automatic restrictions. In some cases permissions are implemented in an "all-or-nothing" approach: a user either grants all the required permissions or cannot use the application. There is still a lack of transparency about when a program or application uses a permission to access the data that the permission mechanism protects.
Even if a user can revoke a permission, the app can coerce the user by refusing to operate, for example by crashing or by asking the user to grant the permission again in order to use the application. The permission mechanism has been widely criticized by researchers for several reasons, including:

* Lack of transparency about personal data extraction and surveillance, including the creation of a false sense of security;
* End-user fatigue from micro-managing access permissions, leading to a fatalistic acceptance of surveillance and opacity;
* Massive data extraction and personal surveillance carried out once the permissions are granted.

Some apps, such as XPrivacy and Mockdroid, spoof data as a privacy measure. Further transparency methods include longitudinal behavioural profiling and multiple-source privacy analysis of app data access.
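The permission model described in this article (declaration in a manifest, grant at runtime, later revocation, and optional spoofing of protected data) can be illustrated with a minimal Python sketch. It is not Android's, iOS's, or any browser's actual API: the class `PermissionManager`, the permission names, and both location values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PermissionManager:
    """Toy permission model: declare in a manifest, grant at runtime, revoke later."""
    manifest: frozenset                       # permissions the app declared up front
    granted: set = field(default_factory=set)

    def request(self, permission: str, user_allows: bool) -> bool:
        # A permission that was never declared cannot be granted at all.
        if permission not in self.manifest:
            raise ValueError(f"{permission!r} is not declared in the manifest")
        # The user's choice is remembered until it is changed or revoked.
        if user_allows:
            self.granted.add(permission)
        else:
            self.granted.discard(permission)
        return user_allows

    def revoke(self, permission: str) -> None:
        self.granted.discard(permission)

    def check(self, permission: str) -> bool:
        return permission in self.granted


REAL_LOCATION = (52.52, 13.405)   # hypothetical real device location
MOCK_LOCATION = (0.0, 0.0)        # hypothetical spoofed value


def read_location(pm: PermissionManager, spoof: bool = False):
    """Return the real location only if permitted; optionally spoof instead of failing."""
    if pm.check("location"):
        return REAL_LOCATION
    if spoof:
        return MOCK_LOCATION
    raise PermissionError("location permission not granted")
```

With `spoof=True` the caller receives plausible but fake data after revocation, which is roughly the idea behind the data-spoofing tools mentioned above; without it, the call fails, mirroring the "all-or-nothing" behaviour.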
Digital history

Digital history is the use of digital media to further historical analysis, presentation, and research. It is a branch of the digital humanities and an extension of quantitative history, cliometrics, and computing. Digital history commonly takes the form either of digital public history, concerned primarily with engaging online audiences with historical content, or of digital research methods that further academic research. Digital history outputs include digital archives, online presentations, data and information visualizations, interactive maps, timelines, audio files, and virtual worlds. These outputs are designed to make historical content more accessible and to facilitate engagement with it. Recent digital history projects focus on creativity, collaboration, and technical innovation, as well as on text mining, corpus linguistics, network analysis, 3D modeling, and big data analysis. By utilizing these resources, the user can rapidly develop new analyses that can link to, extend, and bring to life existing histories.

== History ==

Rooted in earlier social science history work, particularly around the history of enslavement in the United States, early digital history in the 1960s and 70s focused on using computers to conduct quantitative analyses, primarily of demographic and social history data (censuses, election returns, city directories, and other tabular or countable data), with the aim of producing defensible research findings. These early computers could be programmed to conduct statistical analyses of these records, creating tallies or seeking trends across records. This research into historical demography was rooted in the rise of social history as a field of historical interest. The historians involved in this work sought to quantify past societies and to come to new conclusions about communities and population. Computers proved capable tools for that type of work.
By the late 1970s younger historians had turned to cultural studies; most of these studies involved online databases that were checked by professionals in Great Britain about once a year. The outpouring of quantitative studies by established scholars continued. Since then, quantitative history and cliometrics have been used primarily by historically minded economists and political scientists. In the late 1980s quantifiers founded the Association for History and Computing. This movement provided some of the impetus for the rise of digital history in the 1990s.

The more recent roots of digital history were in software rather than online networks. In 1982, the Library of Congress embarked on its Optical Disk Pilot Project, which placed text and images from its collection onto laserdiscs and CD-ROMs. The library started offering online exhibits in 1992 when it launched Selected Civil War Photographs. In 1993, Roy Rosenzweig, along with Steve Brier and Josh Brown, produced their award-winning CD-ROM Who Built America? From the Centennial Exposition of 1876 to the Great War of 1914, designed for Apple, Inc., which integrated images, text, film and sound clips in a visual interface that supported a text narrative.

Among the earliest online digital history projects were The Heritage Project of the University of Kansas, and medieval historian Dr. Lynn Nelson's World History Index and History Central Catalogue. Another was The Valley of the Shadow, conceived in 1991 by Edward L. Ayers, then at the University of Virginia and now professor of humanities and president emeritus at the University of Richmond. The Institute for Advanced Technology in the Humanities (IATH) at the University of Virginia adopted the Valley Project and partnered with IBM to collect and transcribe historical sources into digital files. The project collected data related to Augusta County in Virginia and Franklin County in Pennsylvania during the American Civil War. In 1996, William G.
Thomas III joined Ayers on the Valley Project. Together, they produced an online article entitled "The Differences Slavery Made: A Close Analysis of Two American Communities," which also appeared in The American Historical Review in 2003. A CD-ROM also accompanied the Valley Project, published by W. W. Norton and Company in 2000.

Rosenzweig, who died October 11, 2007, founded the Center for History and New Media (CHNM) at George Mason University in 1994. Today, CHNM offers several digital tools for historians, such as Zotero, Omeka and Tropy. In 1997, Ayers and Thomas used the term "digital history" when they proposed and founded the Virginia Center for Digital History (VCDH) at the University of Virginia, the earliest center devoted exclusively to history. Other institutions promoting digital history include the Center for Humane Arts, Letters, and Social Sciences Online (MATRIX) at Michigan State University, Maryland's Institute for Technology in the Humanities, and the Center for Digital Research in the Humanities at the University of Nebraska. In 2004, Emory University launched Southern Spaces, a "peer-reviewed Internet journal and scholarly forum" examining the history of the South.

== Applications ==

There are many potential benefits to the use of digital history when combined with traditional historical methods. Some of these applications include:

* Combining traditional historical methods and new research methods in order to come to new conclusions.
* Using different tools to extract and analyse larger amounts of data that would not be manageable otherwise.
* Creating models and maps of the extracted data in order to visualise it.
* Placing extracted and analysed data alongside existing historiography to increase combined historical knowledge.

By adding new research methods to existing historical methods, historians can benefit greatly from the ability to work with larger amounts of data and to develop new interpretations from it.
== Notable Projects ==

The collaborative nature of most digital history endeavors has meant that the discipline has developed primarily at institutions with the resources to sponsor content research and technical innovation. Two of the first centers, George Mason University's Center for History and New Media and the Virginia Center for Digital History at the University of Virginia, have been among the leaders in the development of digital history projects and the education of digital historians. Some of the noteworthy projects emerging from these pioneering centers are The Geography of Slavery, The Texas Slavery Project, and The Countryside Transformed at VCDH, and Liberty, Equality, Fraternity: Exploring the French Revolution and The Lost Museum at the CHNM. In each of these projects, mediated archives holding multiple types of sources are combined with digital tools to analyze and illuminate a historical question to a varying degree. This integration of content and tools with analysis is one of the hallmarks of digital history: projects move beyond archives or collections and into scholarly analysis and the use of digital tools to develop that analysis. The differences in how projects achieve this integration are a measure of the development of the field and point to the ongoing debates over what digital history can and should be.

While many of the projects at VCDH, CHNM, and other universities' centers have been geared towards academics and post-secondary education, the University of Victoria (British Columbia), in conjunction with the Université de Sherbrooke and the Ontario Institute for Studies in Education at the University of Toronto, has created a series of projects for all ages, "Great Unsolved Mysteries in Canadian History." Laden with instructional aids, this site asks teachers to introduce students to historical research methods to help them develop analytical skills and a sense of the complexities of their national history.
Issues of race, religion, and gender are addressed in carefully constructed modules that cover incidents in Canadian history from Viking exploration through the 1920s. One of the original co-creators of the project, John Lutz, has also developed Victoria's Victoria with the University of Victoria and Malaspina University-College.

In addition to Ayers, Thomas, Lutz, and Rosenzweig, numerous other individual scholars work with digital history techniques and have made, or continue to make, important contributions to the field. Robert Darnton's 2000 article, "An Early Information Society: News and the Media in Eighteenth-Century Paris", was supplemented with electronic resources and is an early model of the discussions around digital history and its future in the humanities. One of the first major digital projects to be reviewed by the American Historical Review (AHR) was Philip Ethington's "Los Angeles and the Problem of Urban Historical Knowledge", a multimedia exploration of changes to Los Angeles' physical profile over the course of several decades. In this essay, he also expresses his beliefs that historians have major power in
Digital anthropology

Digital anthropology is the anthropological study of the relationship between humans and digital-era technology. The field is new, and thus has a variety of names with a variety of emphases. These include techno-anthropology, digital ethnography, cyberanthropology, and virtual anthropology.

== Definition and scope ==

Most anthropologists who use the phrase "digital anthropology" are specifically referring to online and Internet technology. The study of humans' relationship to a broader range of technology may fall under other subfields of anthropological study, such as cyborg anthropology. The Digital Anthropology Group (DANG) is classified as an interest group in the American Anthropological Association. DANG's mission includes promoting the use of digital technology as a tool of anthropological research, encouraging anthropologists to share research using digital platforms, and outlining ways for anthropologists to study digital communities.

Cyberspace or the "virtual world" itself can serve as a "field" site for anthropologists, allowing the observation, analysis, and interpretation of the sociocultural phenomena springing up and taking place in any interactive space. National and transnational communities, enabled by digital technology, establish a set of social norms, practices, traditions, storied history and associated collective memory, migration periods, internal and external conflicts, potentially subconscious language features and memetic dialects comparable to those of traditional, geographically confined communities. This includes the various communities built around free and open-source software, online platforms such as Facebook, Twitter/X, Instagram, 4chan and Reddit and their respective sub-sites, and politically motivated groups like Anonymous, WikiLeaks, or the Occupy movement.
A number of academic anthropologists have conducted traditional ethnographies of virtual worlds, such as Bonnie Nardi's study of World of Warcraft or Tom Boellstorff's study of Second Life. Academic Gabriella Coleman has done ethnographic work on the Debian software community and the Anonymous hacktivist network. Theorist Nancy Mauro-Flude conducts ethnographic field work on computing arts and computer subcultures such as systerserver.net, part of the communities of feminist web servers and the Feminist Internet network. Eitan Y. Wilf examines the intersection of artists' creativity, digital technology, and artificial intelligence. Yongming Zhou studied how the internet is used in China to participate in politics. Eve M. Zucker and colleagues study the shift to digital memorialization of mass atrocities and the emergent role of artificial intelligence in these processes. Victoria Bernal conducted ethnographic research on the themes of nationalism and citizenship among Eritreans participating in online political engagement with their homeland.

Anthropological research can help designers adapt and improve technology. Australian anthropologist Genevieve Bell did extensive user experience research at Intel that informed the company's approach to its technology, users, and market.

== Methodology ==

=== Digital fieldwork ===

Many digital anthropologists who study online communities use traditional methods of anthropological research. They participate in online communities in order to learn about their customs and worldviews, and back their observations with private interviews, historical research, and quantitative data. Their product is an ethnography, a qualitative description of their experience and analyses. Other anthropologists and social scientists have conducted research that emphasizes data gathered by websites and servers.
However, academics often have trouble accessing user data on the same scale as social media corporations like Facebook and data mining companies like Acxiom. In terms of method, there is disagreement over whether it is possible to conduct research exclusively online or whether research is complete only when the subjects are studied holistically, both online and offline. Tom Boellstorff, who conducted three years of research as an avatar in the virtual world Second Life, defends the first approach, stating that it is not just possible but necessary to engage with subjects "in their own terms". Others, such as Daniel Miller, have argued that ethnographic research should not exclude learning about the subject's life outside the internet.

=== Digital technology as a tool of anthropology ===

The American Anthropological Association offers an online guide for students using digital technology to store and share data. Data can be uploaded to digital databases to be stored, shared, and interpreted. Text and numerical analysis software can help produce metadata, while a codebook may help organize data.

== Ethics ==

Online fieldwork poses new ethical challenges. According to the American Anthropological Association's ethics guidelines, anthropologists researching a community must make sure that all members of that community know they are being studied and have access to the data the anthropologist produces. However, many online communities' interactions are publicly available for anyone to read, and may be preserved online for years. Digital anthropologists debate the extent to which lurking in online communities and sifting through public archives is ethical. The Association also asserts that anthropologists' ability to collect and store data at all is "a privilege", and that researchers have an ethical duty to store digital data responsibly. This means protecting the identity of participants, sharing data with other anthropologists, and making backup copies of all data.
== Prominent figures ==

* Genevieve Bell is an Australian cultural anthropologist credited with pioneering the field of user experience. During her time working for Intel Corporation, Bell studied how various cultures from around the world interacted with and experienced technology. Researching and improving user experience allows companies and designers to gather data on how users use their digital products and what requires improvement or expansion.
* Tom Boellstorff is an anthropologist known for Coming of Age in Second Life: An Anthropologist Explores the Virtually Human, for which he researched how engaging in virtual worlds affects the player's sense of self.
* Gabriella Coleman is an American anthropologist concerned with the politics, ethics, and culture of hacking and online activism. Coleman's most notable ethnography features the hacktivist collective Anonymous; in it she argues that various genres of hacking exist according to the social conditions at play. Coleman is dedicated to making her ethnography accessible to a diverse audience, including academics and non-academics.
* Diana E. Forsythe was an American anthropologist of science and technology and the author of the essays featured in Studying Those Who Study Us: An Anthropologist in the World of Artificial Intelligence. She asked questions such as how humans should interact with computers and how gender roles are maintained in technology-oriented occupations.
* Heather Horst is a sociocultural anthropologist interested in the relationship between digital social relations and material culture.
* Nancy Mauro-Flude is a design anthropologist whose work explores the tacit relations between embodied cognition, computational materiality, maker culture, self-hosted webserver cooperatives, creative practice, and artistic research in digital infrastructure and Internet publishing.
* Mizuko Ito is a Japanese cultural anthropologist specializing in technology use and the intersection between computers and the social sciences. Her primary interest is in how young people utilize media technology and how it can be used to engage students in education.
* Daniel Miller is an anthropologist with a concentration in digital anthropology. His research includes the smartphone and perpetual opportunism, the intent and consequences of posting on social media in various geographical locations, and how hospice patients use media to socialize in the last stage of their lives.
* Mike Wesch is a cultural anthropologist interested in how people share their lives, cultures, and beliefs through digital media.
Digital entertainment

The digital entertainment industry includes, but is not restricted to, any combination of the following industries (which themselves overlap considerably):

* digital media
* new media
* video on demand
* video games
* interactive entertainment
* online gambling
* mobile entertainment
* social media
* streaming services

"Digital entertainment", largely a hard-to-define marketing term, rests upon entertainment technology and ultimately on the basic enabling technologies: computers, the Internet and World Wide Web, digital rights management, multimedia and streaming media. Apart from pure entertainment, the term rests upon the observation that as early as 2011 in the UK, for example, "nearly half of people's waking hours are spent using media content and communications services" ("screen time"). Digital entertainment is inextricably connected with digital marketing. People who follow influencers on social media for entertainment will receive a fair share of advertising at the same time. Digital merchandise is distributed with every computer game, and pop-up ads and similar formats are ubiquitous in the online (gaming) world.
Trello

Trello is a web-based, kanban-style list-making application developed by Atlassian. Created in 2011 by Fog Creek Software, it was spun out to form the basis of a separate company in New York City in 2014 and sold to Atlassian in January 2017.

== History ==

The name Trello is derived from the word trellis, which had been a code name for the project in its early stages. Trello was released at a TechCrunch event by Fog Creek founder Joel Spolsky. In September 2011 Wired magazine named the application one of "The 7 Coolest Startups You Haven't Heard of Yet". Lifehacker said "it makes project collaboration simple and kind of enjoyable". In 2014, it raised US$10.3 million in funding from Index Ventures and Spark Capital. Prior to its acquisition, Trello had sold 22% of its shares to investors, with the remaining shares held by founders Michael Pryor and Joel Spolsky. In May 2015, Trello expanded internationally with localized interfaces for Brazil, Germany, and Spain. In May 2016, Trello claimed it had more than 1.1 million daily active users and 14 million total signups. Also in 2016, Trello launched the Power-Up platform, allowing third-party developers to build and distribute extensions, known as Power-Ups, for Trello. Initial integrations included Zendesk, SurveyMonkey and Giphy. By January 2022 there were a total of 247 Power-Ups listed in the Power-Up directory.

On 9 January 2017, Atlassian announced its intent to acquire Trello for $425 million. The transaction was made with $360 million in cash and $65 million in shares and options. In December 2018, Trello announced its acquisition of Butler, a company that developed a leading Power-Up for automating tasks within a Trello board. Trello announced 35 million users in March 2019 and 50 million users in October 2019.
In 2020 Craig Jones, then cybersecurity operations director at Sophos, found personally identifiable information (PII) of users exposed through public Trello boards; the researcher had first tweeted about the issue in 2018. On 16 January 2024 Trello suffered a data breach when data containing over 15 million unique email addresses, names and usernames was posted on a popular hacking forum. The data was obtained by enumerating a publicly accessible resource using email addresses from previous breach corpora; on 22 January 2024 it was added to "Have I Been Pwned?", a website that collects data breaches.

== Uses ==

Users can create task boards with different columns and move tasks between them. Columns typically represent task statuses such as To Do, In Progress, and Done. The tool can be used for personal and business purposes including real estate management, software project management, school bulletin boards, lesson planning, accounting, web design, gaming, and law office case management.

== Architecture ==

According to a Fog Creek blog post in January 2012, the client was a thin web layer that downloaded the main app, written in CoffeeScript and compiled to minified JavaScript, using Backbone.js, HTML5 .pushState(), and the Mustache templating language. The server was built on top of MongoDB, Node.js and a modified version of Socket.io.

== Reception ==

On 26 January 2017, PC Magazine gave Trello a 3.5 / 5 rating, calling it "flexible" and saying that "you can get rather creative", while noting that "it may require some experimentation to figure out how to best use it for your team and the workload you manage."
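The board model described under Uses (columns that stand for task statuses, with tasks moved between them) can be sketched in a few lines of Python. This is a generic illustration of a kanban board, not Trello's data model or API; the `Board` class and its method names are hypothetical.

```python
class Board:
    """Toy kanban board: ordered columns, each holding an ordered list of cards."""

    def __init__(self, columns):
        # dict preserves insertion order, so columns keep their left-to-right order
        self.columns = {name: [] for name in columns}

    def add_card(self, column, card):
        self.columns[column].append(card)

    def move_card(self, card, source, target):
        # list.remove raises ValueError if the card is not in the source column
        self.columns[source].remove(card)
        self.columns[target].append(card)


board = Board(["To Do", "In Progress", "Done"])
board.add_card("To Do", "Write report")
board.move_card("Write report", "To Do", "In Progress")
```

After these calls the "To Do" column is empty and "In Progress" holds the single card.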
Digital data

Digital data or digital information, in information theory and information systems, is data or information represented as a string of discrete symbols, each of which can take on one of only a finite number of values from some alphabet, such as letters or digits. An example is a text document, which consists of a string of alphanumeric characters. The most common form of digital data in modern information systems is binary data, which is represented by a string of binary digits (bits), each of which can have one of two values, either 0 or 1.

Digital data can be contrasted with analog data, which is represented by a value from a continuous range of real numbers. Analog data is transmitted by an analog signal, which not only takes on continuous values but can also vary continuously with time; that is, it is a continuous real-valued function of time. An example is the air pressure variation in a sound wave. Data requires interpretation to become information. In modern (post-1960) computer systems, all data is digital.

The word digital comes from the same source as the words digit and digitus (the Latin word for finger), as fingers are often used for counting. In 1942, mathematician George Stibitz of Bell Telephone Laboratories used the word digital in reference to the fast electric pulses emitted by a device designed to aim and fire anti-aircraft guns. The term is most commonly used in computing and electronics, especially where real-world information is converted to binary numeric form, as in digital audio and digital photography.

== Symbol to digital conversion ==

Since symbols (for example, alphanumeric characters) are not continuous, representing symbols digitally is rather simpler than converting continuous or analog information to digital. Instead of sampling and quantization as in analog-to-digital conversion, such techniques as polling and encoding are used.
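The contrast drawn above between encoding discrete symbols and sampling and quantizing a continuous signal can be made concrete with a short Python sketch. The function names, the 8-bit resolution, and the eight-sample sine wave are assumptions chosen for illustration; real converters and encodings vary.

```python
import math


def text_to_bits(text: str, encoding: str = "utf-8") -> str:
    """Encode discrete symbols as a string of binary digits, 8 bits per byte."""
    return " ".join(f"{byte:08b}" for byte in text.encode(encoding))


def quantize(value: float, levels: int = 256, lo: float = -1.0, hi: float = 1.0) -> int:
    """Map a continuous value in [lo, hi] to one of `levels` discrete integer codes."""
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * (levels - 1))


# Symbols need no sampling: each character maps directly to a bit pattern.
print(text_to_bits("Hi"))   # 01001000 01101001

# An analog signal is first sampled at discrete times, then each sample is quantized.
samples = [quantize(math.sin(2 * math.pi * t / 8)) for t in range(8)]
print(samples)
```

The text conversion is exact and reversible, whereas quantization discards whatever falls between two adjacent levels.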
A symbol input device usually consists of a group of switches that are polled at regular intervals to see which switches are switched. Data will be lost if, within a single polling interval, two switches are pressed, or a switch is pressed, released, and pressed again. This polling can be done by a specialized processor in the device to prevent burdening the main CPU. When a new symbol has been entered, the device typically sends an interrupt, in a specialized format, so that the CPU can read it. For devices with only a few switches (such as the buttons on a joystick), the status of each can be encoded as bits (usually 0 for released and 1 for pressed) in a single word. This is useful when combinations of key presses are meaningful, and is sometimes used for passing the status of modifier keys on a keyboard (such as shift and control). But it does not scale to support more keys than the number of bits in a single byte or word. Devices with many switches (such as a computer keyboard) usually arrange these switches in a scan matrix, with the individual switches on the intersections of x and y lines. When a switch is pressed, it connects the corresponding x and y lines together. Polling (often called scanning in this case) is done by activating each x line in sequence and detecting which y lines then have a signal, thus which keys are pressed. When the keyboard processor detects that a key has changed state, it sends a signal to the CPU indicating the scan code of the key and its new state. The symbol is then encoded or converted into a number based on the status of modifier keys and the desired character encoding. A custom encoding can be used for a specific application with no loss of data. However, using a standard encoding such as ASCII is problematic if a symbol such as 'ß' needs to be converted but is not in the standard. 
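The three techniques in this passage (packing a few switch states into one word, scanning a switch matrix, and encoding the resulting symbol) can be sketched as follows. The button names, bit positions, and matrix size are hypothetical, and the scan is simulated in software rather than driving real x and y lines.

```python
# Hypothetical button-to-bit assignment for a small controller.
BUTTONS = {"fire": 0, "jump": 1, "start": 2}


def pack_buttons(pressed):
    """Encode the state of a few switches as bits in one word (1 = pressed)."""
    word = 0
    for name in pressed:
        word |= 1 << BUTTONS[name]
    return word


def scan_matrix(closed, x_lines, y_lines):
    """Activate each x line in turn and report which y lines carry a signal.

    `closed` is the set of (x, y) intersections whose switch is currently pressed.
    """
    detected = []
    for x in range(x_lines):
        for y in range(y_lines):
            if (x, y) in closed:
                detected.append((x, y))
    return detected


print(pack_buttons({"fire", "start"}))       # 5  (binary 101)
print(scan_matrix({(1, 2), (0, 0)}, 2, 3))   # [(0, 0), (1, 2)]

# A standard encoding fails for symbols it does not contain.
try:
    "ß".encode("ascii")
except UnicodeEncodeError:
    print("'ß' has no ASCII code")
print("ß".encode("latin-1"))                 # b'\xdf'
```

The bit-packed form reports simultaneous presses naturally but, as the text notes, is limited by the word width; the matrix scan scales to many keys at the cost of visiting each line in sequence.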
It is estimated that in 1986 less than 1% of the world's technological capacity to store information was digital, and that by 2007 the figure was already 94%. The year 2002 is assumed to be the year when humankind was able to store more information in digital than in analog format (the "beginning of the digital age").

== States ==

Digital data exists in three states: data at rest, data in transit, and data in use. Confidentiality, integrity, and availability have to be managed during the entire lifecycle, from the "birth" of the data to its destruction.

=== Data at rest ===

Data at rest in information technology means data that is housed physically on computer data storage in any digital form (e.g. cloud storage, file hosting services, databases, data warehouses, spreadsheets, archives, tapes, off-site or cloud backups, mobile devices etc.). Data at rest includes both structured and unstructured data. This type of data is subject to threats from hackers and other malicious actors seeking to gain access to the data digitally or through physical theft of the storage media. To prevent this data from being accessed, modified or stolen, organizations will often employ security protection measures such as password protection, data encryption, or a combination of both. The security options used for this type of data are broadly referred to as data-at-rest protection (DARP).

Definitions include:

* "...all data in computer storage while excluding data that is traversing a network or temporarily residing in computer memory to be read or updated."
* "...all data in storage but excludes any data that frequently traverses the network or that which resides in temporary memory. Data at rest includes but is not limited to archived data, data which is not accessed or changed frequently, files stored on hard drives, USB thumb drives, files stored on backup tape and disks, and also files stored off-site or on a storage area network (SAN)."

While it is generally accepted that archive data (i.e.
which never changes), regardless of its storage medium, is data at rest and active data subject to constant or frequent change is data in use. “Inactive data” could be taken to mean data which may change, but infrequently. The imprecise nature of terms such as “constant” and “frequent” means that some stored data cannot be comprehensively defined as either data at rest or in use. These definitions could be taken to assume that Data at Rest is a superset of data in use; however, data in use, subject to frequent change, has distinct processing requirements from data at rest, whether completely static or subject to occasional change. ==== Security ==== Because of its nature data at rest is of increasing concern to businesses, government agencies and other institutions. Mobile devices are often subject to specific security protocols to protect data at rest from unauthorized access when lost or stolen and there is an increasing recognition that database management systems and file servers should also be considered as at risk; the longer data is left unused in storage, the more likely it might be retrieved by unauthorized individuals outside the network. Data encryption, which prevents data visibility in the event of its unauthorized access or theft, is commonly used to protect data in motion and increasingly promoted for protecting data at rest. The encryption of data at rest should only include strong encryption methods such as AES or RSA. Encrypted data should remain encrypted when access controls such as usernames and password fail. Increasing encryption on multiple levels is recommended. Cryptography can be implemented on the database housing the data and on the physical storage where the databases are stored. Data encryption keys should be updated on a regular basis. Encryption keys should be stored separately from the data. Encryption also enables crypto-shredding at the end of the data or hardware lifecycle. 
Periodic auditing of sensitive data should be part of policy and should occur on a regular schedule. Finally, only the minimum possible amount of sensitive data should be stored. Tokenization is a non-mathematical approach to protecting data at rest that replaces sensitive data with non-sensitive substitutes, referred to as tokens, which have no extrinsic or exploitable meaning or value. This process does not alter the type or length of data, which means it can be processed by legacy systems such as databases that may be sensitive to data length and type. Tokens require significantly less computational resources to process and less storage space in databases than traditionally encrypted data. This is achieved by keeping specific data fully or partially visible for processing and analytics while sensitive information is kept hidden. Lower processing and storage requirements make tokenization an ideal method of securing data at rest in systems that manage large volumes of data. A further method of preventing unwanted access to data at rest is the use of data federation, especially when data is distributed globally (e.g. in off-shore archives). An example of this would be a European organisation which stores its archived data off-site in the US. Under the terms of the USA PATRIOT Act the American authorities can demand
HtmlUnit

HtmlUnit is a headless web browser written in Java. It allows high-level manipulation of websites from other Java code, including filling and submitting forms and clicking hyperlinks. It also provides access to the structure and the details within received web pages. HtmlUnit emulates parts of browser behaviour including the lower-level aspects of TCP/IP and HTTP. A sequence such as getPage(url), getLinkWith("Click here"), click() allows a user to navigate through hypertext and obtain web pages that include HTML, JavaScript, Ajax and cookies. This headless browser can deal with HTTPS security, basic HTTP authentication, automatic page redirection and other HTTP headers. It allows Java test code to examine returned pages either as text, an XML DOM, or as collections of forms, tables, and links. The goal is to simulate real browsers, namely Chrome, Firefox and Edge. The most common use of HtmlUnit is test automation of web pages, but it can also be used for web scraping or downloading website content. == Benefits == It provides a high-level API, hiding lower-level details from the user. Compared to other WebDriver implementations, HtmlUnitDriver is the fastest to implement. It can be configured to simulate a specific browser. == Drawbacks == Element layout and rendering cannot be tested. JavaScript support is incomplete and is one of the areas of ongoing enhancement. == Used technologies == W3C DOM; HTTP connection, using Apache HttpComponents; JavaScript, using a forked Rhino; HTML parsing, using NekoHTML; CSS, using CSS Parser; XPath support, using Xalan. == Libraries using HtmlUnit == Selenium WebDriver; Spring MVC Test Framework; Google Web Toolkit tests; WebTest; Wetator.
GeForce RTX 50 series

The GeForce RTX 50 series of consumer graphics cards is the successor of Nvidia's GeForce 40 series. Announced at CES 2025, it debuted with the release of the RTX 5070, RTX 5080 and RTX 5090 in January 2025. It is based on Nvidia's Blackwell architecture featuring Nvidia RTX's fourth-generation RT cores for hardware-accelerated real-time ray tracing, and fifth-generation deep learning–focused Tensor Cores. The GPUs are manufactured by TSMC on a custom 4N process node. == Background == In March 2024, Nvidia announced the Blackwell architecture for its datacenter products. Like Ampere, the architecture is shared by consumer and datacenter products rather than having distinct architectures released simultaneously like Ada Lovelace for consumers and Hopper for datacenter. At the Game Awards in December 2024, a cinematic trailer for The Witcher IV was shown that had been pre-rendered on an "unannounced Nvidia GeForce RTX GPU". This was assumed to be an upcoming GeForce RTX 50 series GPU. Following the RTX 50 series announcement, Nvidia confirmed that the trailer was "pre-rendered in Unreal Engine 5 on a GeForce RTX 5090". Later in the same month, it was reported that Nvidia had begun stockpiling GeForce RTX 50 series units in U.S. warehouses due to a threatened 10% import tariff and 60% tariff on Chinese imports that Donald Trump promised in his re-election campaign. === Announcement === On January 6, 2025, the GeForce RTX 50 series was officially announced for desktop and mobile devices during Nvidia's CES keynote in Las Vegas. The pricing announcement was met with surprise as the RTX 5080 at $999 was the same price that the RTX 4080 Super released at a year earlier despite the anticipated price increases. Nvidia CEO Jensen Huang falsely claimed that the RTX 5070 could reach "RTX 4090 performance at $549", a figure that relies on the use of DLSS 4 upscaling and Multi Frame generation, and is not an indication of raw performance. 
== Features == === Blackwell architecture === The GeForce RTX 50 series is powered by the Blackwell microarchitecture, which continues Ada Lovelace's emphasis on high graphics frequencies and large L2 caches. The Blackwell architecture introduces Nvidia RTX's fourth-generation RT cores for hardware-accelerated real-time ray tracing and fifth-generation Tensor Cores for AI compute and performing floating-point calculations. === GDDR7 === RTX 50 series GPUs are the first consumer GPUs to feature GDDR7 video memory for greater memory bandwidth over the same bus width compared to the GDDR6 and GDDR6X memory used in the GeForce 40 series. RTX 50 series desktop GPUs use GDDR7 modules from Samsung due to them being available for validation earlier than modules from SK Hynix and Micron. === 12V-2×6 connector === The GeForce RTX 50 series uses the 16-pin 12V-2×6 connector, which is a revision of the 12VHPWR connector featured on the GeForce 40 series. There were problems with the 12VHPWR connector melting on some RTX 4090 GPUs due to the connector not being fully seated and connector design flaws that did not implement a high enough safety and error tolerance. The 12V-2×6 connector revision, published by PCI-SIG in July 2023, addressed this by shortening the four sense pins so the connector will not push any power if it has not been fully seated. The 12VHPWR design would still draw up to 150W of power even if the sense pins were not making full contact. 12V-2×6 is backwards compatible with existing 12VHPWR cables and adapters. Nvidia has mandated to its AIB partners that the 16-pin 12V-2×6 connector be used on all RTX 50 series designs. With the GeForce 40 series, the 12VHPWR connector was only mandated on higher power cards such as the RTX 4070 Super, RTX 4070 Ti, RTX 4070 Ti Super, RTX 4080, RTX 4080 Super and RTX 4090 while RTX 4060, RTX 4060 Ti and RTX 4070 AIB designs had the option of using 8-pin PCIe connectors. 
The 600W-capable 12VHPWR connector would not have been necessary on sub-200W cards. === DLSS 4 === The fourth generation of Deep Learning Super Sampling (DLSS) was unveiled alongside the RTX 50 series. DLSS 4 upscaling uses a new vision transformer-based model for enhanced image quality with reduced ghosting and greater image stability in motion compared to the previous convolutional neural network (CNN) model. DLSS 4 also allows a greater number of frames to be generated and interpolated based on a single traditionally rendered frame. This form of frame generation called Multi Frame Generation is exclusive to the RTX 50 series while the GeForce 40 series is limited to one interpolated frame per traditionally rendered frame. Nvidia claims that DLSS 4's frame generation model uses 30% less video memory with the example of Warhammer 40,000: Darktide using 400 MB less memory at 4K resolution with frame generation enabled. Nvidia claims that 75 titles will integrate DLSS 4 Multi Frame Generation at launch, including Alan Wake 2, Cyberpunk 2077, Indiana Jones and the Great Circle, and Star Wars Outlaws. === Media Engine and I/O === The RTX 50 series includes DisplayPort 2.1b UHBR20 (80Gbps) with higher display output data rates to support high resolution and high refresh rate displays. The GeForce 40 series received criticism for only including DisplayPort 1.4a (32Gbps) while the competing Radeon RX 7000 series included DisplayPort 2.1 UHBR13.5 (54Gbps). At CES 2025, VESA announced a collaboration with Nvidia on the new DP80LL ("low loss") UHBR20 active cable standard. DP80LL allows for 80Gbps DisplayPort 2.1 cables up to 3 meters long as passive DP80 cables are limited in length due to signal integrity concerns. The RTX 50 series introduces the ninth-generation NVENC encoder and sixth-generation NVDEC video decoder. For the first time in a consumer GeForce GPU, encoding and decoding video in the 4:2:2 color format for professional-grade higher color depth is supported. 
== List of GPUs == === Desktop === GeForce RTX 50 series desktop GPUs are the second consumer GPUs to utilize a PCIe 5.0 interface and the first to feature GDDR7 video memory (except for the entry-level RTX 5050, which still uses GDDR6). They are fabricated by TSMC using a custom 5 nm process dubbed 4N. === Mobile === Laptops featuring GeForce RTX 50 series laptop GPUs were shown at CES 2025. Laptops with RTX 50 series GPUs were paired with Intel's Arrow Lake-HX and AMD's Strix Point and Fire Range CPUs. Nvidia claims that the Blackwell architecture's new Max-Q features can increase battery life by up to 40% over GeForce 40 series laptops. For example, Advanced Power Gating saves power by turning off areas of the GPU that are unused, and the paired GDDR7 memory can run in an "ultra" low-voltage state. Initial RTX 50 series laptops will become available in March 2025 starting at $1,299. == Controversies == === 12V-2×6 power connector issue === The 12V-2×6 connector used by multiple RTX 5090 cards has faced criticism due to a design flaw that can potentially cause the connector to melt. The flaw primarily affects Nvidia's own RTX 5090 FE and RTX 5080 FE cards and is similar to the failures seen on the RTX 40 series, but models by third-party OEMs have been affected as well. === Availability and pricing === The releases of the RTX 5090, 5080 and 5070 Ti were marked by severe availability issues and pricing well above MSRP. Pricing became an issue again at the end of 2025 due to an ongoing memory supply shortage. Nvidia has been rumored to cut production of 16GB VRAM cards, affecting the availability of the RTX 5060 Ti 16GB and RTX 5070 Ti SKUs.
=== 32-bit support removal for CUDA, OpenCL, and GPU PhysX === Support for 32-bit OpenCL and CUDA applications (and as a result 32-bit GPU-accelerated PhysX) was dropped for the GeForce RTX 50 series. As a result, several applications encountered performance issues with GPU PhysX options or could not run at all, causing negative reactions from numerous gaming communities. On December 4, 2025, with the release of driver version 591.44, 32-bit GPU-accelerated PhysX support was restored for certain games. Support for more games was promised in the future. === Incomplete dies and missing ROPs === The dies of certain RTX 5090/5090D, 5080, and 5070 Ti cards were missing eight render output units (ROPs), resulting in slower graphics, while pure compute and AI workloads are unaffected. Nvidia claimed that less than 0.5% of cards are affected and that the "production anomaly" has been rectified. === Black screen issues === Some RTX 5080 and 5090 users reported an issue where the system would boot into a black screen after installing Nvidia drivers. Nvidia confirmed the issue and said that a new driver update would fix it for people who hadn't received a VBIOS update yet. Nvidia driver version 572.60, released on February 27, 2025, claims to have fixed the issue. Nvidia has since released multiple hotfix and Game Ready drivers that contain additional fixes for the issue. === Windows driver branch quality and stabilit
Cross-validation (statistics)

Cross-validation, sometimes called rotation estimation or out-of-sample testing, is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. Cross-validation includes resampling and sample splitting methods that use different portions of the data to test and train a model on different iterations. It is often used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice. It can also be used to assess the quality of a fitted model and the stability of its parameters. In a prediction problem, a model is usually given a dataset of known data on which training is run (training dataset), and a dataset of unknown data (or first seen data) against which the model is tested (called the validation dataset or testing set). The goal of cross-validation is to test the model's ability to predict new data that was not used in estimating it, in order to flag problems like overfitting or selection bias and to give an insight on how the model will generalize to an independent dataset (i.e., an unknown dataset, for instance from a real problem). One round of cross-validation involves partitioning a sample of data into complementary subsets, performing the analysis on one subset (called the training set), and validating the analysis on the other subset (called the validation set or testing set). To reduce variability, in most methods multiple rounds of cross-validation are performed using different partitions, and the validation results are combined (e.g. averaged) over the rounds to give an estimate of the model's predictive performance. In summary, cross-validation combines (averages) measures of fitness in prediction to derive a more accurate estimate of model prediction performance. == Motivation == Assume a model with one or more unknown parameters, and a data set to which the model can be fit (the training data set). 
The fitting process optimizes the model parameters to make the model fit the training data as well as possible. If an independent sample of validation data is taken from the same population as the training data, it will generally turn out that the model does not fit the validation data as well as it fits the training data. The size of this difference is likely to be large especially when the size of the training data set is small, or when the number of parameters in the model is large. Cross-validation is a way to estimate the size of this effect. === Example: linear regression === In linear regression, there exist real response values $y_1, \ldots, y_n$ and n p-dimensional vector covariates $\mathbf{x}_1, \ldots, \mathbf{x}_n$. The components of the vector $\mathbf{x}_i$ are denoted $x_{i1}, \ldots, x_{ip}$. If least squares is used to fit a function in the form of a hyperplane $\hat{y} = a + \boldsymbol{\beta}^T \mathbf{x}$ to the data $(\mathbf{x}_i, y_i)_{1 \le i \le n}$, then the fit can be assessed using the mean squared error (MSE). The MSE for given estimated parameter values a and β on the training set $(\mathbf{x}_i, y_i)_{1 \le i \le n}$ is defined as:

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - a - \boldsymbol{\beta}^T \mathbf{x}_i)^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - a - \beta_1 x_{i1} - \dots - \beta_p x_{ip})^2$$

If the model is correctly specified, it can be shown under mild assumptions that the expected value of the MSE for the training set is (n − p − 1)/(n + p + 1) < 1 times the expected value of the MSE for the validation set (the expected value is taken over the distribution of training sets). Thus, a fitted model and computed MSE on the training set will result in an optimistically biased assessment of how well the model will fit an independent data set.
This biased estimate is called the in-sample estimate of the fit, whereas the cross-validation estimate is an out-of-sample estimate. Since in linear regression it is possible to directly compute the factor (n − p − 1)/(n + p + 1) by which the training MSE underestimates the validation MSE under the assumption that the model specification is valid, cross-validation can be used for checking whether the model has been overfitted, in which case the MSE in the validation set will substantially exceed its anticipated value. (Cross-validation in the context of linear regression is also useful in that it can be used to select an optimally regularized cost function.) === General case === In most other regression procedures (e.g. logistic regression), there is no simple formula to compute the expected out-of-sample fit. Cross-validation is, thus, a generally applicable way to predict the performance of a model on unavailable data using numerical computation in place of theoretical analysis. == Types == Two types of cross-validation can be distinguished: exhaustive and non-exhaustive cross-validation. === Exhaustive cross-validation === Exhaustive cross-validation methods are cross-validation methods which learn and test on all possible ways to divide the original sample into a training and a validation set. ==== Leave-p-out cross-validation ==== Leave-p-out cross-validation (LpO CV) involves using p observations as the validation set and the remaining observations as the training set. This is repeated for all ways to split the original sample into a validation set of p observations and a training set. LpO cross-validation requires training and validating the model $C_p^n$ times, where n is the number of observations in the original sample and $C_p^n$ is the binomial coefficient. For p > 1 and for even moderately large n, LpO CV can become computationally infeasible. For example, with n = 100 and p = 30, $C_{30}^{100} \approx 3 \times 10^{25}$. A variant of LpO cross-validation with p = 2, known as leave-pair-out cross-validation, has been recommended as a nearly unbiased method for estimating the area under the ROC curve of binary classifiers. ==== Leave-one-out cross-validation ==== Leave-one-out cross-validation (LOOCV) is a particular case of leave-p-out cross-validation with p = 1. The process looks similar to jackknife; however, with cross-validation one computes a statistic on the left-out sample(s), while with jackknifing one computes a statistic from the kept samples only. LOO cross-validation requires less computation time than LpO cross-validation because there are only $C_1^n = n$ passes rather than $C_p^n$. However, n passes may still require quite a large computation time, in which case other approaches such as k-fold cross-validation may be more appropriate. Pseudo-code algorithm:

Input:
  x, {vector of length N with x-values of incoming points}
  y, {vector of length N with y-values of the expected result}
  interpolate(x_in, y_in, x_out), {returns the estimation for point x_out after the model is trained with x_in-y_in pairs}
Output:
  err, {estimate for the prediction error}
Steps:
  err ← 0
  for i ← 1, ..., N do
    // define the cross-validation subsets
    x_in ← (x[1], ..., x[i − 1], x[i + 1], ..., x[N])
    y_in ← (y[1], ..., y[i − 1], y[i + 1], ..., y[N])
    x_out ← x[i]
    y_out ← interpolate(x_in, y_in, x_out)
    err ← err + (y[i] − y_out)^2
  end for
  err ← err/N

=== Non-exhaustive cross-validation === Non-exhaustive cross-validation methods do not compute all ways of splitting the original sample. These methods are approximations of leave-p-out cross-validation. ==== k-fold cross-validation ==== In k-fold cross-validation, the original sample is randomly partitioned into k equal-sized subsamples, often referred to as "folds".
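As a rough illustration of partitioning a sample into folds, the following Python sketch splits the observation indices into k folds and averages the squared validation errors over all held-out observations. The function names are hypothetical, and a one-covariate least-squares line is assumed as a stand-in for the model being validated; setting k equal to the number of observations gives the leave-one-out procedure from the pseudo-code above.

```python
import random
from typing import List, Sequence, Tuple


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle the indices 0..n-1 and deal them into k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0  # fall back to a flat line if x is constant
    return mean_y - slope * mean_x, slope


def cross_validated_mse(xs: Sequence[float], ys: Sequence[float],
                        k: int, seed: int = 0) -> float:
    """Average squared error on held-out points across the k folds."""
    squared_errors: List[float] = []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        # Train on every observation outside the current fold.
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        a, b = fit_line(train_x, train_y)
        # Validate on the fold that was left out.
        squared_errors.extend((ys[i] - (a + b * xs[i])) ** 2 for i in fold)
    return sum(squared_errors) / len(squared_errors)
```

Calling `cross_validated_mse(xs, ys, k=10)` gives a 10-fold estimate, while `k=len(xs)` performs leave-one-out. Each observation is held out exactly once, so the result is an average over all n squared errors.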
Of the k subsamples, a single subsample is retained as the validation data for testing the model, and the remaining k − 1 subsamples are used as training data. The cross-validation process is then repeated k times, with each of the k subsamples used exactly once as the validation data. The k results can then be averaged to produce a single estimation. The advantage of this method over repeated random sub-sampling (see below) is that all observations are used for both training and validation, and each observation is used for validation exactly once. 10-fold cross-validation is commonly used, but in general k remains an unfixed parameter. For example, setting k = 2 results in 2-fold cross-validation. In 2-fold cross-validation, the dataset is randomly shuffled into two sets d0 and d1, so that both sets are equal size (this is usually implemented by shuffling the data array and then splitting it in two). We then train on d0 and validate on d1, followed by training on d1 and validating on d0. When k = n (the number of observations), k-fold cross-validation is equivalent to leave-one-out cr
Atomtronics

Atomtronics is an emerging field concerning the quantum technology of matter-wave circuits which coherently guide propagating ultra-cold atoms. The systems typically include components analogous to those found in electronics, quantum electronics or optical systems, such as beam splitters, transistors, and atomic counterparts of superconducting quantum interference devices (SQUIDs). Applications range from studies of fundamental physics to the development of practical devices such as quantum superfluids for the computation of large models for artificial general intelligence. == Etymology == Atomtronics is a portmanteau of "atom" and "electronics", in reference to the creation of atomic analogues of electronic components, such as transistors and diodes, and also electronic materials such as semiconductors. The field itself has considerable overlap with atom optics and quantum simulation, and is not strictly limited to the development of electronic-like components. However, this field develops into the research of ultra-cold atoms for the applied research implications of computations in quantum science. == Methodology == Three major elements are required for an atomtronic circuit. The first is a Bose-Einstein condensate, which is needed for its coherent and superfluid properties, although an ultracold Fermi gas may also be used for certain applications. The second is a tailored trapping potential, which can be generated optically, magnetically, or using a combination of both. The final element is a method to induce the movement of atoms within the potential, which can be achieved in several ways, for various research advancements around fields not limited to distributed computing, supercomputing, and quantum computing. For example, a transistor-like atomtronic circuit may be realized by a ring-shaped trap divided into two by two moveable weak barriers, with the two separate parts of the ring acting as the drain and the source and the barriers acting as the gate. 
As the barriers move, atoms flow from the source to the drain. It is now possible to coherently guide matter waves over distances of up to 40 cm in ring-shaped atomtronic matter-wave guides. == Applications == The field of atomtronics is still nascent and any schemes realized thus far are proof-of-concept. Applications include gravimetry, rotational sensing via the Sagnac effect, and quantum computing. Obstacles to the development of practical sensing devices are largely due to the technical challenges of creating Bose-Einstein condensates, which require bulky lab-based setups that are not easily transported. However, creating portable experimental setups is an active area of research.
Algorithmic radicalization

Algorithmic radicalization is the concept that recommender algorithms on popular social media sites, such as YouTube and Facebook, drive users toward progressively more extreme content over time, leading to the development of radicalized extremist political views. Algorithms meticulously record user interactions, encompassing likes, dislikes and the duration of time watching content, with the objective of generating an endless stream of media designed to sustain user engagement. The phenomenon of echo chamber channels has been demonstrated to exacerbate the polarization of consumers, primarily through the reinforcement of media preferences and the validation of one's existing beliefs. Algorithmic radicalization remains a controversial phenomenon as it is often not in the best interest of social media companies to remove echo chamber channels. To what extent recommender algorithms are actually responsible for radicalization remains disputed. Studies have found contradictory results regarding the promotion of extremist content by algorithms. == Social media echo chambers and filter bubbles == Social media platforms learn the interests and likes of the user to modify their experiences in their feed to keep them engaged and scrolling, known as a filter bubble. An echo chamber is formed when users come across beliefs that magnify or reinforce their thoughts and form a group of like-minded users in a closed system. Echo chambers spread information without any opposing beliefs and can possibly lead to confirmation bias. According to group polarization theory, an echo chamber can potentially lead users and groups towards more extreme radicalized positions. According to the National Library of Medicine, "Users online tend to prefer information adhering to their worldviews, ignore dissenting information, and form polarized groups around shared narratives. Furthermore, when polarization is high, misinformation quickly proliferates." 
== By site == === Facebook === Facebook's algorithm focuses on recommending content that makes the user want to interact. They rank content by prioritizing popular posts by friends, viral content, and sometimes divisive content. Each feed is personalized to the user's specific interests which can sometimes lead users towards an echo chamber of troublesome content. Users can find their list of interests the algorithm uses by going to the "Your ad Preferences" page. According to a Pew Research study, 74% of Facebook users did not know that list existed until they were directed towards that page in the study. It is also relatively common for Facebook to assign political labels to their users. In recent years, Facebook has started using artificial intelligence to change the content users see in their feed and what is recommended to them. A document known as The Facebook Files has revealed that their AI system prioritizes user engagement over everything else. The Facebook Files has also demonstrated that controlling the AI systems has proven difficult to handle. In an August 2019 internal memo leaked in 2021, Facebook has admitted that "the mechanics of our platforms are not neutral", concluding that in order to reach maximum profits, optimization for engagement is necessary. In order to increase engagement, algorithms have found that hate, misinformation, and politics are instrumental for app activity. As referenced in the memo, "The more incendiary the material, the more it keeps users engaged, the more it is boosted by the algorithm." According to a 2018 study, "false rumors spread faster and wider than true information... They found falsehoods are 70% more likely to be retweeted on Twitter than the truth, and reach their first 1,500 people six times faster. This effect is more pronounced with political news than other categories." === YouTube === YouTube has been around since 2005 and has more than 2.5 billion monthly users. 
YouTube discovery content systems focus on the user's personal activity (watched, favorites, likes) to direct them to recommended content. YouTube's algorithm is accountable for roughly 70% of users' recommended videos and what drives people to watch certain content. According to a 2022 study by the Mozilla Foundation, users have little power to keep unsolicited videos out of their suggested recommended content. This includes videos about hate speech, livestreams, etc. YouTube has been identified as an influential platform for spreading radicalized content. Al-Qaeda and similar extremist groups have been linked to using YouTube for recruitment videos and engaging with international media outlets. In a research study published by the American Behavioral Scientist Journal, they researched "whether it is possible to identify a set of attributes that may help explain part of the YouTube algorithm's decision-making process". The results of the study showed that YouTube's algorithm recommendations for extremism content factor into the presence of radical keywords in a video's title. In February 2023, in the case of Gonzalez v. Google, the question at hand is whether or not Google, the parent company of YouTube, is protected from lawsuits claiming that the site's algorithms aided terrorists in recommending ISIS videos to users. Section 230 is known to generally protect online platforms from civil liability for the content posted by its users. Multiple studies have found little to no evidence to suggest that YouTube's algorithms direct attention towards far-right content to those not already engaged with it. === TikTok === TikTok is a platform that recommends videos to a user's 'For You Page' (FYP), making every users' page different. With the nature of the algorithm behind the app, TikTok's FYP has been linked to showing more explicit and radical videos over time based on users' previous interactions on the app. 
Since TikTok's inception, the app has been scrutinized for misinformation and hate speech as those forms of media usually generate more interactions to the algorithm. Various extremist groups, including jihadist organizations, have utilized TikTok to disseminate propaganda, recruit followers, and incite violence. The platform's algorithm, which recommends content based on user engagement, can expose users to extremist content that aligns with their interests or interactions. As of 2022, TikTok's head of US Security has put out a statement that "81,518,334 videos were removed globally between April – June for violating our Community Guidelines or Terms of Service" to cut back on hate speech, harassment, and misinformation. Studies have noted instances where individuals were radicalized through content encountered on TikTok. For example, in early 2023, Austrian authorities thwarted a plot against an LGBTQ+ pride parade that involved two teenagers and a 20-year-old who were inspired by jihadist content on TikTok. The youngest suspect, 14 years old, had been exposed to videos created by Islamist influencers glorifying jihad. These videos led him to further engagement with similar content, eventually resulting in his involvement in planning an attack. Another case involved the arrest of several teenagers in Vienna, Austria, in 2024, who were planning to carry out a terrorist attack at a Taylor Swift concert. The investigation revealed that some of the suspects had been radicalized online, with TikTok being one of the platforms used to disseminate extremist content that influenced their beliefs and actions. == Self-radicalization == The U.S. Department of Justice defines 'Lone-wolf' (self) terrorism as "someone who acts alone in a terrorist attack without the help or encouragement of a government or a terrorist organization". Through social media outlets on the internet, 'Lone-wolf' terrorism has been on the rise, being linked to algorithmic radicalization. 
Through echo chambers on the internet, viewpoints typically seen as radical are accepted and quickly adopted by other extremists. These viewpoints are encouraged by forums, group chats, and social media, which reinforce their beliefs. == References in media == === The Social Dilemma === The Social Dilemma is a 2020 docudrama about how the algorithms behind social media enable addiction, while possessing abilities to manipulate people's views, emotions, and behavior to spread conspiracy theories and disinformation. The film repeatedly uses buzzwords such as 'echo chambers' and 'fake news' to argue that psychological manipulation on social media leads to political manipulation. In the film, Ben falls deeper into a social media addiction as the algorithm finds that his social media page has a 62.3% chance of long-term engagement. This leads to more videos on Ben's recommended feed, and he eventually becomes more immersed in propaganda and conspiracy theories, becoming more polarized with each video. == Proposed solutions == === United States: Weakening Section 230 protections === In the Communications Decency Act, Section 230 states t
Ethiopian feminists facing digital gender-based violence

Against a background of traditional views of women, rising internet use, a young population and an unsafe offline life, women and girls in Ethiopia are facing increasing amounts of digital violence. Some women, feeling endangered, have left the country as a result. Researchers, activists and lawyers have called for online content to be taken down and specific digital legislation to be drafted and enforced. == Online violence and its offline effects == Sexual violence against women and girls in Ethiopia is common. In the 2023 Women, Peace and Security Index by Georgetown University, Ethiopia ranked 146th out of 177 countries. Over several years, online harassment of and violence against women and girls in Ethiopia has increased. It can range from sexist remarks about appearance and women’s role in society, to revenge porn and threats of beating, acid attacks, abduction, rape or death. The real-life effects of these attacks on women and girls can include mental health problems, damaged reputations and a withdrawal from public and economic life. When the online attacks migrate to the real world, for example when online attackers find out where the targeted women and girls live, this can result in physical attacks, street harassment and threats to children, and can cause victims to move house or job or even flee the country in fear of femicide. In a country that criminalises homosexuality, it can also lead to physical attacks on LGBTQI+ people in particular and indeed on anybody labelled as homosexual. == Research studies == The Centre for Information Resilience (CIR) conducted interviews with Ethiopian women who hold public roles or are active online. The centre published a report on this in 2024 entitled ‘Silenced, Shamed and Threatened’. 
They found that technology-facilitated gender-based violence (TFGBV) had become “normalised to the point of invisibility.” In 2024, CIR also published an analysis of gendered hate speech on social media in Ethiopia called ‘Normalised and invisible.’ It is thought that traditional views of women, the young population, the rise in internet use and the war in Tigray, when sexual violence was used as a weapon of war by Ethiopian and Eritrean soldiers, have all helped to create an online environment in which even femicide is considered unremarkable. AFP Fact Check collaborated with Deutsche Welle Akademie to investigate the cyber harassment of women in Ethiopia, analysing misogynistic posts published on TikTok and Facebook. They discovered disparaging remarks about women’s physical appearance, threats of acid attacks and other physical violence, and the public sharing of women’s phone numbers. == Individuals affected == Women in particular jeopardy of digital gender-based violence are feminists, activists, politicians and those with a public profile. Some women are known to have fled Ethiopia fearing for their lives after online and offline threats. Yordanos Bezabih, an Ethiopian women’s rights activist, started a campaign with the hashtag #JusticeforHeaven to fight against gender-based cyberspace violence. As a result, she herself became a target. She experienced years of online threats of acid attacks, gang-rape and death. In 2025, subscribers to an online community organised a search for her address. Deepfake nude images of her were shared, she was filmed in real life, her house and online accounts were broken into, and her private photos and messages were posted on social media. When the attackers finally circulated her address, suggesting that she be executed, she left Ethiopia on a human rights defender scholarship. 
In 2023, Lella Misikir helped to start a campaign, called ‘My Whistle, My Voice’, that suggested women carry whistles and use them if they were harassed in the street. A TikTok video of the campaign became popular. Shortly after, videos of Misikir were circulated suggesting that she was gay. Her online attackers next searched for her address. In November 2024, Misikir left the country. == Legal issues == Ethiopia has some laws on online harassment and defamation, for example the Computer Crimes Proclamation. However, technology-facilitated gender-based violence (TFGBV), such as deepfakes, non-consensual image sharing, and coordinated harassment, is not explicitly recognized as a crime. In practice, too, women are often not believed or taken seriously when reporting such violence. Police advice is often that the women affected should simply leave the online space. Social media platforms can remove content when it is brought to their attention, but the offenders are not banned. Users can only block them.
Canva

Canva Pty Ltd. is an Australian multinational proprietary software company, launched in 2013 and based in Sydney, Australia. It provides a graphic design platform for creating visual content for presentations, websites, and other digital products. Its features include templates for presentations, posters, and social media content, as well as photo and video editing functionality. The platform uses a drag-and-drop interface designed for users without professional design training or experience. Canva operates on a freemium model and has added features such as print services and video editing tools over time. == History == === 2013–2020 === Canva was founded in Perth, Australia, by Melanie Perkins, Cliff Obrecht and Cameron Adams on 1 January 2013. One of the company's early investors was Susan Wu, an American entrepreneur. In its first year, Canva had more than 750,000 users. In 2017, the company reached profitability and had 294,000 paying customers. In January 2018, Perkins announced that the company had raised A$40 million from Sequoia Capital, Blackbird Ventures, and Felicis Ventures, and the company was valued at A$1 billion. It raised A$70 million in May 2019, followed by A$85 million in October 2019 and the launch of Canva for Enterprise. In December 2019, Canva announced Canva for Education, a free product for schools and other educational institutions intended to facilitate collaboration between students and teachers. === 2021–2025 === In June 2020, Canva announced a partnership with FedEx Office and with Office Depot the following month. As of June 2020, Canva's valuation had risen to A$6 billion, rising to A$40 billion by September 2021. In September 2021, Canva raised US$200 million, with its value peaking that year at US$40 billion. By September 2022, the valuation of the company had leveled at US$26 billion. While Canva's value declined from its 2021 peak by mid-2022, it remained one of Australia's most prominent technology companies, alongside Atlassian. 
In March 2022, Canva had over 75 million monthly active users. In 2023, co-founders Melanie Perkins and Cliff Obrecht were named in the Australian Financial Review's AFR Rich List as among the ten wealthiest people in Australia. On 7 December 2022, Canva launched Magic Write, the platform's AI-powered copywriting assistant. On 22 March 2023, Canva announced its new Assistant tool, which makes recommendations on graphics and styles that match the user's existing design. On 11 January 2024, Canva launched its own GPT in OpenAI's GPT Store. The company has announced it intends to compete with Google and Microsoft in the office software category with website and whiteboard products. In May 2024, the company announced the launch of Canva Enterprise, a plan designed for large organisations, alongside new tools including Work Kits, Courses and AI capabilities. In 2024, it announced a co-funded solar energy project to enhance its sustainability efforts. On 10 April 2025, Canva released Visual Suite 2. The new interface combines Canva's design and productivity tools. New features include a spreadsheets application (Canva Sheets), a generative AI coding assistant (Canva Code), a chatbot, and an updated photo editor that can modify or remove background objects. In August 2025, Canva launched a stock sale to employees, valuing the company at US$42 billion. == Acquisitions == In 2018, the company acquired presentations startup Zeetings for an undisclosed amount, as part of its expansion into the presentations space. In May 2019, the company announced the acquisitions of Pixabay and Pexels, two free stock photography sites based in Germany, which enabled Canva users to access their photos for designs. In February 2021, Canva acquired Austrian startup Kaleido.ai and the Czech-based Smartmockups. In 2022, Canva acquired Flourish, a London-based data visualization startup. 
In March 2024, Canva acquired UK-based Serif, the developers of the Affinity suite of graphic design software, for approximately $380 million. In August 2024, Canva acquired the AI image generation platform and startup, Leonardo AI, for an undisclosed amount. In June 2025, it was announced that Canva had acquired Australian AI marketing startup MagicBrief for an undisclosed amount. In February 2026, Canva acquired two startups: Cavalry, which specializes in animation software, and MangoAI, which focuses on improving advertising performance. In April 2026, Canva acquired Simtheory, an AI Workflow Tool, and Ortto, a marketing automation tool. == Philanthropy == Canva's co-founders, Melanie Perkins and Cliff Obrecht, have publicly stated their intention to donate a significant portion of their personal wealth to charity. In 2021, Canva started a partnership with GiveDirectly, a nonprofit organization operating in low income areas that makes unconditional cash transfers to families living in extreme poverty. Since then, the company has donated $50 million to support GiveDirectly's work across Malawi. In 2025, Canva announced an additional $100 million commitment to expand its GiveDirectly partnership. == Controversies == === Data breach === In May 2019, Canva experienced a data breach in which the data of roughly 139 million users was exposed. The exposed data included real names of users, usernames, email addresses, geographical information, and password hashes for some users. In January 2020, approximately 4 million user passwords were decrypted and shared online. Canva responded by resetting the passwords of every user who had not changed their password since the initial breach. === Russian operations === In May 2022 Canva was criticized for continuing to provide free access to its services in Russia, even after suspending payment processing in the country. 
Activists from the Ukrainian diaspora in Australia and others said this could be viewed as indirectly supporting Russia’s war effort. They noted the company was the only one of several major Australian firms to receive the lowest “digging in” rating on a tracker run by the Yale School of Management for failing to pull out of Russia. Canva responded that it had suspended financial transactions in Russia from March 2022 and maintained the free version to allow the continued creation and sharing of “pro-peace and anti-war” content for its 1.4 million Russian users.
Paperless society

A paperless society is a society in which paper communication (written documents, mail, letters, etc.) is replaced by electronic communication and storage. The concept was first introduced by Frederick Wilfrid Lancaster in 1978. Lancaster further argued that libraries would no longer be needed to handle printed documents: "Librarians will, in time, become information specialists in a deinstitutionalized setting". Lancaster also stated that computers and libraries will not always give us the information that other people and living life will.
== Literature ==
Brodman, E. (1979). Review of Toward Paperless Information Systems. Bulletin of the Medical Library Association, 67(4), 437–439.
Buckland, M. K. (1980). Review of Toward Paperless Information Systems. Journal of Academic Librarianship, 5(6), 349.
Grosch, A. (1979). Review of Toward Paperless Information Systems. College & Research Libraries, 40(1), 88–89.
Kohl, D. F. (2004). From the editor . . . The paperless society . . . Not quite yet. Journal of Academic Librarianship, 30(3), 177–178.
Lancaster, F. W. (1978a). Toward paperless information systems. New York: Academic Press.
Lancaster, F. W. (1980b). The future of the librarian lies outside of the library. Catholic Library World, 51, 388–391.
Lancaster, F. W. (1982a). Libraries and librarians in an age of electronics. Arlington, VA: Information Resources Press.
Lancaster, F. W. (1982b). The evolving paperless society and its implications for libraries. International Forum on Information and Documentation, 7(4), 3–10.
Lancaster, F. W. (1983). Future librarianship: Preparing for an unconventional career. Wilson Library Bulletin, 57, 747–753.
Lancaster, F. W. (1985). The paperless society revisited. American Libraries, 16, 553–555.
Lancaster, F. W. (1993). Libraries and the future: Essays on the library in the twenty-first century. New York: Haworth Press.
Lancaster, F. W. (1999). Second thoughts on the paperless society. Library Journal, 124(15), 48–50.
Lancaster, F. W., & Smith, L. C. (1980c). On-line systems in the communication process: Projections. Journal of the American Society for Information Science, 31(3), 193–200.
Miall, D. S. (2001). The library versus the Internet: Literary studies under siege? Proceedings of the Modern Language Association, 116(5), 1405–1414.
Salton, G. (1979). Review of Toward Paperless Information Systems. Journal of Documentation, 35(3), 250–252.
Sellen, A. J., & Harper, R. H. R. (2003). The myth of the paperless office. Cambridge, MA: MIT Press.
Stevens, N. D. (2006). The fully electronic academic library. College & Research Libraries, 67(1), 5–14.
Young, A. P. (2008). Aftermath of a prediction: F. W. Lancaster and the paperless society. Library Trends, 56(4) ("The Evaluation and Transformation of Information Systems: Essays Honoring the Legacy of F. W. Lancaster," edited by Lorraine J. Haricombe and Keith Russell), 843–858.
Texas House Bill 20

An Act Relating to censorship of or certain other interference with digital expression, including expression on social media platforms or through electronic mail messages, also known as Texas House Bill 20 (HB20), is a Texas anti-deplatforming law enacted on September 9, 2021. It prohibits large social media platforms from removing, moderating, or labeling posts made by users in the state of Texas based on their "viewpoints", unless the posts are considered illegal under federal law or otherwise fall into exempted categories. It also requires them to make various public disclosures relating to their business practices (including the impact of algorithmic and moderation decisions on the content that is delivered to users). The bill is part of a wider array of Republican-backed legislation seeking to prohibit the censorship of political speech, based on allegations that the moderation policies of large social media platforms are not politically neutral. It has been challenged in NetChoice, LLC v. Paxton, and is the subject of a circuit split between the Fifth Circuit and the Eleventh Circuit, which struck down a similar bill in the state of Florida. In September 2023, the U.S. Supreme Court agreed to hear NetChoice v. Paxton jointly with NetChoice v. Moody on questions of whether the Florida and Texas state laws comply with the First Amendment. == Content == The law applies to "social media platforms" that serve users in the state of Texas and have more than 50 million monthly active users in the United States. They are defined as any public internet website or application that allows users to "communicate with other users for the primary purpose of posting information, comments, messages, or images", excluding internet service providers, electronic mail, and services where communication features are "incidental to, directly related to, or dependent on" content that is pre-selected by the operator. 
In the bill, to "censor" is defined as to "block, ban, remove, deplatform, demonetize, de-boost, restrict, deny equal access or visibility to, or otherwise discriminate against" expression. The law prohibits social media platforms from "censoring on the basis of user viewpoint, user expression, or the ability of a user to receive the expression of others", or on the basis of a user's geographic location in Texas. This includes removing posts or labeling them with warnings and disclaimers. Social media platforms may only censor content if it is unlawful, if they are "specifically authorized" to do so by federal law, if the action is based on a request from "an organization with the purpose of preventing the sexual exploitation of children or protecting survivors of sexual abuse from ongoing harassment", or if the content "directly incites" criminal activity or contains threats of violence against persons based on protected categories. It is disputed whether this provision is actually enforceable, as it may be preempted by Section 230 of the Communications Decency Act (which states that the operators of interactive computer services are not responsible for the actions of their users). Social media platforms must make public disclosures regarding the algorithmic techniques and moderation policies that are used to determine the content provided to users, must publish a compliant acceptable use policy (AUP), and must publish a biannual transparency report containing specific details on all actions taken by the service regarding the moderation of users and content. The law also prohibits email providers from "intentionally imped[ing] the transmission of another person's electronic mail message based on the content." == Legislative history == Texas Governor Greg Abbott signed the bill into law on September 9, 2021. Democrat-proposed amendments excluding Holocaust denial, terrorism content, and vaccine misinformation from the bill were rejected. 
Following a suit by the industry groups Computer & Communications Industry Association (CCIA) and NetChoice, NetChoice, LLC v. Paxton, the bill was blocked by U.S. District Judge Robert Pitman in December 2021 on First Amendment grounds. Texas appealed to the United States Court of Appeals for the Fifth Circuit. Judges Edith Jones, Andrew Oldham, and Leslie H. Southwick lifted the injunction on May 11, 2022, but the decision was appealed to the Supreme Court, which suspended the bill pending a full review in the Fifth Circuit. On September 16, 2022, the Fifth Circuit reversed the injunction, allowing the bill to take effect; Judge Oldham stated that the bill "chills censorship" and "does not chill speech", and accused the plaintiffs of "attempt[ing] to extract a freewheeling censorship right from the Constitution's free speech guarantee. The Platforms are not newspapers. Their censorship is not speech." Southwick dissented, stating that "we are in a new arena, a very extensive one, for speakers and for those who would moderate their speech. None of the precedents fit seamlessly." The CCIA and NetChoice requested a stay of the ruling and asked that the case be taken to the Supreme Court, arguing that the reversal conflicts with an Eleventh Circuit decision in NetChoice v. Moody, which struck down a similar anti-moderation bill imposed by the state of Florida. On October 12, 2022, the Fifth Circuit granted the stay.