AI Generator Text Free

AI Generator Text Free — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Medical imaging

    Medical imaging

    Medical imaging is the technique and process of imaging the interior of a body for clinical analysis and medical intervention, as well as visual representation of the function of some organs or tissues (physiology). Medical imaging seeks to reveal internal structures hidden by the skin and bones, as well as to diagnose and treat disease. Medical imaging also establishes a database of normal anatomy and physiology to make it possible to identify abnormalities. Although imaging of removed organs and tissues can be performed for medical reasons, such procedures are usually considered part of pathology instead of medical imaging. Measurement and recording techniques that are not primarily designed to produce images, such as electroencephalography (EEG), magnetoencephalography (MEG), electrocardiography (ECG), and others, represent other technologies that produce data susceptible to representation as a parameter graph versus time or maps that contain data about the measurement locations. In a limited comparison, these technologies can be considered forms of medical imaging in another discipline of medical instrumentation. As of 2010, 5 billion medical imaging studies had been conducted worldwide. Radiation exposure from medical imaging in 2006 made up about 50% of total ionizing radiation exposure in the United States. Medical imaging equipment is manufactured using technology from the semiconductor industry, including CMOS integrated circuit chips, power semiconductor devices, sensors such as image sensors (particularly CMOS sensors) and biosensors, and processors such as microcontrollers, microprocessors, digital signal processors, media processors and system-on-chip devices. As of 2015, annual shipments of medical imaging chips amount to 46 million units and $1.1 billion. The term "noninvasive" is used to denote a procedure where no instrument is introduced into a patient's body, which is the case for most imaging techniques used. == History == In 1972, engineer Godfrey Hounsfield from the British company EMI invented the X-ray computed tomography device for head diagnosis, which is commonly referred to as computed tomography (CT). The CT nucleus method is based on the projecting X-rays through a section of the human head, which are then processed by computer to reconstruct the cross-sectional image, known as image reconstruction. In 1975, EMI successfully developed a CT device for the entire body, enabling the clear acquisition of tomographic images of various parts of the human body. This revolutionary diagnostic technique earned Hounsfield and physicist Allan Cormack the Nobel Prize in Physiology or Medicine in 1979. Digital image processing technology for medical applications was inducted into the Space Foundation's Space Technology Hall of Fame in 1994. By 2010, over 5 billion medical imaging studies had been conducted worldwide. Radiation exposure from medical imaging in 2006 accounted for about 50% of total ionizing radiation exposure in the United States. Medical imaging equipment is manufactured using technology from the semiconductor industry, including CMOS integrated circuit chips, power semiconductor devices, sensors such as image sensors (particularly CMOS sensors) and biosensors, as well as processors like microcontrollers, microprocessors, digital signal processors, media processors and system-on-chip devices. As of 2015, annual shipments of medical imaging chips reached 46 million units, generating a market value of $1.1 billion. == Types == In the clinical context, "invisible light" medical imaging is generally equated to radiology or "clinical imaging". "Visible light" medical imaging involves digital video or still pictures that can be seen without special equipment. Dermatology and wound care are two modalities that use visible light imagery. Interpretation of medical images is generally undertaken by a physician specialising in radiology known as a radiologist; however, this may be undertaken by any healthcare professional who is trained and certified in radiological clinical evaluation. Increasingly interpretation is being undertaken by non-physicians, for example radiographers frequently train in interpretation as part of expanded practice. Diagnostic radiography designates the technical aspects of medical imaging and in particular the acquisition of medical images. The radiographer (also known as a radiologic technologist) is usually responsible for acquiring medical images of diagnostic quality; although other professionals may train in this area, notably some radiological interventions performed by radiologists are done so without a radiographer. As a field of scientific investigation, medical imaging constitutes a sub-discipline of biomedical engineering, medical physics or medicine depending on the context: Research and development in the area of instrumentation, image acquisition (e.g., radiography), modeling and quantification are usually the preserve of biomedical engineering, medical physics, and computer science; Research into the application and interpretation of medical images is usually the preserve of radiology and the medical sub-discipline relevant to medical condition or area of medical science (neuroscience, cardiology, psychiatry, psychology, etc.) under investigation. Many of the techniques developed for medical imaging also have scientific and industrial applications. === Radiography === Two forms of radiographic images are in use in medical imaging. Projection radiography and fluoroscopy, with the latter being useful for catheter guidance. These 2D techniques are still in wide use despite the advance of 3D tomography due to the low cost, high resolution, and depending on the application, lower radiation dosages with 2D technique. This imaging modality uses a wide beam of X-rays for image acquisition and is the first imaging technique available in modern medicine. Fluoroscopy produces real-time images of internal structures of the body in a similar fashion to radiography, but employs a constant input of X-rays, at a lower dose rate. Contrast media, such as barium, iodine, and air are used to visualize internal organs as they work. Fluoroscopy is also used in image-guided procedures when constant feedback during a procedure is required. An image receptor is required to convert the radiation into an image after it has passed through the area of interest. Early on, this was a fluorescing screen, which gave way to an Image Amplifier (IA) which was a large vacuum tube that had the receiving end coated with cesium iodide, and a mirror at the opposite end. Eventually the mirror was replaced with a TV camera. Projectional radiographs, more commonly known as X-rays, are often used to determine the type and extent of a fracture as well as for detecting pathological changes in the lungs. With the use of radio-opaque contrast media, such as barium, they can also be used to visualize the structure of the stomach and intestines – this can help diagnose ulcers or certain types of colon cancer. === Magnetic resonance imaging === A magnetic resonance imaging instrument (MRI scanner), or "nuclear magnetic resonance (NMR) imaging" scanner as it was originally known, uses powerful magnets to polarize and excite hydrogen nuclei (i.e., single protons) of water molecules in human tissue, producing a detectable signal that is spatially encoded, resulting in images of the body. The MRI machine emits a radio frequency (RF) pulse at the resonant frequency of the hydrogen atoms on water molecules. Radio frequency antennas ("RF coils") send the pulse to the area of the body to be examined. The RF pulse is absorbed by protons, causing their direction with respect to the primary magnetic field to change. When the RF pulse is turned off, the protons "relax" back to alignment with the primary magnet and emit radio waves in the process. This radio-frequency emission from the hydrogen atoms on water is what is detected and reconstructed into an image. The resonant frequency of a spinning magnetic dipole (of which protons are one example) is called the Larmor frequency and is determined by the strength of the main magnetic field and the chemical environment of the nuclei of interest. MRI uses three electromagnetic fields: a very strong (typically 1.5 to 3 teslas) static magnetic field to polarize the hydrogen nuclei, called the primary field; gradient fields that can be modified to vary in space and time (on the order of 1 kHz) for spatial encoding, often simply called gradients; and a spatially homogeneous radio-frequency (RF) field for manipulation of the hydrogen nuclei to produce measurable signals, collected through an RF antenna. Like CT, MRI traditionally creates a two-dimensional image of a thin "slice" of the body and is therefore considered a tomographic imaging technique. Modern MRI instruments are capable of producing images in the form of 3D blocks, which may be considered a generalization of the single-slice

    Read more →
  • Upworthy

    Upworthy

    Upworthy is a media brand that focuses on positive storytelling. It was started in March 2012 by Eli Pariser, the former executive director of MoveOn, and Peter Koechley, the former managing editor of The Onion. One of Facebook's co-founders, Chris Hughes, was an early investor. At its peak between 2012 and 2014, it reached up to 100 million people a month. In 2017, the company was acquired by Good Worldwide. == History == Upworthy was launched in 2012 with a focus on aggregating positive content, which aligned with Facebook's algorithm. Originally, Upworthy curators searched the internet for existing content to feature on the site. Once selected as an option, curators brainstormed different headlines and shareable images for the content, and tested it with a small sample of Upworthy's visitors before sharing it on the site. The site popularized a clickbait style of two-phrase headlines. The company simplifies issues that are controversial by nature, which are presented from a politically liberal point of view and are heavily fact-checked for accuracy. In June 2013, an article in Fast Company called Upworthy "the fastest growing media site of all time". It had 8.7 million unique monthly visitors in the first six months, and in November 2013, had a high of 87 million unique visitors in a single month. In 2013, Facebook changed its algorithm, leading to a significant decline in readers from that platform. Upworthy fired one round of writers in 2015, and another in 2016, after an unionization effort by some of the staff. The union involved, the Writers Guild of America, East, has organized several online "viral" news publishers. In January 2017, Upworthy was acquired by media company GOOD Worldwide. The newsrooms of the two organizations would merge as part of the acquisition. About 20 staffers were laid off as part of the merger. In March 2020, Upworthy saw a 65% increase in Instagram followers and a 47% increased interest in positive content on-site page views as a result of increased interest in positive content during the COVID-19 pandemic. In January 2023, National Geographic Books bought Good People: Stories From the Best of Humanity from Upworthy, with a publication date of September 3, 2024. The book is described as "a heartwarming collection of first-person tales that will provide comfort and inspiration to anyone who could use a little dose of joy right now". It was created by two senior Upworthy team members, Gabriel Reilich and Lucia Knell, and features 101 stories from Upworthy's audience. The co-creators encouraged Upworthy followers to connect with the brand through questions on their posts, opening the door for organic and personal stories to be shared in the comment sections. The book debuted on The New York Times nonfiction bestseller list on September 22, 2024, and remained on the list for two weeks. The book is seen in the top 10 on Publishers Weekly Fall 2024 Adult Preview: Lifestyle and on The Washington Post's "5 feel-good books".

    Read more →
  • Upworthy

    Upworthy

    Upworthy is a media brand that focuses on positive storytelling. It was started in March 2012 by Eli Pariser, the former executive director of MoveOn, and Peter Koechley, the former managing editor of The Onion. One of Facebook's co-founders, Chris Hughes, was an early investor. At its peak between 2012 and 2014, it reached up to 100 million people a month. In 2017, the company was acquired by Good Worldwide. == History == Upworthy was launched in 2012 with a focus on aggregating positive content, which aligned with Facebook's algorithm. Originally, Upworthy curators searched the internet for existing content to feature on the site. Once selected as an option, curators brainstormed different headlines and shareable images for the content, and tested it with a small sample of Upworthy's visitors before sharing it on the site. The site popularized a clickbait style of two-phrase headlines. The company simplifies issues that are controversial by nature, which are presented from a politically liberal point of view and are heavily fact-checked for accuracy. In June 2013, an article in Fast Company called Upworthy "the fastest growing media site of all time". It had 8.7 million unique monthly visitors in the first six months, and in November 2013, had a high of 87 million unique visitors in a single month. In 2013, Facebook changed its algorithm, leading to a significant decline in readers from that platform. Upworthy fired one round of writers in 2015, and another in 2016, after an unionization effort by some of the staff. The union involved, the Writers Guild of America, East, has organized several online "viral" news publishers. In January 2017, Upworthy was acquired by media company GOOD Worldwide. The newsrooms of the two organizations would merge as part of the acquisition. About 20 staffers were laid off as part of the merger. In March 2020, Upworthy saw a 65% increase in Instagram followers and a 47% increased interest in positive content on-site page views as a result of increased interest in positive content during the COVID-19 pandemic. In January 2023, National Geographic Books bought Good People: Stories From the Best of Humanity from Upworthy, with a publication date of September 3, 2024. The book is described as "a heartwarming collection of first-person tales that will provide comfort and inspiration to anyone who could use a little dose of joy right now". It was created by two senior Upworthy team members, Gabriel Reilich and Lucia Knell, and features 101 stories from Upworthy's audience. The co-creators encouraged Upworthy followers to connect with the brand through questions on their posts, opening the door for organic and personal stories to be shared in the comment sections. The book debuted on The New York Times nonfiction bestseller list on September 22, 2024, and remained on the list for two weeks. The book is seen in the top 10 on Publishers Weekly Fall 2024 Adult Preview: Lifestyle and on The Washington Post's "5 feel-good books".

    Read more →
  • Memory-hard function

    Memory-hard function

    In cryptography, a memory-hard function (MHF) is a function that costs a significant amount of memory to efficiently evaluate. It differs from a memory-bound function, which incurs cost by slowing down computation through memory latency. MHFs have found use in key stretching and proof of work as their increased memory requirements significantly reduce the computational efficiency advantage of custom hardware over general-purpose hardware compared to non-MHFs. == Introduction == MHFs are designed to consume large amounts of memory on a computer in order to reduce the effectiveness of parallel computing. In order to evaluate the function using less memory, a significant time penalty is incurred. As each MHF computation requires a large amount of memory, the number of function computations that can occur simultaneously is limited by the amount of available memory. This reduces the efficiency of specialised hardware, such as application-specific integrated circuits and graphics processing units, which utilise parallelisation, in computing a MHF for a large number of inputs, such as when brute-forcing password hashes or mining cryptocurrency. == Motivation and examples == Bitcoin's proof-of-work uses repeated evaluation of the SHA-256 function, but modern general-purpose processors, such as off-the-shelf CPUs, are inefficient when computing a fixed function many times over. Specialized hardware, such as application-specific integrated circuits (ASICs) designed for Bitcoin mining, can use 30,000 times less energy per hash than x86 CPUs whilst having much greater hash rates. This led to concerns about the centralization of mining for Bitcoin and other cryptocurrencies. Because of this inequality between miners using ASICs and miners using CPUs or off-the shelf hardware, designers of later proof-of-work systems utilised hash functions for which it was difficult to construct ASICs that could evaluate the hash function significantly faster than a CPU. As memory cost is platform-independent, MHFs have found use in cryptocurrency mining, such as for Litecoin, which uses scrypt as its hash function. They are also useful in password hashing because they significantly increase the cost of trying many possible passwords against a leaked database of hashed passwords without significantly increasing the computation time for legitimate users. == Measuring memory hardness == There are various ways to measure the memory hardness of a function. One commonly seen measure is cumulative memory complexity (CMC). In a parallel model, CMC is the sum of the memory required to compute a function over every time step of the computation. Other viable measures include integrating memory usage against time and measuring memory bandwidth consumption on a memory bus. Functions requiring high memory bandwidth are sometimes referred to as "bandwidth-hard functions". == Variants == MHFs can be categorized into two different groups based on their evaluation patterns: data-dependent memory-hard functions (dMHF) and data-independent memory-hard functions (iMHF). As opposed to iMHFs, the memory access pattern of a dMHF depends on the function input, such as the password provided to a key derivation function. Examples of dMHFs are scrypt and Argon2d, while examples of iMHFs are Argon2i and catena. Many of these MHFs have been designed to be used as password hashing functions because of their memory hardness. A notable problem with dMHFs is that they are prone to side-channel attacks such as cache timing. This has resulted in a preference for using iMHFs when hashing passwords. However, iMHFs have been mathematically proven to have weaker memory hardness properties than dMHFs.

    Read more →
  • List of security-focused operating systems

    List of security-focused operating systems

    This is a list of operating systems specifically focused on security. Similar concepts include security-evaluated operating systems that have achieved certification from an auditing organization, and trusted operating systems that provide sufficient support for multilevel security and evidence of correctness to meet a particular set of requirements. == Linux == === Android-based === GrapheneOS is a security-focused, Android-based mobile OS that uses a hardened kernel, C library, custom memory allocator (hardened_malloc), and a hardened Chromium-based browser named Vanadium. It also offers privacy/security features, such as Duress PIN/Password or disabling the USB-C port at a driver/hardware level to avoid exploitation. It deploys exploit mitigations such as hardware-based memory tagging, secure app spawning, restricted dynamic code loading, and more. === Debian-based === Linux Kodachi is a security-focused operating system. Tails is aimed at preserving privacy and anonymity. KickSecure is a security-focused Linux distribution that aims to be "hardened by default". It uses network hardening, kernel hardening, Strong Linux User Account Isolation, better randomness, root access restrictions, and app-specific hardening. Whonix is an anonymity focused operating system based on KickSecure. It consists of two virtual machines, And all communications are routed through Tor. === Other Linux distributions === Alpine Linux is designed to be small, simple, and secure. It uses musl, BusyBox, and OpenRC instead of the more commonly used glibc, GNU Core Utilities, and systemd. Owl - Openwall GNU/Linux, a security-enhanced Linux distribution for servers. Secureblue, a Fedora Silverblue based distro that uses a hardened kernel, custom memory allocator (hardened_malloc), Trivalent, a security-focused, Chromium-based browser inspired by Vanadium, and many other exploit mitigations. == BSD == OpenBSD is a Unix-like operating system that emphasizes portability, standardization, correctness, proactive security, and integrated cryptography. == Xen == Qubes OS aims to provide security through isolation. Isolation is provided through the use of virtualization technology. This allows the segmentation of applications into secure virtual machines.

    Read more →
  • Classora

    Classora

    Classora is a knowledge base for the Internet oriented to data analysis. From a practical point of view, Classora is a digital repository that stores structured information and allows it to be displayed in multiple formats: analytically, graphically, geographically (through maps); as well as carry out OLAP analysis. The information contained in Classora comes from public sources and is uploaded into the system through bots and ETL processes. The Knowledge Base has a commercial API for semantic enhancement, and an open web through which any user can access to part of the information collected (it also allows users to complete data and share opinions). Internally, Classora is organized into Knowledge Units and Reports. A «Knowledge Unit» is any element of the World about which information may be stored and presented in the form of a data sheet (a person, a company, a country, etc.) A «Report» is a group of Knowledge Units: a ranking of companies, a sport classification table, a survey about people, etc. In fact, one of the technical capabilities of Classora is that it allows the comparison of reports and knowledge units gathered from different sources, thereby generating an added value for the media in which this information is published: digital media, interactive TV, etc. == Key definitions == === Knowledge unit === The units of knowledge (also known as entries) in Classora are data sheets that have a certain semantic equivalence with the articles on the Wikipedia: they store information about any element of the world, be it a film, a country, a company or an animal. However, they differ from Wikipedia in that Classora stores structured information, enriched with a metadata layer; and therefore it is able to automatically interpret the meaning of each unit of knowledge. === Data report === A report is a group of units of knowledge in which the repetition of elements is not allowed. This definition includes any list, poll, ranking, etc.; and, in general, any consultation that involves more than one unit of knowledge. Classora excels at the reports management due to its visualization capabilities, being able to display data in the form of tables, graphs and maps. Types of reports: Sports scores: Sports competitions results sanctioned by the competent institution. Rankings and lists: All types of interesting and curious lists, whether they have an implicit order or not. Polls: Units of knowledge that are ranked according to users’ votes. Queries to the Knowledge Base: Questions from users using CQL. Networks of connections: automatically calculated from the reports and the taxonomy of each Knowledge Unit. === Organizational taxonomy === An organizational taxonomy (also referred to as entry type) is a data sheet that brings together the common attributes of a set of units of knowledge. For instance, the organizational taxonomy F1 Driver displays attributes such as date of debut, team, etc.; and the organizational taxonomy Football Club presents attributes such as city, stadium, etc. In Classora, taxonomies are hierarchically organized, so that they inherit attributes from their parent taxonomies. For instance, F1 Driver is a subsidiary taxonomy of Sportsperson, which is a subsidiary taxonomy of Person, which in turn is a subsidiary taxonomy of Organism. The simplest type of entry in Classora is Classora Object. All the other taxonomies are its subsidiaries and inherit its attributes. In fact, the only attribute Classora Object possesses is name (all units of knowledge are required to have one name at least). == Architecture of Classora == === Data Extraction Module === The Data Extraction Module consists of a set of robots coordinated by software that also manages the potential incidents. Most of the information available in Classora is automatically uploaded through those robots, which connect to the main online public sources to gather all types of data. There are three categories of robots: Extraction robots: responsible for the massive uploading of reports from official public sources (FIFA, CIA, IMF, Eurostat...). They are used for either absolute or incremental data uploading. Data scanner robots: responsible for looking for and updating the data of a unit of knowledge. They use specific sources to perform this task: Wikipedia, IMDB, World Bank, etc. Content aggregators: they don’t connect to external sources. Instead, they generate new information using Classora’s internal database. === Participatory Module === In Classora’s Open Website, Internet users may participate providing their knowledge as they would on the Wikipedia. There are different ways to participate: adding or correcting data in the Knowledge Base, voting in surveys (participatory rankings) and creating new Knowledge Units and Data Reports. === Connectivity Module === The Knowledge Base is designed to be embedded in multi-platform, multi-channel systems, thus enabling its integration into mobile devices, tablets, interactive TV, etc. This integration may be carried out through specific plugins (for navigators or other devices) or an API REST that provides content in XML or JSON formats. The API is divided into three blocks of operations. The first one is the block of general utility tools (ranging from autosuggest components about geographical hierarchies to operations to obtain the list of today’s celebrity birthdays, using CQL). The second one is the block of operations for widget generation (graphs, maps, rankings) using information from the knowledge base. Finally, there is a block of operations designed for the publication of free-source content. == Project statistics == As of April 2012, 2,000,000 Knowledge Units, 15,000 Reports, around 10,000 Maps and several million potential Comparative Analyses had been added to Classora. According to the site of web metrics Alexa, Classora Open Website is ranked at 100,557 globally and at 2,880 in the Spanish traffic ranking. Users spend an average of 9 ½ minutes in Classora.

    Read more →
  • Protecting Our Kids from Social Media Addiction Act

    Protecting Our Kids from Social Media Addiction Act

    Protecting Our Kids from Social Media Addiction Act also known as California SB 976 is a law that was enacted in September 2024 that is meant to address problematic social media usage among minors. The law prohibitions minors to have "addictive feeds" unless they have verifiable parental consent, minor's notifications are also restricted between 12 am to 6 am and during school hours between 8 am and 3 pm it also well requires minors to have default privacies settings and have social media companies to publicly disclose certain metrics about their users. The law was set to take effect in two steps the first being the restrictions on social media feeds, notifications, disclosures from social media companies and default settings which would have taken effect on January 1, 2025, and the age verification provision which would have taken effect on January 1, 2027. However, has faced legal challenges since its enactment delaying its enactment. == Legal Challenges == In November 2024 NetChoice a trade association representing many of the biggest social media companies such as YouTube, Facebook and Instagram sued the attorney general of California Rob Bonta hoping to get an injunction before the first set of the law's provisions would take effect in January of the next year. However, judge Edward Davila would only grant Netchoice's request as to the restrictions on notifications and public disclosures and would deny their request as to the rest of the law. The law was later fully enjoined temporarily by the District Court and Appellant Court pending appeal, and the case is now in the Ninth Circuit Court of Appeals and is pending a decision. === Social media platforms challenges to law === In November 2025 Meta, Google and TikTok filed lawsuits against the law arguing it violates the first amendment.

    Read more →
  • Instant messaging

    Instant messaging

    Instant messaging (IM) technology is a type of synchronous computer-mediated communication involving the immediate (real-time) transmission of messages between two or more parties over the Internet or another computer network. Originally involving simple text message exchanges, modern instant messaging applications and services (also variously known as instant messenger, messaging app, chat app, chat client, or simply a messenger) tend to also feature the exchange of multimedia, emojis, file transfer, VoIP (voice calling), and video chat capabilities. Instant messaging systems facilitate connections between specified known users (often using a contact list also known as a "buddy list" or "friend list") or in chat rooms, and can be standalone apps or integrated into a wider social media platform, or in a website where it can, for instance, be used for conversational commerce. Originally the term "instant messaging" was distinguished from "text messaging" by being run on a computer network instead of a cellular/mobile network, being able to write longer messages, real-time communication, presence ("status"), and being free (only cost of access instead of per SMS message sent). Instant messaging was pioneered in the early Internet era; the IRC protocol was the earliest to achieve wide adoption. Later in the 1990s, ICQ was among the first closed and commercialized instant messengers, and several rival services appeared afterwards as it became a popular use of the Internet. Beginning with its first introduction in 2005, BlackBerry Messenger became the first popular example of mobile-based IM, combining features of traditional IM and mobile SMS. Instant messaging remains very popular today; IM apps are the most widely used smartphone apps: in 2018 for instance there were 980 million monthly active users of WeChat and 1.3 billion monthly users of WhatsApp, the largest IM network. == Overview == Instant messaging (IM), sometimes also called "messaging" or "texting", consists of computer-based human communication between two users (private messaging) or more (chat room or "group") in real-time, allowing immediate receipt of acknowledgment or reply. This is in direct contrast to email, where conversations are not in real-time, and the perceived quasi-synchrony of the communications by the users (although many systems allow users to send offline messages that the other user receives when logging in). Earlier IM networks were limited to text-based communication, not dissimilar to mobile text messaging. As technology has moved forward, IM has expanded to include voice calling using a microphone, videotelephony using webcams, file transfer, location sharing, image and video transfer, voice notes, and other features. IM is conducted over the Internet or other types of networks (see also LAN messenger). Depending on the IM protocol, the technical architecture can be peer-to-peer (direct point-to-point transmission) or client–server (when all clients have to first connect to the central server). Primary IM services are controlled by their corresponding companies and usually follow the client-server model. At one point, the term "Instant Messenger" was a service mark of AOL Time Warner and could not be used in software not affiliated with AOL in the United States. For this reason, in April 2007, the instant messaging client formerly named Gaim (or gaim) announced that they would be renamed "Pidgin". === Clients === Modern IM services generally provide their own client, either a separately installed application or a browser-based client. They are normally centralised networks run by the servers of the platform's operators, unlike peer-to-peer protocols like XMPP. These usually only work within the same IM network, although some allow limited function with other services (see #Interoperability). Third-party client software applications exist that will connect with most of the major IM services. There is the class of instant messengers that uses the serverless model, which doesn't require servers, and the IM network consists only of clients. There are several serverless messengers: RetroShare, Tox, Bitmessage, Ricochet. See also: LAN messenger. Some examples of popular IM services today include Signal, Telegram, WhatsApp Messenger, WeChat, QQ Messenger, Viber, Line, and Snapchat. The popularity of certain apps greatly differ between different countries. Certain apps have an emphasis on certain uses - for example, Skype focuses on video calling, Slack focuses on messaging and file sharing for work teams, and Snapchat focuses on image messages. Some social networking services offer messaging services as a component of their overall platform, such as Facebook's Facebook Messenger, who also own WhatsApp. Others have a direct IM function as an additional adjunct component of their social networking platforms, like Instagram, Reddit, Tumblr, TikTok, Clubhouse and Twitter; this also includes for example dating websites, such as OkCupid or Plenty of Fish, and online gaming chat platforms. === Features === ==== Private and group messaging ==== Private chat allows users to converse privately with another person or a group. Privacy can also be enhanced in several ways, such as end-to-end encryption by default. Public and group chat features allow users to communicate with multiple people simultaneously. ==== Calling ==== Many major IM services and applications offer a call feature for user-to-user voice calls, conference calls, and voice messages. The call functionality is useful for professionals who utilize the application for work purposes and as a hands-free method. Videotelephony using a webcam is also possible by some. ==== Games and entertainment ==== Some IM applications include in-app games for entertainment. Yahoo! Messenger, for example, introduced these where users could play a game and viewed by friends in real-time. MSN Messenger featured a number of playable games within the interface. Facebook's Messenger has had a built-in option to play games with people in a chat, including games like Tetris and Blackjack. Discord features multiple games built inside the "activities" tab in voice channels. ==== Payments ==== A relatively new feature to instant messaging, peer-to-peer payments are available for financial tasks on top of communication. The lack of a service fee also makes these advantageous to financial applications. IM services such as Facebook Messenger and the WeChat 'super-app' for example offer a payment feature. == History == === Early systems === Though the term dates from the 1990s, instant messaging predates the Internet, first appearing on multi-user operating systems like Compatible Time-Sharing System (CTSS) and Multiplexed Information and Computing Service (Multics) in the mid-1960s. Initially, some of these systems were used as notification systems for services like printing, but quickly were used to facilitate communication with other users logged into the same machine. CTSS facilitated communication via text message for up to 30 people. Parallel to instant messaging were early online chat facilities, the earliest of which was Talkomatic (1973) on the PLATO system, which allowed 5 people to chat simultaneously on a 512 x 512 plasma display (5 lines of text + 1 status line per person). During the bulletin board system (BBS) phenomenon that peaked during the 1980s, some systems incorporated chat features which were similar to instant messaging; Freelancin' Roundtable was one prime example. The first such general-availability commercial online chat service (as opposed to PLATO, which was educational) was the CompuServe CB Simulator in 1980, created by CompuServe executive Alexander "Sandy" Trevor in Columbus, Ohio. As networks developed, the protocols spread with the networks. Some of these used a peer-to-peer protocol (e.g. talk, ntalk and ytalk), while others required peers to connect to a server (see talker and IRC). The Zephyr Notification Service (still in use at some institutions) was invented at MIT's Project Athena in the 1980s to allow service providers to locate and send messages to users. Early instant messaging programs were primarily real-time text, where characters appeared as they were typed. This includes the Unix "talk" command line program, which was popular in the 1980s and early 1990s. Some BBS chat programs (i.e. Celerity BBS) also used a similar interface. Modern implementations of real-time text also exist in instant messengers, such as AOL's Real-Time IM as an optional feature. In the latter half of the 1980s and into the early 1990s, the Quantum Link online service for Commodore 64 computers offered user-to-user messages between concurrently connected customers, which they called "On-Line Messages" (or OLM for short), and later "FlashMail." Quantum Link later became America Online and made AOL Instant Messenger (AIM, discussed later). While the Quantum Link client software ran on a Commodore 64, using only

    Read more →
  • Language model benchmark

    Language model benchmark

    A language model benchmark is a standardized test designed to evaluate the performance of language models on various natural language processing tasks. These tests are intended for comparing different models' capabilities in areas such as language understanding, generation, and reasoning. Benchmarks generally consist of a dataset and corresponding evaluation metrics. The dataset provides text samples and annotations, while the metrics measure a model's performance on tasks like answering questions, text classification, and machine translation. These benchmarks are developed and maintained by academic institutions, research organizations, and industry players to track progress in the field. In addition to accuracy, the metrics can include throughput, energy efficiency, bias, trust, and sustainability. == Overview == === Types === Benchmarks may be described by the following adjectives, not mutually exclusive: Classical: These tasks are studied in natural language processing, even before the advent of deep learning. Examples include the Penn Treebank for testing syntactic and semantic parsing, as well as bilingual translation benchmarked by BLEU scores. Question answering: These tasks have a text question and a text answer, often multiple-choice. They can be open-book or closed-book. Open-book QA resembles reading comprehension questions, with relevant passages included as annotation in the question, in which the answer appears. Closed-book QA includes no relevant passages. Closed-book QA is also called open-domain question-answering. Before the era of large language models, open-book QA was more common, and understood as testing information retrieval methods. Closed-book QA became common since GPT-2 as a method to measure knowledge stored within model parameters. Omnibus: An omnibus benchmark combines many benchmarks, often previously published. It is intended as an all-in-one benchmarking solution. Reasoning: These tasks are usually in the question-answering format, but are intended to be more difficult than standard question answering. Multimodal: These tasks require processing not only text, but also other modalities, such as images and sound. Examples include OCR and transcription. Agency: These tasks are for a language-model–based software agent that operates a computer for a user, such as editing images, browsing the web, etc. Adversarial: A benchmark is "adversarial" if the items in the benchmark are picked specifically so that certain models do badly on them. Adversarial benchmarks are often constructed after state of the art (SOTA) models have saturated (achieved 100% performance) a benchmark, to renew the benchmark. A benchmark is "adversarial" only at a certain moment in time, since what is adversarial may cease to be adversarial as newer SOTA models appear. Public/Private: A benchmark might be partly or entirely private, meaning that some or all of the questions are not publicly available. The idea is that if a question is publicly available, then it might be used for training, which would be "training on the test set" and invalidate the result of the benchmark. Usually, only the guardians of the benchmark have access to the private subsets, and to score a model on such a benchmark, one must send the model weights, or provide API access, to the guardians. The boundary between a benchmark and a dataset is not sharp. Generally, a dataset contains three "splits": training, test, and validation. Both the test and validation splits are essentially benchmarks. In general, a benchmark is distinguished from a test/validation dataset in that a benchmark is typically intended to be used to measure the performance of many different models that are not trained specifically for doing well on the benchmark, while a test/validation set is intended to be used to measure the performance of models trained specifically on the corresponding training set. In other words, a benchmark may be thought of as a test/validation set without a corresponding training set. Conversely, certain benchmarks may be used as a training set, such as the English Gigaword or the One Billion Word Benchmark, which in modern language is just the negative log-likelihood loss on a pretraining set with 1 billion words. Indeed, the distinction between benchmark and dataset in language models became sharper after the rise of the pretraining paradigm, whereby a model is first trained on massive, unlabeled datasets to learn general language patterns, syntax, and knowledge (pretraining), and the base model is then adapted to specific, downstream tasks using smaller, labeled datasets (fine-tuning). === Lifecycle === Generally, the life cycle of a benchmark consists of the following steps: Inception: A benchmark is published. It can be simply given as a demonstration of the power of a new model (implicitly) that others then picked up as a benchmark, or as a benchmark that others are encouraged to use (explicitly). Growth: More papers and models use the benchmark, and the performance on the benchmark grows. Maturity, degeneration or deprecation: A benchmark may be saturated, after which researchers move on to other benchmarks. Progress on the benchmark may also be neglected as the field moves to focus on other benchmarks. Renewal: A saturated benchmark can be upgraded to make it no longer saturated, allowing further progress. === Construction === Like datasets, benchmarks are typically constructed by several methods, individually or in combination: Web scraping: Ready-made question-answer pairs may be scraped online, such as from websites that teach mathematics and programming. Conversion: Items may be constructed programmatically from scraped web content, such as by blanking out named entities from sentences, and asking the model to fill in the blank. This was used for making the CNN/Daily Mail Reading Comprehension Task. Crowd sourcing: Items may be constructed by paying people to write them, such as on Amazon Mechanical Turk. This was used for making the MCTest. === Evaluation === Generally, benchmarks are fully automated. This limits the questions that can be asked. For example, with mathematical questions, "proving a claim" would be difficult to automatically check, while "calculate an answer with a unique integer answer" would be automatically checkable. With programming tasks, the answer can generally be checked by running unit tests, with an upper limit on runtime. The benchmark scores are of the following kinds: For multiple choice or cloze questions, common scores are accuracy (frequency of correct answer), precision, recall, F1 score, etc. pass@n: The model is given n {\displaystyle n} attempts to solve each problem. If any attempt is correct, the model earns a point. The pass@n score is the model's average score over all problems. k@n: The model makes n {\displaystyle n} attempts to solve each problem, but only k {\displaystyle k} attempts out of them are selected for submission. If any submission is correct, the model earns a point. The k@n score is the model's average score over all problems. cons@n: The model is given n {\displaystyle n} attempts to solve each problem. If the most common answer is correct, the model earns a point. The cons@n score is the model's average score over all problems. Here "cons" stands for "consensus" or "majority voting". The pass@n score can be estimated more accurately by making N > n {\displaystyle N>n} attempts, and use the unbiased estimator 1 − ( N − c n ) ( N n ) {\displaystyle 1-{\frac {\binom {N-c}{n}}{\binom {N}{n}}}} , where c {\displaystyle c} is the number of correct attempts. For less well-formed tasks, where the output can be any sentence, there are the following commonly used scores including BLEU ROUGE, METEOR, NIST, word error rate, LEPOR, CIDEr, and SPICE. === Issues === error: Some benchmark answers may be wrong. ambiguity: Some benchmark questions may be ambiguously worded. subjective: Some benchmark questions may not have an objective answer at all. This problem generally prevents creative writing benchmarks. Similarly, this prevents benchmarking writing proofs in natural language, though benchmarking proofs in a formal language is possible. open-ended: Some benchmark questions may not have a single answer of a fixed size. This problem generally prevents programming benchmarks from using more natural tasks such as "write a program for X", and instead uses tasks such as "write a function that implements specification X". inter-annotator agreement: Some benchmark questions may be not fully objective, such that even people would not agree with 100% on what the answer should be. This is common in natural language processing tasks, such as syntactic annotation. shortcut: Some benchmark questions may be easily solved by an "unintended" shortcut. For example, in the SNLI benchmark, having a negative word like "not" in the second sentence is a strong signal for the "Contradiction" category, regardless of what the se

    Read more →
  • Convergent encryption

    Convergent encryption

    Convergent encryption, also known as content hash keying, is a cryptosystem that produces identical ciphertext from identical plaintext files. This has applications in cloud computing to remove duplicate files from storage without the provider having access to the encryption keys. The combination of deduplication and convergent encryption was described in a backup system patent filed by Stac Electronics in 1995. This combination has been used by Farsite, Permabit, Freenet, MojoNation, GNUnet, flud, and the Tahoe Least-Authority File Store. The system gained additional visibility in 2011 when cloud storage provider Bitcasa announced they were using convergent encryption to enable de-duplication of data in their cloud storage service. == Overview == The system computes a cryptographic hash of the plaintext in question. The system then encrypts the plaintext by using the hash as a key. Finally, the hash itself is stored, encrypted with a key chosen by the user. == Known Attacks == Convergent encryption is open to a "confirmation of a file attack" in which an attacker can effectively confirm whether a target possesses a certain file by encrypting an unencrypted, or plain-text, version and then simply comparing the output with files possessed by the target. This attack poses a problem for a user storing information that is non-unique, i.e. also either publicly available or already held by the adversary - for example: banned books or files that cause copyright infringement. An argument could be made that a confirmation of a file attack is rendered less effective by adding a unique piece of data such as a few random characters to the plain text before encryption; this causes the uploaded file to be unique and therefore results in a unique encrypted file. However, some implementations of convergent encryption where the plain-text is broken down into blocks based on file content, and each block then independently convergently encrypted may inadvertently defeat attempts at making the file unique by adding bytes at the beginning or end. Even more alarming than the confirmation attack is the "learn the remaining information attack" described by Drew Perttula in 2008. This type of attack applies to the encryption of files that are only slight variations of a public document. For example, if the defender encrypts a bank form including a ten digit bank account number, an attacker that is aware of generic bank form format may extract defender's bank account number by producing bank forms for all possible bank account numbers, encrypt them and then by comparing those encryptions with defender's encrypted file deduce the bank account number. Note that this attack can be extended to attack a large number of targets at once (all spelling variations of a target bank customer in the example above, or even all potential bank customers), and the presence of this problem extends to any type of form document: tax returns, financial documents, healthcare forms, employment forms, etc. Also note that there is no known method for decreasing the severity of this attack -- adding a few random bytes to files as they are stored does not help, since those bytes can likewise be attacked with the "learn the remaining information" approach. The only effective approach to mitigating this attack is to encrypt the contents of files with a non-convergent secret before storing (negating any benefit from convergent encryption), or to simply not use convergent encryption in the first place.

    Read more →
  • Social media use in African politics

    Social media use in African politics

    Since the Egyptian Revolution in 2011 and the Tunisian Revolution, social media, especially Facebook, Twitter, and YouTube, began to gain traction as a political tool in Africa. Various political actors have used social media to pursue a wide range of political objectives. State actors can use social media to encourage political discourse, campaign, or implement censorship and surveillance. Non-state actors, such as civil society organizations and opposition movements, can use social media to address political concerns and to organize widespread uprisings, such as the 2014 Burkinabé uprising. Meanwhile, extremist organizations can use social media to further their propaganda and recruitment. However, social media has been criticized for its limited accessibility and for facilitating the spread of misinformation, causing some skepticism about its effectiveness. Due to low entry barriers and user-generated content, social media provides a platform where people from different social classes can engage and interact with one another. Under traditional media, the public had limited opportunities to voice their political opinions. Social media enables people to both create and consume content. The public has become increasingly comfortable and confident in expressing political opinions online, often away from government scrutiny. Scholars argue that social media use has democratizing effects in African countries. == State actors == === Promoting political discourse === Through social media, the government and its citizens can discuss policy ideas, policy implementation, and political actions. Regardless of geographical location and distance, people are able to voice their opinions to the government. Social media includes citizens who were previously not able to express their discontent or share their ideas to the government. As state actors keep the public informed, social media can increase civic engagement. With more civic engagement, policies can be discussed without politicization. Before the commonplace use of social media, African countries faced weak feedback mechanisms that effectively excluded the average African citizen from policy discourse. In South Africa, the government uses social media to connect with constituencies. The South African president runs an official Twitter, Facebook, YouTube, and Flickr accounts to engage with the public. === Campaigning === Political parties also use social media for political campaigns during election periods. In South Africa, the ANC (African National Congress) and DA (Democratic Alliance) use social media for political purposes. These parties specifically use Facebook as a tool for campaigning and engaging with the public to improve their relationship with citizens. Nigerian President Goodluck Jonathan employed social media to campaign for the presidential election in 2011, which he won. When President Goodluck Jonathan announced his bid for the presidency on social media in 2010, it reached about 217,000 people. As his campaign progressed, President Goodluck Jonathan was able to increase his followers to half a million by early 2011. === Censorship & Surveillance === While state actors can use social media to encourage their party or discourse, social media can be used to censor and surveil citizens. For example, the ANC and DA use Facebook to monitor South Africans. The government is able to track down people who have spoken against the government and translate this information into physical action to stop any possibility of a revolution. Social media platforms can be shut down to manipulate the flow of information. In Chad, citizens cannot access information through online platforms. This censorship blocked "Facebook, Twitter, WhatsApp and Viber". In the Democratic Republic of Congo, the government shut down the internet before contested elections. In Zimbabwe, the government shut down the internet to hide civilian protests against fuel price increases. == Non-state actors == === Civil society organizations (CSOs) === Civil society organizations have also used social media networks in an effort to recruit supporters and communicate with the public. CSOs can use social media to mobilize people to support their cause, such as the Ghanaian Committee for Joint Action (CJA). In 2005 and 2006, the CJA gathered support to protest against the 50% fuel price increase. CSOs can play the role of a counterforce against state actors and state propaganda during times of crises, such as protests and military clashes. In some cases, CSOs release their own videos and photos on social media which challenges traditional forms of media. CSOs have also served to monitor elections to reduce corruption and violence during election day. For instance, the Zambian Bantu Watch started the #bantuwatch social media campaign to monitor the 2011 presidential election. Zambians used Facebook and Twitter to report polling station results to mitigate election fraud and election violence. In South Africa, CSOs created 'amandla.mobi' to campaign for public policies by creating petitions. Through 'amandla.mobi', CSOs are able to circulate petitions on social media to collect signatures. South African CSOs reported how social media helped their organizations to gain support and share ideas. However, CSOs struggle to attract media attention and often have to pay for media coverage. === Opposition forces against the government === Social media is also used by the public or opposition forces against the government. Through horizontal social media, organizing can lead to street protests and revolutions, some of which are successful. For instance, during the Egyptian revolution of 2011, "The Day of the Revolution Against Torture, Poverty, Corruption, and Unemployment" and "We Are All Khaled Said" gathered support against President Hosni Mubarak. In particular, "We Are All Khaled Said" had Egyptian citizens gather around the death of Khaled Said who was brutally tortured and killed by the Egyptian government because Said wanted to uncover government corruption. As unrest erupted into public demonstrations, President Hosni Mubarak was forced to resign. Witnessing the success of social media during the Egyptian revolution, the Tunisian Revolution, or the Jasmine Revolution, mobilized through Facebook and Twitter. Likewise, in South Africa, Malawi, and Mozambique, these countries have used social media as "new protest drums." Due to social media's low entry barrier, opposition forces against the government can facilitate political discourse that can lead to accountability. Whistleblowers and opposition forces are able to expose corruption through social media, where they face less repression while reaching a larger audience. For example, the youth of Zimbabwe and South Africa use Facebook to discuss politics without judgment. Specifically, in Zimbabwe, political youth used Facebook to avoid state surveillance. Social media is used as a supplemental tool for activism. In 2015, South African student activists started the hashtag #RhodesMustFall to push the issue of colonialism and racism at the forefront of the public. === Extremist organizations === Social media is easily accessible and created by user-based content. Therefore, marginalized groups are able to use social media to spread extremist ideas. For instance, Boko Haram created the Media Office of West Africa Province and perpetuated propaganda through Twitter and YouTube. Boko Haram's online propaganda campaign targets and persuades young dissuaded Nigerians to join their cause. It is important to note that social media has also been used against Boko Haram. In April 2014, Boko Haram kidnapped 276 schoolgirls and an international campaign fought for their return through #BringBackOurGirls. Another extremist group, Al-Shabaab, has created an online presence through Twitter and YouTube. Through these social media networks, Al-Shabaab recruits new members to their extremist group through their propaganda which emphasizes the group's successes. Albeit their efforts, Al-Shabaab has not been very successful in coordinating their members but they are successful in financing their group. Furthermore, the Islamic State of Iraq and the Levant (ISIL) use social media to target and recruit individuals to their cause. ISIL's social media usage is more diverse compared to Boko Haram and Al-Shabaab; ISIL uses "Facebook, Twitter, YouTube, WhatsApp, Telegram, JustPaste.it, Kik and Ask.fm." Since ISIL's Twitter accounts kept getting shut down, ISIL uses Telegram and WhatsApp chat rooms to privately conduct meetings. Due to the spread of extremist ideology, Zhuravskaya et al. acknowledge social media's potential to be misused. == Challenges == Although social media can be used as a political tool, it faces challenges in Africa. Due to low literacy rates in Africa, social media networks exclude many of the population members. In addition, lack of access to electricity and the internet can fur

    Read more →
  • Social profiling

    Social profiling

    Social profiling is the process of constructing a social media user's profile using their social data. In general, profiling refers to the data science process of generating a person's profile with computerized algorithms and technology. There are various platforms for sharing this information with the proliferation of growing popular social networks, including but not limited to LinkedIn, Google+, Facebook and Twitter. == Social profile and social data == A person's social data refers to the personal data that they generate either online or offline (for more information, see social data revolution). A large amount of these data, including one's language, location and interest, is shared through social media and social network. Users join multiple social media platforms and their profiles across these platforms can be linked using different methods to obtain their interests, locations, content, and friend list. Altogether, this information can be used to construct a person's social profile. Meeting the user's satisfaction level for information collection is becoming more challenging. This is because of too much "noise" generated, which affects the process of information collection due to explosively increasing online data. Social profiling is an emerging approach to overcome the challenges faced in meeting user's demands by introducing the concept of personalized search while keeping in consideration user profiles generated using social network data. A study reviews and classifies research inferring users social profile attributes from social media data as individual and group profiling. The existing techniques along with utilized data sources, the limitations, and challenges were highlighted. The prominent approaches adopted include machine learning, ontology, and fuzzy logic. Social media data from Twitter and Facebook have been used by most of the studies to infer the social attributes of users. The literature showed that user social attributes, including age, gender, home location, wellness, emotion, opinion, relation, influence are still need to be explored. === Personalized meta-search engines === The ever-increasing online content has resulted in the lack of proficiency of centralized search engine's results. It can no longer satisfy user's demand for information. A possible solution that would increase coverage of search results would be meta-search engines, an approach that collects information from numerous centralized search engines. A new problem thus emerges, that is too much data and too much noise is generated in the collection process. Therefore, a new technique called personalized meta-search engines was developed. It makes use of a user's profile (largely social profile) to filter the search results. A user's profile can be a combination of a number of things, including but not limited to, "a user's manual selected interests, user's search history", and personal social network data. == Social media profiling == According to Samuel D. Warren II and Louis Brandeis (1890), disclosure of private information and the misuse of it can hurt people's feelings and cause considerable damage in people's lives. Social networks provide people access to intimate online interactions; therefore, information access control, information transactions, privacy issues, connections and relationships on social media have become important research fields and are subjects of concern to the public. Ricard Fogues and other co-authors state that "any privacy mechanism has at its base an access control", that dictate "how permissions are given, what elements can be private, how access rules are defined, and so on". Current access control for social media accounts tend to still be very simplistic: there is very limited diversity in the category of relationships on for social network accounts. User's relationships to others are, on most platforms, only categorized as "friend" or "non-friend" and people may leak important information to "friends" inside their social circle but not necessarily users to they consciously want to share the information to. The below section is concerned with social media profiling and what profiling information on social media accounts can achieve. === Privacy leaks === A lot of information is voluntarily shared on online social networks, such as photos and updates on life activities (new job, hobbies, etc.). People rest assured that different social network accounts on different platforms will not be linked as long as they do not grant permission to these links. However, according to Diane Gan, information gathered online enables "target subjects to be identified on other social networking sites such as Foursquare, Instagram, LinkedIn, Facebook and Google+, where more personal information was leaked". The majority of social networking platforms use the "opt out approach" for their features. If users wish to protect their privacy, it is user's own responsibility to check and change the privacy settings as a number of them are set to default option. A major social network platforms have developed geo-tag functions and are in popular usage. This is concerning because 39% of users have experienced profiling hacking; 78% burglars have used major social media networks and Google Street-view to select their victims; and an astonishing 54% of burglars attempted to break into empty houses when people posted their status updates and geo-locations. === Facebook === Formation and maintenance of social media accounts and their relationships with other accounts are associated with various social outcomes. In 2015, for many firms, customer relationship management is essential and is partially done through Facebook. Before the emergence and prevalence of social media, customer identification was primarily based upon information that a firm could directly acquire: for example, it may be through a customer's purchasing process or voluntary act of completing a survey/loyalty program. However, the rise of social media has greatly reduced the approach of building a customer's profile/model based on available data. Marketers now increasingly seek customer information through Facebook; this may include a variety of information users disclose to all users or partial users on Facebook: name, gender, date of birth, e-mail address, sexual orientation, marital status, interests, hobbies, favorite sports team(s), favorite athlete(s), or favorite music, and more importantly, Facebook connections. However, due to the privacy policy design, acquiring true information on Facebook is no trivial task. Often, Facebook users either refuse to disclose true information (sometimes using pseudonyms) or setting information to be only visible to friends, Facebook users who "LIKE" your page are also hard to identify. To do online profiling of users and cluster users, marketers and companies can and will access the following kinds of data: gender, the IP address and city of each user through the Facebook Insight page, who "LIKED" a certain user, a page list of all the pages that a person "LIKED" (transaction data), other people that a user follow (even if it exceeds the first 500, which we usually can not see) and all the publicly shared data. === Twitter === First launched on the Internet in March 2006, Twitter is a platform on which users can connect and communicate with any other user in just 280 characters. Like Facebook, Twitter is also a crucial tunnel for users to leak important information, often unconsciously, but able to be accessed and collected by others. According to Rachel Nuwer, in a sample of 10.8 million tweets by more than 5,000 users, their posted and publicly shared information are enough to reveal a user's income range. A postdoctoral researcher from the University of Pennsylvania, Daniel Preoţiuc-Pietro and his colleagues were able to categorize 90% of users into corresponding income groups. Their existing collected data, after being fed into a machine-learning model, generated reliable predictions on the characteristics of each income group. The mobile app called Streamd.in displays live tweets on Google Maps by using geo-location details attached to the tweet, and traces the user's movement in the real world. === Profiling photos on social network === The advent and universality of social media networks have boosted the role of images and visual information dissemination. Many types of visual information on social media transmit messages from the author, location information and other personal information. For example, a user may post a photo of themselves in which landmarks are visible, which can enable other users to determine where they are. In a study done by Cristina Segalin, Dong Seon Cheng and Marco Cristani, they found that profiling user posts' photos can reveal personal traits such as personality and mood. In the study, convolutional neural networks (CNNs) is introduced. It builds on the main characteristics of computational

    Read more →
  • Image analysis

    Image analysis

    Image analysis or imagery analysis is the extraction of meaningful information from images; mainly from digital images by means of digital image processing techniques. Image analysis tasks can be as simple as reading bar coded tags or as sophisticated as identifying a person from their face. Computers are indispensable for the analysis of large amounts of data, for tasks that require complex computation, or for the extraction of quantitative information. On the other hand, the human visual cortex is an excellent image analysis apparatus, especially for extracting higher-level information, and for many applications — including medicine, security, and remote sensing — human analysts still cannot be replaced by computers. For this reason, many important image analysis tools such as edge detectors and neural networks are inspired by human visual perception models. == Digital == Digital Image Analysis or Computer Image Analysis is when a computer or electrical device automatically studies an image to obtain useful information from it. Note that the device is often a computer but may also be an electrical circuit, a digital camera or a mobile phone. It involves the fields of computer or machine vision, and medical imaging, and makes heavy use of pattern recognition, digital geometry, and signal processing. This field of computer science developed in the 1950s at academic institutions such as the MIT A.I. Lab, originally as a branch of artificial intelligence and robotics. It is the quantitative or qualitative characterization of two-dimensional (2D) or three-dimensional (3D) digital images. 2D images are, for example, to be analyzed in computer vision, and 3D images in medical imaging. The field was established in the 1950s—1970s, for example with pioneering contributions by Azriel Rosenfeld, Herbert Freeman, Jack E. Bresenham, or King-Sun Fu. == Techniques == There are many different techniques used in automatically analysing images. Each technique may be useful for a small range of tasks, however there still aren't any known methods of image analysis that are generic enough for wide ranges of tasks, compared to the abilities of a human's image analysing capabilities. Examples of image analysis techniques in different fields include: 2D and 3D object recognition, image segmentation, motion detection e.g. Single particle tracking, video tracking, optical flow, medical scan analysis, 3D Pose Estimation. == Deep learning == Since the early 2010s, deep learning methods have substantially advanced the field of image analysis. In 2012, a deep convolutional neural network (CNN) known as AlexNet achieved a significant reduction in error rates on the ImageNet large-scale image classification benchmark, demonstrating the effectiveness of deep learning for visual recognition tasks. Subsequent architectures such as ResNet introduced residual connections that enabled training of much deeper networks, further improving accuracy across image analysis tasks. Real-time object detection became practical with frameworks such as YOLO (You Only Look Once), which unified detection and classification into a single network pass. In 2020, the Vision Transformer (ViT) demonstrated that transformer architectures, originally developed for natural language processing, could achieve competitive results on image classification when applied directly to sequences of image patches. More recently, foundation models trained on large-scale datasets have enabled zero-shot generalisation across image analysis tasks. The Segment Anything Model (SAM), trained on over one billion masks, can segment arbitrary objects in images without task-specific fine-tuning. These advances have made image analysis techniques increasingly accessible through browser-based tools and open-source implementations. == Applications == The applications of digital image analysis are continuously expanding through all areas of science and industry, including: anatomy, allows for precise measurements, visualization, and statistical analysis of anatomical structures. assay micro plate reading, such as detecting where a chemical was manufactured. astronomy, such as calculating the size of a planet. automated species identification (e.g. plant and animal species) defense error level analysis filtering machine vision, such as to automatically count items in a factory conveyor belt. materials science, such as determining if a metal weld has cracks. medicine, such as detecting cancer in a mammography scan. metallography, such as determining the mineral content of a rock sample. microscopy, such as counting the germs in a swab. automatic number plate recognition; optical character recognition, such as automatic license plate detection. remote sensing, such as detecting intruders in a house, and producing land cover/land use maps. robotics, such as to avoid steering into an obstacle. security, such as detecting a person's eye color or hair color. == Object-based == Object-based image analysis (OBIA) involves two typical processes, segmentation and classification. Segmentation helps to group pixels into homogeneous objects. The objects typically correspond to individual features of interest, although over-segmentation or under-segmentation is very likely. Classification then can be performed at object levels, using various statistics of the objects as features in the classifier. Statistics can include geometry, context and texture of image objects. Over-segmentation is often preferred over under-segmentation when classifying high-resolution images. Object-based image analysis has been applied in many fields, such as cell biology, medicine, earth sciences, and remote sensing. For example, it can detect changes of cellular shapes in the process of cell differentiation.; it has also been widely used in the mapping community to generate land cover. When applied to earth images, OBIA is known as geographic object-based image analysis (GEOBIA), defined as "a sub-discipline of geoinformation science devoted to (...) partitioning remote sensing (RS) imagery into meaningful image-objects, and assessing their characteristics through spatial, spectral and temporal scale". The international GEOBIA conference has been held biannually since 2006. OBIA techniques are implemented in software such as eCognition or the Orfeo toolbox.

    Read more →
  • Cryptographic bill of materials

    Cryptographic bill of materials

    Cryptographic bill of materials (CBOM—also cryptography bill of materials) is a structured inventory of all cryptographic assets present in a software, firmware, device, or system. It enumerates algorithms (and parameters such as key sizes and modes), cryptographic libraries or modules, digital certificates, keys and related material, and protocols in use, and maps their relationships to the components that implement or invoke them. CBOMs are used to improve security analysis, compliance, and cryptographic agility, and are increasingly referenced in guidance for post‑quantum cryptography (PQC) migration. == Definition and scope == A CBOM inventories cryptographic primitives and materials—such as encryption and signature algorithms (with specific variants and modes), key sizes, cryptographic libraries/modules, digital certificates (e.g., X.509), keys and other related cryptographic material, and security protocols (e.g., TLS, IPsec). It also documents dependencies (for example, an application uses an algorithm provided by a library; a protocol uses several algorithms) and can capture certificate lifecycles, cryptographic module certifications (e.g., FIPS 140‑3), and policy conformance metadata. In common practice, a CBOM may be embedded within an SBOM format (such as CycloneDX) or exported as a separate, linked artifact. === Typical CBOM fields === The exact schema varies by implementation, but common fields are summarized below (see CycloneDX CBOM guide and NIST SP 1800‑38B). == Relation to SBOM == A CBOM is complementary to, but distinct from, a software bill of materials (SBOM). Whereas an SBOM lists software components and their versions, a CBOM focuses specifically on the cryptography present and how it is configured and used. For example, an SBOM might enumerate inclusion of a library such as OpenSSL, while the CBOM would identify which algorithms and parameters that library enables (e.g., RSA‑2048, ECDH P‑256, AES‑GCM) and list relevant keys and certificates. The pairing enables both supply‑chain transparency and cryptographic transparency. == History == The term and practice emerged in the early–mid 2020s alongside software‑supply‑chain transparency and PQC planning. The OWASP CycloneDX standard introduced native CBOM support (v1.6 and later), modeling algorithms, keys, certificates, and protocols as first‑class “cryptographic assets” and providing dependency semantics (uses/implements) between software and cryptography. Open tooling from industry and researchers (e.g., IBM's CBOMkit and related generators/viewers) appeared to automate discovery and representation of cryptographic use in the CycloneDX CBOM schema. == Regulatory and policy context == In the United States, policy has emphasized cryptographic inventories as a prerequisite to PQC migration. The White House's National Security Memorandum 10 (2022) directed a government‑wide transition to quantum‑resistant cryptography; the Office of Management and Budget's M‑23‑02 (November 2022) operationalized this by requiring agencies to submit a prioritized inventory of cryptographic systems (with algorithm and key details) by 4 May 2023 and annually thereafter, and tasked CISA/NSA/NIST to develop automated discovery and inventory strategies. A 2024 Office of the National Cyber Director report reiterated that a “comprehensive cryptographic inventory” is the baseline for PQC planning and must be maintained iteratively with both automated and manual discovery. NIST's NCCoE practice guide (SP 1800‑38B, preliminary draft) provides concrete methods for cryptographic discovery and documentation across enterprises, aligning with CBOM‑style representations. CISA later published a strategy to migrate federal agencies to automated cryptography discovery and inventory tools to support continuous reporting. Separately, NSA, CISA, and NIST issued joint guidance encouraging all organisations to prepare cryptographic inventories and roadmaps for PQC, beyond government environments. == Role in quantum readiness and cryptographic agility == Because large‑scale quantum computing threatens widely used public‑key algorithms (e.g., RSA, ECC), organisations are planning multi‑year transitions to post-quantum cryptography. CBOMs enable that planning by identifying where quantum‑vulnerable algorithms appear, prioritising high‑impact systems, and tracking replacements over time. A machine‑readable CBOM also supports cryptographic agility and incident response: if an algorithm, library, or certificate lifecycle becomes non‑compliant or vulnerable, the CBOM indicates which products and systems are affected and where mitigations must be applied first. == Standards and tooling == CycloneDX (OWASP): Native CBOM modelling (v1.6+) for algorithms, certificates, keys/related material, and protocols, with dependency semantics and examples. The project publishes a CBOM guide and use‑case profiles (e.g., certificate and algorithm inventories). NIST NCCoE SP 1800‑38 series: Practice guides for PQC migration include enterprise cryptographic discovery methods that produce CBOM‑like inventories and integrate multiple discovery tools. Government automation initiatives: Following M‑23‑02, CISA issued a strategy to migrate to automated cryptography discovery and inventory tools to support agency reporting and continuous inventory management. Open‑source and vendor tools: IBM's CBOMkit and related components generate, analyse, and visualise CBOMs; the IBM CBOM specification work was upstreamed into CycloneDX 1.6. === Data model and interchange (example) === CycloneDX provides machine‑readable encodings (JSON/XML) for CBOM content. The example below (subset) shows an application depending on a crypto library that provides the AES‑256‑GCM algorithm, and the application also depends on a leaf X.509 certificate. See the CycloneDX CBOM guide, JSON reference, and the “Implementation details” use‑case for the semantics of `dependsOn` and `provides`. == Relationship to cybersecurity supply chain initiatives == CBOMs complement SBOM‑focused supply‑chain transparency introduced by U.S. Executive Order 14028 and NTIA/NIST SBOM work. SBOMs document software components; CBOMs add detail on embedded cryptography to support risk management, policy compliance (e.g., disallowing deprecated algorithms), and PQC transition planning.

    Read more →
  • Back-Up Interceptor Control

    Back-Up Interceptor Control

    Backup Interceptor Control (BUIC, ) was the Electronic Systems Division 416M System to backup the SAGE 416L System in the United States and Canada. BUIC deployed Cold War command, control, and coordination systems to SAGE radar stations to create dispersed NORAD Control Centers. == Background == Prior to the SAGE Direction Centers becoming operational, the USAF deployed data link systems at NORAD Control Centers with ground computers for controlling crewed interceptors. After SAGE IBM AN/FSQ-7 Combat Direction Centrals became operational and the Super Combat Centers with improved (digital) computers were cancelled, a backup to SAGE was planned in the event the above-ground SAGE Air Defense Direction Center failed. == General Electric AN/GPA-37 Course Directing Group == BUIC began with deployment of General Electric AN/GPA-37 Course Directing Groups to several Long Range Radar stations. Units designated included the "U.S. Air Force 858th Air Defense Group (BUIC) [which became] a permanent operating facility" at Naval Air Station Fallon in Nevada. == BUIC II == BUIC II was used to command and control sites using the Burroughs AN/GSA-51 Radar Course Directing Group. North Truro AFS became the first ADC installation configured for BUIC II. == BUIC III == The AN/GYK-19 (initially AN/GSA-51A) was an upgraded version of the BUIC II system designated AN/GSA-51A and required a larger building than the AN/GSA-51. The first BUIC III site was Fort Fisher AFS, and Air Defense Command's was first installed at Fort Fisher Air Force Station, North Carolina. Although more advanced systems were contemplated, the final design of the BUIC III system was an upgraded version of the BUIC II with around twice the performance. == Closure and upgrade == In 1972, the USAF decided to shut down most of the BUIC sites; most of the sites mothballed by 1974, except for the BUIC III site at Tyndall Air Force Base. In Canada the BUIC site at Senneterre was shut down, but St Margarets remained open. The remaining sites were closed between 1983-1984 when SAGE was replaced by the Joint Surveillance System. The AN/FYQ-47 Common Digitizer for the Joint Surveillance System, and the Radar Video Data Processor (RVDP) was a combined system for the Air Force and Federal Aviation Administration (FAA), it replaced the SAGE Burroughs AN/FST-2 Coordinate Data Transmitting Sets.

    Read more →