AI Data Bias

AI Data Bias — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Operation Serenata de Amor

    Operation Serenata de Amor

    Operation Serenata de Amor is an artificial intelligence project designed to analyze public spending in Brazil. The project has been funded by a recurrent financing campaign since September 7, 2016, and came in the wake of major scandals of misappropriation of public funds in Brazil, such as the Mensalão scandal and what was revealed in the Operation Car Wash investigations. The analysis began with data from the National Congress then expanded to other types of budget and instances of government, such as the Federal Senate. The project is built through collaboration on GitHub and using a public group with more than 600 participants on Telegram. The name "Serenata de Amor," which means "serenade of love," was taken from a popular cashew cream bonbon produced by Chocolates Garoto in Brazil. == Modules == Throughout development of the project, new modules have been newly introduced in addition to the main repository: The main repository, serenata-de-amor, serves as the starting point for investigative work. Rosie is the robot programmed to identify public funds expenses with discrepancies, starting with CEAP (Quota for Exercise of Parliamentary Activity); it analyzes each of the reimbursements requested by the deputies and senators, indicating the reasons that lead it to believe they are suspicious. From Rosie was born whistleblower, which tweets under the name of @RosieDaSerenata, distributing the results found on social media. Jarbas (Github repository) is a data visualization tool which shows a complete list of reimbursements made available by the Chamber of Deputies and mined by Rosie. Toolbox is a Python installable package that supports the development of Serenata de Amor and Rosie. == History == Operation Serenata de Amor is an Artificial intelligence project for analysis of public expenditures. It was conceived in March 2016 by data scientist Irio Musskopf, sociologist Eduardo Cuducos and entrepreneur Felipe Cabral. The project was financed collectively in the Catarse platform, where it reached 131% of the collection goal paying 3 months of project development. Ana Schwendler, also a data scientist, Pedro Vilanova "Tonny", data journalist, Bruno Pazzim, software engineer, Filipe Linhares, a frontend engineer, Leandro Devegili, an entrepreneur and André Pinho took the first steps towards constructing the platform, such as collecting and structuring the first datasets. Jessica Temporal, data scientist and Yasodara Córdova "Yaso", researcher, Tatiana Balachova "Russa", UX designer, joined the project after the financing took place. The members created a recurring financing campaign, expanding the analysis of public spending to the Federal Senate. Donors make monthly payments ranging from 5 BRL to 200 BRL to maintain group activities. The monthly amount collected is around 10,000 BRL. == Results == In January 2017, concluding the period financed by the initial campaign, the group carried out an investigation into the suspicious activities found by the data analysis system. 629 complaints were made to the Ombudsman's Office of the Chamber of Deputies, questioning expenses of 216 federal deputies. In addition, the Facebook project page has more than 25,000 followers, and users frequently cite the operation as a benchmark in transparency in the Brazilian government. One of the examples of results obtained by the operation is the case of the Deputy who had to return about 700 BRL to the House after his expenses were analyzed by the platform. The platform was able to analyze more than 3 million notes, raising about 8,000 suspected cases in public spending. The community that supports the work of the team benefits from open source repositories, with licenses open for the collaboration. So much so that the two main data scientists of the project presented it at the CivicTechFest in Taipei, obtaining several mentions even in the international press. The technical leader presented the project in Poland during DevConf2017 in Kraków. It was also presented in the Google News Lab in 2017. It was presented by Yaso, when she was the Director of the initiative, at the MIT Media Lab/Berkman Klein Center Initiative for Artificial Intelligence ethics, and at the Artificial Intelligence and Inclusion Symposium, an initiative of the Global Network of Internet & Society Centers (NoC). It was also presented both by Irio and Yaso at the Digital Harvard Kennedy School, over a lunch seminar, where the transparency of the platform and the main solutions found were discussed, so that the code and data are always available to verify its suitability. This infographic provides information about the first results of Operation Serenata de Amor, a project that analyzes open data on public spending to find discrepancies. The project was presented by Yaso to the House Audit and Control Committee of the Chamber of Deputies in August 2017, and raised the interest of House officials who work with open data. The operation has been a source of inspiration for other civic projects that aim to work with similar goals, demonstrating the broader impact of artificial intelligence also in industry in Brazil. Participation of several team members in events throughout Brazil and abroad can be found on the Internet, such as presentation at OpenDataDay, held at Calango Hackerspace in the Federal District, Campus Party Bahia, Campus Party Brasilia, Friends of Tomorrow, XIII National Meeting of Internal Control, in the event USP Talks Hackfest against corruption in João Pessoa, the latter being also highlighted in the National Press.

    Read more →
  • Esdat

    Esdat

    ESdat is a data management, analysis and reporting software for environmental and groundwater data, developed by EarthScience Information Systems (EScIS). It is used to manage many types of environmental data including laboratory chemistry (analytical results, QA data, lab sample planning, and electronic Chain of Custody), field chemistry (water, gas, and soil), hydrogeological data (groundwater, borehole and well construction, lithological, geotechnical and stratigraphic, and LNAPL), meteorological data (rain, wind, and temperature), emission data (dust deposition, HiVol, air quality, and noise) and logger data. Data can be compared against environmental standards or site-specific trigger levels to generate exceedence tables, time series graphs, maps, statistics, and other outputs. ESdat integrates with Power BI and ArcGIS and data can also be exported in a range of other database formats, including USEPA Regions 2,4 & 5, and NYS DEC. ESdat is used by environmental consultants, government, mining and industry for validation, interrogation, and reporting of data derived from complex environmental programs, such as contaminated sites, groundwater investigations, and regulatory compliance for landfills or mining operations.

    Read more →
  • Genigraphics

    Genigraphics

    Genigraphics is a large-format printing service bureau specializing in providing poster session services to medical and scientific conferences throughout the US and Canada. The company began in 1973 as a division of General Electric. == History == Genigraphics began as a computer graphics system, developed by General Electric in the late 1960s, for NASA to use in space flight simulation. The technologies thus developed provided a foundation for the company's expansion into the commercial market. The Computed Images System & Services division (CISS, to become Genigraphics Corporation) of GE delivered the first presentation graphics system to Amoco Oil's corporate headquarters in 1973. It was named the 100 Series, and was based on DEC's PDP 11 series of mini computer systems. The first Genigraphics systems (100 Series and 100A Series) used an array of buttons, dials, knobs and joysticks, along with a built in keyboard, as the means of user interface. The PDP-11/40 computer was housed in a tall cabinet and used random access magnetic tape drives (DECtape) for storing completed presentations. The graphics generator (Forox recorder) was capable of outputting 2,000 line resolution, suitable for 35mm and 72mm film and large sheet film positive using larger cassettes for recording. 4000 and 8000 line resolution was later achieved with duplex scanning and 4x scanning by modifying to the Forox recorder's settings menu. Subsequent models (100B,C,D,D+ and D+/GVP) replaced the knobs and dials with an on screen, text based menu system, a graphics tablet and a pen. The pen/tablet combination gave way to a mouse like device in later models, and served to provide the interface with the graphics tools. User interaction with the computer for functions such as media initialization or modem to modem data transfer required a DECwriter serial terminal. In 1982, GE divested the Genigraphics division along with a host of other "non essential" business units (Genitext, Geniponics) and Genigraphics Corporation was born. Shortly after the divestiture, the headquarters of Genigraphics was moved from Liverpool, New York to Saddle Brook, New Jersey. Major success followed as the company grew exponentially over the next few years selling both systems and slide creation services. Genigraphics film recorders produced high-resolution digital images on 35mm film. The computer-generated scenes for The Last Starfighter were calculated on a Cray X-MP supercomputer and mastered with a Genigraphics film recorder. At its peak, Genigraphics Corporation employed roughly 300 people in 24 offices worldwide, with revenues upwards of $70 million annually. By the late 1980s Genigraphics saw demand for its proprietary systems dwindle and began selling the MASTERPIECE 8770 film recorder and GRAFTIME software as a peripheral for DEC Vaxes, IBM PC AT’s, and Mac NuBus machines. But the MASTERPIECE film recorder proved too expensive to sell in volume. In 1988, the company began a partnership with Microsoft to help develop the PowerPoint software. In exchange, every copy of PowerPoint included a “Send to Genigraphics” link to have files sent to a Genigraphics service bureau to be produced as 35mm slides. This partnership continued until 2001. In 1989, after three years of flat revenue, Genigraphics sold its hardware business in order to focus on its service bureau business and partnership with Microsoft via PowerPoint. In 1994, all assets of Genigraphics, including equipment, software development, in-house artwork, trademarks, and rights to the Microsoft partnership, were sold to InFocus Corporation of Wilsonville, Oregon who continued to operate under the Genigraphics brand name. The twenty-four service bureaus were consolidated to a 20,000 square foot facility next to the FedEx hub in Memphis, Tennessee. This allowed PowerPoint slide orders to be received until 10pm and delivered across the United States by the following morning. In 1995, InFocus registered www.genigraphics.com and was among the first to offer a form of ecommerce allowing 35mm slides, color prints and transparencies, printed booklets, and digital projectors to be purchased online. In 1998, then current management bought Genigraphics from InFocus and have operated it continuously ever since as Genigraphics LLC. That same year, InFocus projector rentals were added to the “Send to Genigraphics” link in PowerPoint and Genigraphics became the rental and repair center for all InFocus national accounts. It also marked Genigraphics entry into the new industry of large format printing; leveraging their knowledge of, and access to, PowerPoint programming code to develop a proprietary printer driver to output directly to an Epson 9500 wide format printer. At the time, Genigraphics was the exclusive 35mm slide vendor for all Kinko’s stores in the United States and poster printing was added to the arrangement. In 2003, Genigraphics closed their 35mm slide E6 photo lab – one of the last high-volume commercial E6 labs in the US – and expanded their large format printing capabilities. Since 2003, Genigraphics has become a major player in the poster session market, providing printing and on-site services to medical and scientific conferences throughout the US and Canada. As of February 2019, over 150,000 medical or scientific ‘ePosters’ are made available through their ResearchPosters.com archive service. === Partnership with Microsoft and development of PowerPoint === As presentations began to be created on personal computers in the late 80’s, Genigraphics sought presentation software partners in Silicon Valley who would be interested in sending files to Genigraphics via dial-up modem to be produced on 35mm slides. In 1987, Michael Beetner, Director of Marketing Planning for Genigraphics, met with Robert Gaskins, head of Microsoft's Graphics Business Unit, who was leading the development of the newly released PowerPoint software. A joint development agreement between Microsoft and Genigraphics was agreed upon and announced at Mac World 1988. According to Erica Robles-Anderson and Patrik Svensson, "It would be hard to overestimate Genigraphics’ influence on PowerPoint. PowerPoint 2.0 was designed for Genigraphics film recorders. It shipped with Genigraphics color palettes, schemes, and the distinctively Genigraphics color-gradient backgrounds. The application contained a ‘Send to Genigraphics’ menu item that wrote the presentation to floppy disk or transmitted the order directly via modem. Within three and a half months PowerPoint orders accounted for ten percent of revenue at Genigraphics service centers. PowerPoint 3.0 was even more intimately dependent upon Genigraphics. The software incorporated a collection of clip art images and symbols that had been produced by hundreds of artists at dozens of service centers across tens of thousands of presentations. Genigraphics artists designed PowerPoint 3.0 colors, templates, and sample presentations. The software even used Genigraphics (rather than Excel) chart style. Bar charts were rendered two-dimensionally with apparent thickness added to make them seemingly recede from the axes. The technique made it easier for viewers to compare bar heights and estimate values from axis ticks and labels. Pie charts were handled analogously. Microsoft paid Genigraphics to produce more than 500 clip art drawings and symbols used in Microsoft programs.” In exchange for Genigraphics development efforts, Microsoft included a “Send to Genigraphics” link in every copy of PowerPoint through the 10.0 version (2000/2001). The arrangement came to an end when Microsoft restructured as a result of anti-trust lawsuits.

    Read more →
  • Mobile Passport Control

    Mobile Passport Control

    Mobile Passport Control (MPC) is a mobile app that enables eligible travelers entering the United States to submit their passport information and customs declaration form to Customs and Border Protection via smartphone or tablet and go through the inspections process using an expedited lane. It is available to "U.S. citizens, U.S. lawful permanent residents, Canadian B1/B2 citizen visitors and returning Visa Waiver Program travelers with approved ESTA". The app is available on iOS and Android devices and is operational at 34 US airports, 14 international airports offering preclearance facilities, and 4 seaports. The use of Mobile Passport Control operations have increased threefold from 2016 to 2017. == History == Mobile Passport Control operations were launched in Atlanta at the Hartsfield-Jackson International Airport in 2016 and is now available at 34 U.S. airports, 14 international airports that offer preclearance and 4 U.S. cruise ports. The Mobile Passport app is authorized by CBP and sponsored by the Airports Council International-North America, Boeing, and the Port of Everglades. Airside Mobile, Inc. secured a Series A funding of $6 million in the fall of 2017. == How it works == During the customs process at the Federal Inspection Service (FIS) area of a U.S. airport, travelers arriving from international locations typically wait in long lines before presenting passports and paperwork and verbally answering questions made by CBP officials. Eligible travelers who have downloaded the Mobile Passport app can expedite this process by submitting information regarding their passport and trip details, and a newly-taken selfie, via their mobile device to CBP officials, then access an expedited line. Mobile Passport Control users will be required to show their physical passport(s) and briefly talk to a CBP officer. == Locations == === US airports === Atlanta (ATL) Baltimore (BWI) Boston (BOS) Charlotte (CLT) Chicago (ORD) Dallas/Ft Worth (DFW) Denver (DEN) Detroit (DTW) as of 7/2024 Ft. Lauderdale (FLL) Honolulu (HNL) Houston (HOU and IAH) Kansas City (MCI) Las Vegas (LAS) Los Angeles (LAX) Miami (MIA) Minneapolis (MSP) New York (JFK) Newark (EWR) Oakland (OAK) Orlando (MCO) Palm Beach (PBI) Philadelphia (PHL) Phoenix (PHX) Pittsburgh (PIT) Portland (PDX) Sacramento (SMF) San Diego (SAN) San Francisco (SFO) San Jose (SJC) San Juan (SJU) Seattle (SEA) Tampa (TPA) Washington Dulles (IAD) === International Preclearance locations === Abu Dhabi (AUH) Aruba (AUA) Bermuda (BDA) Calgary (YYC) Dublin (DUB) Edmonton (YEG) Halifax (YHZ) Montreal (YUL) Nassau (NAS) Ottawa (YOW) Shannon (SNN) Toronto (YYZ) Vancouver (YVR) Winnipeg (YWG) Sepinggan (BPN) === Seaports === Fort Lauderdale (PEV) Miami (MSE) San Juan (PUE) West Palm Beach (WPB)

    Read more →
  • ReactiveX

    ReactiveX

    ReactiveX (Rx, also known as Reactive Extensions) is a software library originally created by Microsoft that allows imperative programming languages to operate on sequences of data regardless of whether the data is synchronous or asynchronous. It provides a set of sequence operators that operate on each item in the sequence. It is an implementation of reactive programming and provides a blueprint for the tools to be implemented in multiple programming languages. == Overview == ReactiveX is an API for asynchronous programming with observable streams. Asynchronous programming allows programmers to call functions and then have the functions "callback" when they are done, usually by giving the function the address of another function to execute when it is done. Programs designed in this way often avoid the overhead of having many threads constantly starting and stopping. Observable streams (i.e. streams that can be observed) in the context of Reactive Extensions are like event emitters that emit three events: next, error, and complete. An observable emits next events until it either emits an error event or a complete event. However, at that point it will not emit any more events, unless it is subscribed to again. The examples below use the RxJS implementation of Reactive Extensions for the JavaScript programming language. === Motivation === For sequences of data, it combines the advantages of iterators with the flexibility of event-based asynchronous programming. It also works as a simple promise, eliminating the pyramid of doom that results from multiple layers of callbacks. === Observables and observers === ReactiveX is a combination of ideas from the observer and the iterator patterns and from functional programming. An observer subscribes to an observable sequence. The sequence then sends the items to the observer one at a time, usually by calling the provided callback function. The observer handles each one before processing the next one. If many events come in asynchronously, they must be stored in a queue or dropped. In ReactiveX, an observer will never be called with an item out of order or (in a multi-threaded context) called before the callback has returned for the previous item. Asynchronous calls remain asynchronous and may be handled by returning an observable. It is similar to the iterators pattern in that if a fatal error occurs, it notifies the observer separately (by calling a second function). When all the items have been sent, it completes (and notifies the observer by calling a third function). The Reactive Extensions API also borrows many of its operators from iterator operators in other programming languages. Reactive Extensions is different from functional reactive programming as the Introduction to Reactive Extensions explains: It is sometimes called "functional reactive programming" but this is a misnomer. ReactiveX may be functional, and it may be reactive, but "functional reactive programming" is a different animal. One main point of difference is that functional reactive programming operates on values that change continuously over time, while ReactiveX operates on discrete values that are emitted over time. (See Conal Elliott's work for more-precise information on functional reactive programming.) === Reactive operators === An operator is a function that takes one observable (the source) as its first argument and returns another observable (the destination, or outer observable). Then for every item that the source observable emits, it will apply a function to that item, and then emit it on the destination Observable. It can even emit another Observable on the destination observable. This is called an inner observable. An operator that emits inner observables can be followed by another operator that in some way combines the items emitted by all the inner observables and emits the item on its outer observable. Examples include: switchAll – subscribes to each new inner observable as soon as it is emitted and unsubscribes from the previous one. mergeAll – subscribes to all inner observables as they are emitted and outputs their values in whatever order it receives them. concatAll – subscribes to each inner observable in order and waits for it to complete before subscribing to the next observable. Operators can be chained together to create complex data flows that filter events based on certain criteria. Multiple operators can be applied to the same observable. Some of the operators that can be used in Reactive Extensions may be familiar to programmers who use functional programming language, such as map, reduce, group, and zip. There are many other operators available in Reactive Extensions, though the operators available in a particular implementation for a programming language may vary. ==== Reactive operator examples ==== Here is an example of using the map and reduce operators. We create an observable from a list of numbers. The map operator will then multiply each number by two and return an observable. The reduce operator will then sum up all the numbers provided to it (the value of 0 is the starting point). Calling subscribe will register an observer that will observe the values from the observable produced by the chain of operators. With the subscribe method, we are able to pass in an error-handling function, called whenever an error is emitted in the observable, and a completion function when the observable has finished emitting items. ==== Usage in stream-oriented programming ==== Certain RxJS primitives such as BehaviorSubject make it possible to create pure stateful streams to track application state of arbitrary complexity in simple terms. The button below will feed an event to the stream, which in turn will re-emit the next natural number every time, back into the tag that follows and displays the count of clicks detected. Libraries such as Rimmel.js, designed around RxJS Observables, enable integration between reactive streams and the HTML DOM: == History == Reactive Extensions was created by the Cloud Programmability Team at Microsoft around 2011, as a byproduct of a larger effort called Volta. It was originally intended to provide an abstraction for events across different tiers in an application to support tier splitting in Volta. The project's logo represents an electric eel, which is a reference to Volta. The extensions suffix in the name is a reference to the Parallel Extensions technology which was invented around the same time; the two are considered complementary. The initial implementation of Rx was for .NET Framework and was released on June 21, 2011. Later, the team started the implementation of Rx for other platforms, including JavaScript and C++. The technology was released as open source in late 2012, initially on CodePlex. Later, the code moved to GitHub and has been ported to several other languages, including Go, Java, Kotlin, PHP and Rust.

    Read more →
  • Supervisor Mode Access Prevention

    Supervisor Mode Access Prevention

    Supervisor Mode Access Prevention (SMAP) is a feature of some CPU implementations such as the Intel Broadwell microarchitecture that allows supervisor mode programs to optionally set user-space memory mappings so that access to those mappings from supervisor mode will cause a trap. This makes it harder for malicious programs to "trick" the kernel into using instructions or data from a user-space program. == History == Supervisor Mode Access Prevention is designed to complement Supervisor Mode Execution Prevention (SMEP), which was introduced earlier. SMEP can be used to prevent supervisor mode from unintentionally executing user-space code. SMAP extends this protection to reads and writes. == Benefits == Without Supervisor Mode Access Prevention, supervisor code usually has full read and write access to user-space memory mappings (or has the ability to obtain full access). This has led to the development of several security exploits, including privilege escalation exploits, which operate by causing the kernel to access user-space memory when it did not intend to. Operating systems can block these exploits by using SMAP to force unintended user-space memory accesses to trigger page faults. Additionally, SMAP can expose flawed kernel code which does not follow the intended procedures for accessing user-space memory. However, the use of SMAP in an operating system may lead to a larger kernel size and slower user-space memory accesses from supervisor code, because SMAP must be temporarily disabled any time supervisor code intends to access user-space memory. == Technical details == Processors indicate support for Supervisor Mode Access Prevention through the Extended Features CPUID leaf. SMAP is enabled when memory paging is active and the SMAP bit in the CR4 control register is set. SMAP can be temporarily disabled for explicit memory accesses by setting the EFLAGS.AC (Alignment Check) flag. The stac (Set AC Flag) and clac (Clear AC Flag) instructions can be used to easily set or clear the flag. When the SMAP bit in CR4 is set, explicit memory reads and writes to user-mode pages performed by code running with a privilege level less than 3 will always result in a page fault if the EFLAGS.AC flag is not set. Implicit reads and writes (such as those made to descriptor tables) to user-mode pages will always trigger a page fault if SMAP is enabled, regardless of the value of EFLAGS.AC. == Operating system support == Linux kernel support for Supervisor Mode Access Prevention was implemented by H. Peter Anvin. It was merged into the mainline Linux 3.7 kernel (released December 2012) and it is enabled by default for processors which support the feature. FreeBSD has supported Supervisor Mode Execution Prevention since 2012 and Supervisor Mode Access Prevention since 2018. OpenBSD has supported Supervisor Mode Access Prevention and the related Supervisor Mode Execution Prevention since 2012, with OpenBSD 5.3 being the first release with support for the feature enabled. NetBSD support for Supervisor Mode Execution Prevention (SMEP) was implemented by Maxime Villard in December 2015. Support for Supervisor Mode Access Prevention (SMAP) was also implemented by Maxime Villard, in August 2017. NetBSD 8.0 was the first release with both features supported and enabled. Haiku support for Supervisor Mode Execution Prevention (SMEP) was implemented by Jérôme Duval in January 2018. macOS has support for SMAP at least since macOS 10.13 released 2017.

    Read more →
  • Randonautica

    Randonautica

    Randonautica (a portmanteau of "random" + "nautica") is an app launched on February 22, 2020 founded by Auburn Salcedo and Joshua Lengfelder. It randomly generates coordinates that encourages the user to explore their local area and report what is found. According to its creators, the app is "an attractor of strange things," letting one choose specific coordinates based on a specific theme. It gained controversy after a report of two teenagers coincidentally finding a corpse while using the application. == Overview == The app, which creators claim to be inspired by chaos theory and Guy Debord's Theory of the Dérive, offers its users three types of coordinates to choose from: an attractor, a void, or an anomaly. The app has a cult following on YouTube and TikTok and there is a subreddit made by the creators for users of the app. == History == 29-year-old circus performer Joshua Lengfelder discovered a bot called Fatum Project in a fringe science chat group on Telegram in January 2019. According to The New York Times, "He absorbed the project’s theories about how random exploration could break people out of their predetermined realities, and how people could influence random outcomes with their minds." Lengfelder then created a Telegram bot using Fatum Project's code, generating coordinates. He then created the subreddit r/randonauts in March. In October, developer Simon Nishi McCorkindale made the bot's webpage. With the help of Auburn Salcedo, chief executive of a TV agency, both created Randonauts LLC. Salcedo became the chief operating officer while Lengfelder was the CEO. The app, called Randonautica, was launched on February 22, 2020. Later the same year the app and back-end got completely overhauled by a new team of developers and got a more visual and friendlier design and logo. In April 2022 Lengfelder exited Randonauts LLC and Auburn Salcedo became CEO. == Reception == The app has as many as 10.8 million users as of July 2020, gaining popularity amid the COVID-19 pandemic in the United States as restrictions have been lightened. Emma Chamberlain made a YouTube video about the app that helped increase its following. i-D reported that the hashtag #randonautica has gained 176.5 million views on TikTok, although it has not marketed itself yet. === Controversy === With the app's popularity, users started reporting coincidences which many find unsettling. The majority of reports were from TikTok and Reddit, as well as Telegram. The most notable controversy involved a group of people heading to a beach in Duwamish Head, Puget Sound, West Seattle per the app, where they found a bag with two dead bodies, a 27-year-old male and a 36-year-old female, as reported by the Seattle Police homicide detectives. In August 2020, police arrested and charged their landlord, Michael Lee Dudley, in connection with the murders. In March 2021, Dudley was denied bail while other people were under suspicion of aiding Dudley in the dismemberment and disposal of the bodies, but no one else had been charged. This has caused speculation that the app has an intended, puzzle-like theme. However, Lengfelder stated that it is "a shocking coincidence." Salcedo called the videos fake, and that "It’s so hard to manage, because people are really taking creative liberties after seeing how much traction the app is getting in that fear factor." In 2022, Michael Dudley was convicted of second degree murder for killing both victims, who were identified as Jessica Lewis and Austin Wenner. He was sentenced to 46 years in prison the following year. In their questions page, Randonautica's creators have said that if the app generates coordinates inside a private property, it is a violation of their terms and conditions to trespass. In addition, Randonautica has also received allegations that the app is used for human trafficking, which its creators have denied, saying that data collected by the app are anonymous. It also ensured that the app is not designed to violate religious customs, saying that "the app is simply a tool. Just as a knife can be used either to prepare dinner or to cut somebody."

    Read more →
  • List of security-focused operating systems

    List of security-focused operating systems

    This is a list of operating systems specifically focused on security. Similar concepts include security-evaluated operating systems that have achieved certification from an auditing organization, and trusted operating systems that provide sufficient support for multilevel security and evidence of correctness to meet a particular set of requirements. == Linux == === Android-based === GrapheneOS is a security-focused, Android-based mobile OS that uses a hardened kernel, C library, custom memory allocator (hardened_malloc), and a hardened Chromium-based browser named Vanadium. It also offers privacy/security features, such as Duress PIN/Password or disabling the USB-C port at a driver/hardware level to avoid exploitation. It deploys exploit mitigations such as hardware-based memory tagging, secure app spawning, restricted dynamic code loading, and more. === Debian-based === Linux Kodachi is a security-focused operating system. Tails is aimed at preserving privacy and anonymity. KickSecure is a security-focused Linux distribution that aims to be "hardened by default". It uses network hardening, kernel hardening, Strong Linux User Account Isolation, better randomness, root access restrictions, and app-specific hardening. Whonix is an anonymity focused operating system based on KickSecure. It consists of two virtual machines, And all communications are routed through Tor. === Other Linux distributions === Alpine Linux is designed to be small, simple, and secure. It uses musl, BusyBox, and OpenRC instead of the more commonly used glibc, GNU Core Utilities, and systemd. Owl - Openwall GNU/Linux, a security-enhanced Linux distribution for servers. Secureblue, a Fedora Silverblue based distro that uses a hardened kernel, custom memory allocator (hardened_malloc), Trivalent, a security-focused, Chromium-based browser inspired by Vanadium, and many other exploit mitigations. == BSD == OpenBSD is a Unix-like operating system that emphasizes portability, standardization, correctness, proactive security, and integrated cryptography. == Xen == Qubes OS aims to provide security through isolation. Isolation is provided through the use of virtualization technology. This allows the segmentation of applications into secure virtual machines.

    Read more →
  • AI Overviews

    AI Overviews

    AI Overviews is an artificial intelligence (AI) feature integrated into Google Search that produces AI-generated summaries of search results. The feature has been criticized for its inaccuracy and for reducing website traffic. == History and development == AI Overviews were first introduced as part of Google's Search Generative Experience (SGE), which was unveiled at the Google I/O conference in May 2023. In May 2024 at Google I/O 2024, the feature was rebranded as AI Overviews and launched in the United States. The introduction of AI Overviews was seen as a strategic move to compete with other generative AI advancements, including OpenAI's ChatGPT. By August 2024, AI Overviews was rolled out to several other countries, including the United Kingdom, India, Japan, Brazil, Mexico, and Indonesia, with support for multiple languages. In October 2024, Google expanded the feature globally, making it available in over 100 countries. In December 2024, Botify x Demandsphere released findings stating that when AI Overviews and featured snippets appear together on the search engine results page, they take up approximately 67.1% of the screen on desktop and 75.7% on mobile. Even if content is ranking in the #1 position, it may not be visible to consumers if other visual elements on the results page are more prominent. In March 2025, Google started testing an "AI Mode", where the search results page is AI-generated. The company was also considering adding advertisements to the AI Mode, as they already exist in AI Overviews. As of May 2025, AI Overviews are available in over 200 countries and territories and in more than 40 languages. As of March 2026, Google AI Overviews appear on more than 48% of total Google Search queries, compared to just 6.49% in the previous year (58% year-over-year growth). == Functionality == The AI Overviews feature uses large language models to generate summaries from web content. The overviews are designed to be concise, providing a snapshot of relevant information about the queried topic. Google allows users to adjust the language complexity in summaries, offering both simplified and detailed options. The overviews also include links to sources. According to a June 2025 study by Semrush, the most cited source is Quora, followed by Reddit. == Reception == The feature has faced criticism for inaccuracies, including instances where erroneous or nonsensical content was generated. Depending on what is searched for, the overview may also consist of hallucinated content, such as when searching for idioms that do not exist. In May 2024, Google temporarily restricted the AI tool after it provided suggestions that were seen as nonsensical and harmful, such as telling users to eat rocks or apply glue on pizza. Concerns were also raised by content publishers, who feared a decline in web traffic as users relied on the summaries instead of visiting source websites. A Google patent from 2026 raised the concern of webmasters that Google could entirely replace the landing page of websites by an AI optimized copy of the website in its results. There is also apprehension about the ethical implications of AI-driven content aggregation, including its impact on intellectual property rights and the visibility of smaller content providers. The European Commission announced in December 2025 that they were investigating whether AI Overviews breached European competition law. In response, Google has stated its commitment to improve content validation and refine the algorithms used to filter unreliable information. Google implemented measures to prioritize link placement within AI Overviews, aiming to balance user convenience with the needs of content creators. In January 2026, Google restricted AI Overviews on certain health-related searches following an investigation by The Guardian. == Lawsuits == On February 24, 2025, Chegg sued Alphabet over the AI Overviews feature, claiming that it was leading to students preferring "low-quality, unverified AI summaries", thus violating antitrust law. Chegg also said it was considering either a sale or a take-private transaction. In September 2025, Penske Media Corporation, the publisher of Rolling Stone and The Hollywood Reporter, sued Google, claiming that AI Overviews illegally regurgitate content from their websites and drive off potential site visitors by always appearing on top of the search results while leaving little incentive to see the linked sources. The company stated that "the future of digital media and [...] its integrity [...] is threatened by Google's current actions", alleging that 20% of searches that link to Penske-owned websites show AI Overviews and that the figure is expected to rise. Google spokesperson José Castañeda called the claims "meritless" and stated that "AI Overviews send traffic to a greater diversity of sites." In 2026, Canadian musician Ashley MacIsaac filed a lawsuit against Google claiming that the AI Overview feature had wrongly stated that MacIsaac had been convicted of numerous criminal offences and was on the sex offender registry. He claims this incorrect information led to the cancellation of a December 2025 gig organized by the Sipekne'katik First Nation.

    Read more →
  • Image-based modeling and rendering

    Image-based modeling and rendering

    In computer graphics and computer vision, image-based modeling and rendering (IBMR) methods rely on a set of two-dimensional images of a scene to generate a three-dimensional model and then render some novel views of this scene. The traditional approach of computer graphics has been used to create a geometric model in 3D and try to reproject it onto a two-dimensional image. Computer vision, conversely, is mostly focused on detecting, grouping, and extracting features (edges, faces, etc.) present in a given picture and then trying to interpret them as three-dimensional clues. Image-based modeling and rendering allows the use of multiple two-dimensional images in order to generate directly novel two-dimensional images, skipping the manual modeling stage. == Light modeling == Instead of considering only the physical model of a solid, IBMR methods usually focus more on light modeling. The fundamental concept behind IBMR is the plenoptic illumination function which is a parametrisation of the light field. The plenoptic function describes the light rays contained in a given volume. It can be represented with seven dimensions: a ray is defined by its position ( x , y , z ) {\displaystyle (x,y,z)} , its orientation ( θ , ϕ ) {\displaystyle (\theta ,\phi )} , its wavelength ( λ ) {\displaystyle (\lambda )} and its time ( t ) {\displaystyle (t)} : P ( x , y , z , θ , ϕ , λ , t ) {\displaystyle P(x,y,z,\theta ,\phi ,\lambda ,t)} . IBMR methods try to approximate the plenoptic function to render a novel set of two-dimensional images from another. Given the high dimensionality of this function, practical methods place constraints on the parameters in order to reduce this number (typically to 2 to 4). == IBMR methods and algorithms == View morphing generates a transition between images Panoramic imaging renders panoramas using image mosaics of individual still images Lumigraph relies on a dense sampling of a scene Space carving generates a 3D model based on a photo-consistency check

    Read more →
  • Truth discovery

    Truth discovery

    Truth discovery (also known as truth finding) is the process of choosing the actual true value for a data item when different data sources provide conflicting information on it. Several algorithms have been proposed to tackle this problem, ranging from simple methods like majority voting to more complex ones able to estimate the trustworthiness of data sources. Truth discovery problems can be divided into two sub-classes: single-truth and multi-truth. In the first case only one true value is allowed for a data item (e.g birthday of a person, capital city of a country). While in the second case multiple true values are allowed (e.g. cast of a movie, authors of a book). Typically, truth discovery is the last step of a data integration pipeline, when the schemas of different data sources have been unified and the records referring to the same data item have been detected. == General principles == The abundance of data available on the web makes more and more probable to find that different sources provide (partially or completely) different values for the same data item. This, together with the fact that we are increasing our reliance on data to derive important decisions, motivates the need of developing good truth discovery algorithms. Many currently available methods rely on a voting strategy to define the true value of a data item. Nevertheless, recent studies, have shown that, if we rely only on majority voting, we could get wrong results even in 30% of the data items. The solution to this problem is to assess the trustworthiness of the sources and give more importance to votes coming from trusted sources. Ideally, supervised learning techniques could be exploited to assign a reliability score to sources after hand-crafted labeling of the provided values; unfortunately, this is not feasible since the number of needed labeled examples should be proportional to the number of sources, and in many applications the number of sources can be prohibitive. == Single-truth vs multi-truth discovery == Single-truth and multi-truth discovery are two very different problems. Single-truth discovery is characterized by the following properties: only one true value is allowed for each data item; different values provided for a given data item oppose to each other; values and sources can either be correct or erroneous. While in the multi-truth case the following properties hold: the truth is composed by a set of values; different values could provide a partial truth; claiming one value for a given data item does not imply opposing to all the other values; the number of true values for each data item is not known a priori. Multi-truth discovery has unique features that make the problem more complex and should be taken into consideration when developing truth-discovery solutions. The examples below point out the main differences of the two methods. Knowing that in both examples the truth is provided by source 1, in the single truth case (first table) we can say that sources 2 and 3 oppose to the truth and as a result provide wrong values. On the other hand, in the second case (second table), sources 2 and 3 are neither correct nor erroneous, they instead provide a subset of the true values and at the same time they do not oppose the truth. == Source trustworthiness == The vast majority of truth discovery methods are based on a voting approach: each source votes for a value of a certain data item and, at the end, the value with the highest vote is select as the true one. In the more sophisticated methods, votes do not have the same weight for all the data sources, more importance is indeed given to votes coming from trusted sources. Source trustworthiness usually is not known a priori but estimated with an iterative approach. At each step of the truth discovery algorithm the trustworthiness score of each data source is refined, improving the assessment of the true values that in turn leads to a better estimation of the trustworthiness of the sources. This process usually ends when all the values reach a convergence state. Source trustworthiness can be based on different metrics, such as accuracy of provided values, copying values from other sources and domain coverage. Detecting copying behaviors is very important, in fact, copy allows to spread false values easily making truth discovery very hard, since many sources would vote for the wrong values. Usually systems decrease the weight of votes associated to copied values or even don’t count them at all. == Single-truth methods == Most of the currently available truth discovery methods have been designed to work well only in the single-truth case. Below are reported some of the characteristics of the most relevant typologies of single-truth methods and how different systems model source trustworthiness. === Majority voting === Majority voting is the simplest method, the most popular value is selected as the true one. Majority voting is commonly used as a baseline when assessing the performances of more complex methods. === Web-link based === These methods estimate source trustworthiness exploiting a similar technique to the one used to measure authority of web pages based on web links. The vote assigned to a value is computed as the sum of the trustworthiness of the sources that provide that particular value, while the trustworthiness of a source is computed as the sum of the votes assigned to the values that the source provides. === Information-retrieval based === These methods estimate source trustworthiness using similarity measures typically used in information retrieval. Source trustworthiness is computed as the cosine similarity (or other similarity measures) between the set of values provided by the source and the set of values considered true (either selected in a probabilistic way or obtained from a ground truth). === Bayesian based === These methods use Bayesian inference to define the probability of a value being true conditioned on the values provided by all the sources. P ( v ∣ ψ ( o ) ) = P ( ψ ( o ) ∣ v ) ⋅ P ( v ) P ( ψ ( o ) ) {\displaystyle P(v\mid \psi (o))={\frac {P(\psi (o)\mid v)\cdot P(v)}{P(\psi (o))}}} where v {\displaystyle \textstyle v} is a value provided for a data item o {\displaystyle \textstyle o} and ψ ( o ) {\displaystyle \textstyle \psi (o)} is the set of the observed values provided by all the sources for that specific data item. The trustworthiness of a source is then computed based on the accuracy of the values that provides. Other more complex methods exploit Bayesian inference to detect copying behaviors and use these insights to better assess source trustworthiness. == Multi-truth methods == Due to its complexity, less attention has been devoted to the study of the multi-truth discovery Below are reported two typologies of multi-truth methods and their characteristics. === Bayesian based === These methods use Bayesian inference to define the probability of a group of values being true conditioned on the values provided by all the data sources. In this case, since there could be multiple true values for each data item, and sources can provide multiple values for a single data item, it is not possible to consider values individually. An alternative is to consider mappings and relations between set of provided values and sources providing them. The trustworthiness of a source is then computed based on the accuracy of the values that provides. More sophisticated methods also consider domain coverage and copying behaviors to better estimate source trustworthiness. === Probabilistic Graphical Models based === These methods use probabilistic graphical models to automatically define the set of true values of given data item and also to assess source quality without need of any supervision. == Applications == Many real-world applications can benefit from the use of truth discovery algorithms. Typical domains of application include: healthcare, crowd/social sensing, crowdsourcing aggregation, information extraction and knowledge base construction. Truth discovery algorithms could be also used to revolutionize the way in which web pages are ranked in search engines, going from current methods based on link analysis like PageRank, to procedures that rank web pages based on the accuracy of the information they provide.

    Read more →
  • Foreign key

    Foreign key

    A foreign key is a set of attributes in a table that refers to the primary key of another table, linking these two tables. In the context of relational databases, a foreign key is subject to an inclusion dependency constraint that the tuples consisting of the foreign key attributes in one relation, R, must also exist in some other (not necessarily distinct) relation, S; furthermore that those attributes must also be a candidate key in S. In other words, a foreign key is a set of attributes that references a candidate key. For example, a table called TEAM may have an attribute, MEMBER_NAME, which is a foreign key referencing a candidate key, PERSON_NAME, in the PERSON table. Since MEMBER_NAME is a foreign key, any value existing as the name of a member in TEAM must also exist as a person's name in the PERSON table; in other words, every member of a TEAM is also a PERSON. == Summary == The table containing the foreign key is called the child table, and the table containing the candidate key is called the referenced or parent table. In database relational modeling and implementation, a candidate key is a set of zero or more attributes, the values of which are guaranteed to be unique for each tuple (row) in a relation. The value or combination of values of candidate key attributes for any tuple cannot be duplicated for any other tuple in that relation. Since the purpose of the foreign key is to identify a particular row of referenced table, it is generally required that the foreign key is equal to the candidate key in some row of the primary table, or else have no value (the NULL value.). This rule is called a referential integrity constraint between the two tables. Because violations of these constraints can be the source of many database problems, most database management systems provide mechanisms to ensure that every non-null foreign key corresponds to a row of the referenced table. For example, consider a database with two tables: a CUSTOMER table that includes all customer data and an ORDER table that includes all customer orders. Suppose the business requires that each order must refer to a single customer. To reflect this in the database, a foreign key column is added to the ORDER table (e.g., CUSTOMERID), which references the primary key of CUSTOMER (e.g. ID). Because the primary key of a table must be unique, and because CUSTOMERID only contains values from that primary key field, we may assume that, when it has a value, CUSTOMERID will identify the particular customer which placed the order. However, this can no longer be assumed if the ORDER table is not kept up to date when rows of the CUSTOMER table are deleted or the ID column altered, and working with these tables may become more difficult. Many real world databases work around this problem by 'inactivating' rather than physically deleting master table foreign keys, or by complex update programs that modify all references to a foreign key when a change is needed. Foreign keys play an essential role in database design. One important part of database design is making sure that relationships between real-world entities are reflected in the database by references, using foreign keys to refer from one table to another. Another important part of database design is database normalization, in which tables are broken apart and foreign keys make it possible for them to be reconstructed. Multiple rows in the referencing (or child) table may refer to the same row in the referenced (or parent) table. In this case, the relationship between the two tables is called a one to many relationship between the referencing table and the referenced table. In addition, the child and parent table may, in fact, be the same table, i.e. the foreign key refers back to the same table. Such a foreign key is known in SQL:2003 as a self-referencing or recursive foreign key. In database management systems, this is often accomplished by linking a first and second reference to the same table. A table may have multiple foreign keys, and each foreign key can have a different parent table. Each foreign key is enforced independently by the database system. Therefore, cascading relationships between tables can be established using foreign keys. A foreign key is defined as an attribute or set of attributes in a relation whose values match a primary key in another relation. The syntax to add such a constraint to an existing table is defined in SQL:2003 as shown below. Omitting the column list in the REFERENCES clause implies that the foreign key shall reference the primary key of the referenced table. Likewise, foreign keys can be defined as part of the CREATE TABLE SQL statement. If the foreign key is a single column only, the column can be marked as such using the following syntax: Foreign keys can be defined with a stored procedure statement. child_table: the name of the table or view that contains the foreign key to be defined. parent_table: the name of the table or view that has the primary key to which the foreign key applies. The primary key must already be defined. col3 and col4: the name of the columns that make up the foreign key. The foreign key must have at least one column and at most eight columns. == Referential actions == Because the database management system enforces referential constraints, it must ensure data integrity if rows in a referenced table are to be deleted (or updated). If dependent rows in referencing tables still exist, those references have to be considered. SQL:2003 specifies 5 different referential actions that shall take place in such occurrences: CASCADE RESTRICT NO ACTION SET NULL SET DEFAULT === CASCADE === Whenever rows in the parent (referenced) table are deleted (or updated), the respective rows of the child (referencing) table with a matching foreign key column will be deleted (or updated) as well. This is called a cascade delete (or update). === RESTRICT === A value cannot be updated or deleted when a row exists in a referencing or child table that references the value in the referenced table. Similarly, a row cannot be deleted as long as there is a reference to it from a referencing or child table. To understand RESTRICT (and CASCADE) better, it may be helpful to notice the following difference, which might not be immediately clear. The referential action CASCADE modifies the "behavior" of the (child) table itself where the word CASCADE is used. For example, ON DELETE CASCADE effectively says "When the referenced row is deleted from the other table (master table), then delete also from me". However, the referential action RESTRICT modifies the "behavior" of the master table, not the child table, although the word RESTRICT appears in the child table and not in the master table! So, ON DELETE RESTRICT effectively says: "When someone tries to delete the row from the other table (master table), prevent deletion from that other table (and of course, also don't delete from me, but that's not the main point here)." RESTRICT is not supported by Microsoft SQL 2012 and earlier. === NO ACTION === NO ACTION and RESTRICT are very much alike. The main difference between NO ACTION and RESTRICT is that with NO ACTION the referential integrity check is done after trying to alter the table. RESTRICT does the check before trying to execute the UPDATE or DELETE statement. Both referential actions act the same if the referential integrity check fails: the UPDATE or DELETE statement will result in an error. In other words, when an UPDATE or DELETE statement is executed on the referenced table using the referential action NO ACTION, the DBMS verifies at the end of the statement execution that none of the referential relationships are violated. This is different from RESTRICT, which assumes at the outset that the operation will violate the constraint. Using NO ACTION, the triggers or the semantics of the statement itself may yield an end state in which no foreign key relationships are violated by the time the constraint is finally checked, thus allowing the statement to complete successfully. === SET NULL, SET DEFAULT === In general, the action taken by the DBMS for SET NULL or SET DEFAULT is the same for both ON DELETE or ON UPDATE: the value of the affected referencing attributes is changed to NULL for SET NULL, and to the specified default value for SET DEFAULT. === Triggers === Referential actions are generally implemented as implied triggers (i.e. triggers with system-generated names, often hidden.) As such, they are subject to the same limitations as user-defined triggers, and their order of execution relative to other triggers may need to be considered; in some cases it may become necessary to replace the referential action with its equivalent user-defined trigger to ensure proper execution order, or to work around mutating-table limitations. Another important limitation appears with transaction isolation: your changes to a row may not be able to fully cascade because the row is ref

    Read more →
  • TensorFlow Hub

    TensorFlow Hub

    TensorFlow Hub (also styled TF Hub) is an open-source machine learning library and online repository that provides TensorFlow model components, called modules. It is maintained by Google as part of the TensorFlow ecosystem and allows developers to discover, publish, and reuse pretrained models for tasks such as computer vision, natural language processing, and transfer learning. == Overview == TensorFlow Hub provides a central platform where developers and researchers can access pre-trained models and integrate them directly into TensorFlow workflows. Each module encapsulates a computation graph and its trained weights, with standardized input and output signatures. Modules can be loaded using the hub.load() function or through Keras integration via hub.KerasLayer, enabling users to perform transfer learning or feature extraction. == History == TensorFlow Hub was announced by Google in March 2018, with the first public version released shortly after. Its introduction coincided with the growing adoption of transfer learning techniques and the need for standardized model packaging. Over time, the hub expanded to include models such as the BERT family, MobileNet, EfficientNet, and the Universal Sentence Encoder. In 2020, research on “Regret selection in TensorFlow Hub” explored the problem of identifying optimal models for downstream tasks given a large repository of alternatives. == Applications == TensorFlow Hub hosts a variety of models across machine learning domains: Natural language processing: BERT, ALBERT language model, and Universal Sentence Encoder. Computer vision: ResNet, Inception (deep learning), MobileNet, EfficientNet. Speech and audio: spectrogram feature extractors and automatic speech recognition models. Multilingual embeddings: cross-lingual and sentence-level representations for machine translation and semantic similarity. Modules are widely used in education, academic research, and industry for prototyping and production deployment.

    Read more →
  • Vx-underground

    Vx-underground

    vx-underground, also known as VXUG, is an educational website about malware and cybersecurity. It claims to have the largest online repository of malware. The site was launched in May, 2019 and has grown to host over 35 million pieces of malware samples. On their account on Twitter, VXUG reports on and verifies cybersecurity breaches. == Reception == Kim Crawley compared the site to VirusTotal and states that vx-underground is more susceptible to suspicion for law enforcement. == Data breach reports == In May 2024, the International Baccalaureate organizations faced allegations over supposed breaches in their IT infrastructure after an incident of examination leaks. Upon inspecting leaked data, VXUG were the first to report that the breach seemed legitimate on the morning of May 6.

    Read more →
  • Crackme

    Crackme

    A crackme is a small computer program designed to test a programmer's reverse engineering skills. Crackmes are made as a legal way to crack software, since no intellectual property is being infringed. == Description == Crackmes often incorporate protection schemes and algorithms similar to those used in proprietary software. However, they can sometimes be more challenging because they may use advanced packing or protection techniques, making the underlying algorithm harder to analyze and modify. == Keygenme == A keygenme is specifically designed for the reverser to not only identify the protection algorithm used in the application but also create a small key generator (keygen) in the programming language of their choice. Most keygenmes, when properly manipulated, can be made self-keygenning. For example, during validation, they might generate the correct key internally and compare it to the user's input. This allows the key generation algorithm to be easily replicated. Anti-debugging and anti-disassembly routines are often used to confuse debuggers or render disassembly output useless. Code obfuscation is also used to further complicate reverse engineering.

    Read more →