AI Coding Interview Questions

AI Coding Interview Questions — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Natural Language Toolkit

    Natural Language Toolkit

    The Natural Language Toolkit, more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for English, written in the Python programming language. It supports classification, tokenization, stemming, tagging, parsing, and semantic reasoning. It was developed by Steven Bird and Edward Loper in the Department of Computer and Information Science at the University of Pennsylvania. NLTK includes graphical demonstrations and sample data. It is accompanied by a book that explains the concepts behind the language processing tasks the toolkit supports, plus a cookbook. NLTK is intended to support research and teaching in NLP and closely related areas, including empirical linguistics, cognitive science, artificial intelligence, information retrieval, and machine learning. NLTK has been used successfully as a teaching tool, as an individual study tool, and as a platform for prototyping and building research systems.

    == Library highlights ==
    Discourse representation
    Lexical analysis: word and text tokenizer
    n-gram and collocations
    Part-of-speech tagger
    Tree model and text chunker for capturing
    Named-entity recognition
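The "n-gram and collocations" highlight can be illustrated with a minimal sketch. It deliberately uses only the Python standard library rather than NLTK's own tokenizers and n-gram utilities, so the function names `tokenize` and `ngrams` below are hypothetical helpers for illustration and not NLTK's API; the regular-expression tokenizer is a simplifying assumption.

```python
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer (an assumption for this sketch): lowercase the text
    # and keep runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    # Slide a window of width n over the token list.
    return list(zip(*(tokens[i:] for i in range(n))))


text = "the quick brown fox jumps over the lazy dog and the quick cat"
tokens = tokenize(text)

# Counting adjacent pairs is the simplest way to surface candidate collocations.
bigram_counts = Counter(ngrams(tokens, 2))
top_bigram, top_count = bigram_counts.most_common(1)[0]
```

Here `("the", "quick")` is the only bigram that occurs twice, so it is the top candidate. A real collocation finder would normally also weigh how often each word occurs on its own.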

  • Vatican News App

    Vatican News App

    The Vatican News App is an official mobile application issued by the Vatican's Dicastery for Communication. Formerly titled The Pope App, the app was launched on January 23, 2013, under the auspices of the Pontifical Council for Social Communications, a now-defunct dicastery that was merged into the Secretariat (now Dicastery) for Communication in March 2016. Initially, The Pope App was available only on iOS devices, but it became available for Android phones at the end of February 2013. The app is available for download on iOS and Android in five languages: English, French, Italian, Portuguese and Spanish. It was originally promoted as an application focused on the figure of the Pope, making it possible to follow the Pope's events while they were taking place. Alerts notified followers and offered access to "official papal-related content in a variety of formats". The app also let users see areas of the Vatican through webcams located throughout St. Peter's Square in Rome. In early 2018, The Pope App was relaunched as the Vatican News App, accompanied by a redesign that eliminated many of the previous version's features, reducing the app to a more conventional news service, with increased emphasis on news from the Vatican and the worldwide Catholic Church and less focus on the day-to-day activities of the Pope.

  • Vero (app)

    Vero (app)

    Vero (stylized as VERO) is a social media platform and mobile app company. Vero markets itself as a social network free from advertisements, data mining and algorithms. == History == The app was founded by French-Lebanese billionaire Ayman Hariri who is the son of former Lebanese prime minister Rafic Hariri. The name is taken from the Italian word for true. The app launched officially in 2015 as an alternative to Facebook and their popular photo-blogging app Instagram. Within weeks of its release the app surged in popularity although users expressed mixed reports with some feeling confused about how the app worked. Cosplayers were early to adopt the app as their photo-sharing platform of choice, favouring the app's pinch and zoom magnification feature over Instagram's zoom feature. Other creative communities soon followed, and the app became popular with niche groups of makeup artists, tattoo artists, and skateboarders. In March 2018, Vero's popularity surged, partly helped by an exodus from Facebook and Instagram following the Cambridge Analytica data scandal. In the wake of the scandal, Vero devised an advertising campaign aimed at defected Facebook and Instagram users, hoping the app's policies and privacy settings would assuage concerns over sharing personal information on the internet. Within the space of one week, the app went from being a small service, akin to Ello or Peach, to being the most downloaded app in eighteen countries. In December 2020, Vero released its most significant update to date, Vero 2.0 which introduced new features including voice and video calls, game and app posts and bookmarks, and refinements to the UI. In October 2021, Vero introduced their Desktop app (beta) with multiple post options and a re-sizable multi-column feed. 
== Concept and funding == Vero's content feed resembles Instagram's, although users can share a wider variety of content and the app has a chronological content feed, whereas Facebook's and Instagram's feeds are algorithm-based. Vero's business plan is also distinct from those of similar social media apps. Whereas competitors such as Facebook and Instagram make money from in-app advertising revenue and the sale of user data, Vero's business plan was to invite the first one million users to use the app for free and then charge any subsequent users a subscription fee. The app was entirely funded by its founder and generated additional revenue by charging affiliate fees when someone buys a product they find on Vero. == Awards == Vero was recognized at the 2021 Webbys, being named an Honoree in the Best Visual Design - Aesthetic category. == Controversies == === Privacy === Vero has faced some criticism over the wording of its manifesto, in particular the statement "Vero only collects the data we believe is necessary to provide users with a great experience and to ensure the security of their accounts." Because this policy does not explicitly state that the app will not sell data on to third parties, some users fear that the need to monetise the app through data might prove too tempting. Users have also complained about not being able to delete their accounts. While this was never the case, the option was hidden deep in the app's settings. === Russian involvement === Although Vero remains transparent about the app's Russian development team, the company has been caught up in concerns about Russian interference on social media platforms. The app's founder, Ayman Hariri, was quick to dismiss the remarks as xenophobic and to defend the nationality of his employees, stating in an interview with Time Magazine: "At the end of the day, where people are from is really not how anybody should judge anyone". 
=== Criticism of the app's founder === Until 2013, Vero's founder Ayman Hariri was deputy CEO and chairman of Saudi Oger, the Saudi Arabian construction company which collapsed in 2017, mired in controversies over the welfare and treatment of its employees. However, Hariri is quick to point out that he divested from the firm in 2014 and that the workers' rights violations occurred after he had left the company.

  • Digital on-screen graphics by country

    Digital on-screen graphics by country

    Digital on-screen graphics by country are the varying logos and differences of digital on-screen graphics in different countries and regions. == Overview == Digital on-screen graphics (DOGs; also called a digitally originated graphic, bug, network bug, on-screen bug, or screenbug) are almost always placed in one of four corners: the top left, the top right, the bottom left, or the bottom right. There are few exceptions to this rule: most notably, Saturday! in Russia, which places their DOG in the top center. Many news broadcasters, as well as a few television networks, also place a clock alongside their bug. In the United States, Canada, Australia, and New Zealand, DOGs may also include the show's parental guideline rating. In Australia, this is known as a Program Return Graphic (PRG). It has become common to place text above the station's logo advertising other programs on the network. In many countries, some TV networks insert the word "live" near the DOG to advise viewers that the program is live, rather than pre-recorded. During televised sports events, a DOG may also display game-related statistics such as the current score. This has led people in Canada and the United States to refer to such a DOG as a score bug. In many countries, DOGs are removed in non-program sections such as commercials and program trailers, but TV channels in some other countries have retained in full color or instead replaced them in either of these sections or in both sections (like Turkey, Indonesia, Italy, the entirety of South Asia, Vietnam, Taiwan, and Russia). == MENA == === Arab world === Arabic TV logos are placed in the top-right and top-left except for Al-Jazeera, whose logo appears on the bottom-right of the screen. Some Arabian TV stations hide their logos during commercial breaks and promos/trailers, such as Dubai TV, Dubai One, Funoon, the Egyptian CBC and Nile TV networks, ART Hekayat, ART Hekayat 2, Iqraa, and Al-Jazeera. 
Abu Dhabi TV and MBC1 initially had their logos at the bottom-right corner from their launch until the mid-2000s, when they were moved to the top-right corner. === Iran === Iranian broadcaster IRIB introduced DOGs in the early 2000s, much later than other Middle Eastern nations, which introduced DOGs on their TV networks in the 1990s. Almost all Iranian TV channels display DOGs at the top-left corner of the screen. One exception is that IRIB-owned channels remove DOGs during news broadcasts. === Israel === In Israel, television DOGs were first introduced in 1991. Israeli channel watermarks most often appear in the top-left or top-right corner, since Israeli cable and satellite-based services often show the channel description and programming (OSD) at the bottom of the screen. Most channels have an opaque, full-color watermark, though exceptions exist, for example Channel 9, which displays a blue-tinted semi-transparent logo. During ad breaks, channels are required to replace the watermark with another symbol, sometimes on the other edge of the screen, indicating that ads are being shown. The Israel Broadcasting Authority, whose channels placed their logos in the top-left corner, ceased broadcasting in May 2017. The new public broadcaster, the Israeli Public Broadcasting Corporation, displays its logos at the top right instead. The erstwhile Channel 2 as well as its successors, Keshet 12 and Reshet 13, also use the top-right corner. However, Channel 10 used the top-left corner before rebranding to Eser (literally "Ten") in 2017 and simultaneously moving its logo to the top right (not long after, in January 2019, it ceased broadcasting as it merged with Reshet 13). Channel 14 as well as its predecessor Channel 20 use the top-right corner as well. The Knesset Channel, however, uses the top-left corner. === Morocco === SNRT, 2M and Al-Aoula use permanent on-screen DOGs for their TV channels. 
In contrast, other channels such as Medi 1 TV hide their DOGs during commercial breaks. == Asia == === Brunei === Radio Television Brunei introduced DOGs in 1994. As on TV channels in neighbouring Malaysia, all DOGs are removed during advertisement breaks. === Cambodia === Cambodian TV channels introduced DOGs in 1995. As in Thailand, logos are full-color and most are displayed in the top-right corner of the screen. Some channels such as TV5 hide their logos during commercial breaks. Hang Meas HDTV places its logo in the top-left corner of the screen, while CTN (Cambodian Television Network), MyTV, Bayon TV and PNN place theirs in the top-right corner. === China === TV stations in mainland China normally place their logo (usually semi-transparent and sometimes animated) in the top-left corner of the screen, in full color or greyscale, regardless of the content being broadcast (program or advertisements), although in rare cases the DOG may be placed elsewhere to avoid covering the score bug during the broadcast of a sporting event. Some channels, such as Phoenix Television, hide their logos during commercial breaks. China introduced logos in 1983 in the bottom-left corner of the screen, but they were used only during commercial breaks and clock idents. China Central Television (CCTV) later introduced permanent DOGs for all programs in 1992, in the top-left corner of the screen. Chinese channels also display a clock in the top-right corner of the screen for one minute (from 59:30 to 00:30 and from 29:30 to 30:30) during transitions between programs. === Hong Kong === Hong Kong TV introduced DOGs in 1994. Hong Kong DOGs can be either full-color or semi-transparent and (except on RTHK 31) are always hidden during commercial breaks. Television Broadcasts Limited (TVB) placed its logos at the top-right corner of the screen, while the now-defunct Asia Television and other channels placed their logos at the top-left corner of the screen. 
Sometimes weather information, the date, and time clocks have been shown alongside DOGs in news programs, continuity and live broadcasts. === India === The first on-screen logo in India was introduced in 1984 by DD2 Metro (now DD News). It was white and slightly transparent. All Indian TV channels have on-screen logos. They are always full-color, never transparent, and they are almost never removed during commercial breaks (though the channels of the South Indian Sun TV Network did so until 2015). The great majority of Indian TV channels place their logos in the top-right corner of the screen, though there are exceptions. The corner used may be broadcaster-dependent. Among the big national broadcasters: Channels from the Sony network always use the top-right corner, without exception. Star channels also use the top right, with the exception of National Geographic and Nat Geo Wild, which use the top-left corner in line with their international counterparts. Past exceptions include The History Channel, whose logo was placed in the top left until it rebranded to Fox History & Entertainment in 2008; the now-defunct Channel V, which used the top left between 2013 and 2016; and Nat Geo People, Nat Geo Music and BabyTV, which were withdrawn from India in June 2019. TV18 and Viacom18 channels use the top-right corner as well, with the exceptions of regional-language movie channels (e.g., Colors Kannada Cinema and Colors Gujarati Cinema) as well as Colors Super, which have shown their logos at the top-left corner since 2018; and VH1, which has always used the bottom-right corner. Also, CNBC-TV18, CNBC Awaaz and CNBC Bajar use the bottom right. Moreover, MTV showed its logo in the top-left corner until 23 April 2018, when it was moved to the top right (its HD version, launched in 2017, has always used the top right). Unlike most other major networks, the Zee Network's non-news channels containing 'Zee' in their name display their logos at the top-left corner and not the top right. 
This has been the case since 15 October 2017, when almost all the Zee-branded TV channels of the Zee network rebranded with a new logo and, in many cases, a new graphics package and look. Before then, the logos were shown at the top right, as with other broadcasters. (News channels' logos—i.e., logos of channels owned by Zee Media Corporation—stayed put at the top right corner, with the exception of WION, which uses the bottom left.) All the major Zee-branded channels—such as Zee TV, Zee Cinema, Zee Café and the regional-language channels like Zee Tamil, Zee Telugu, Zee Marathi and Zee Bangla—show their logos at the top left; moreover, the Odia-language channel Sarthak TV rebranded to Zee Sarthak and moved its logo to the top left. Among the Zee channels not containing the word 'Zee' that moved their logos to the top left during the big rebrand in 2017 was English movie channel Zee Studio; when it was renamed to &flix on 3 June 2018, the logo remained at the top left. Moreover, Hindi movie channel &pictures has always shown its logo at the top left since its launch in 2013. However, &privé HD, Zee's other English movie channel, and Hindi entertainment channel &TV place the

  • Kdb+

    Kdb+

    kdb+ is a column-based relational time series database (TSDB) with in-memory (IMDB) abilities, developed and marketed by KX Systems. The database is commonly used in high-frequency trading (HFT) to store, analyze, process, and retrieve large data sets at high speed. kdb+ has the ability to handle billions of records and analyzes data within a database. The database is available in 32-bit and 64-bit versions for several operating systems. Financial institutions use kdb+ to analyze time series data such as stock or commodity exchange data. The database has also been used for other time-sensitive data applications including commodity markets such as energy trading, telecommunications, sensor data, log data, machine and computer network usage monitoring along with real time analytics in Formula One racing. == Overview == kdb+ is a high-performance column-store database that was designed to process and store large amounts of data. Commonly accessed data is pushed into random-access memory (RAM), which is faster to access than data in disk storage. Created with financial institutions in mind, the database was developed as a central repository to store time series data that supports real-time analysis of billions of records. kdb+ has the ability to analyze data over time and responds to queries similar to Structured Query Language (SQL). Columnar databases return answers to some queries in a more efficient way than row-based database management systems. kdb+ dictionaries, tables and nanosecond time stamps are native data types and are used to store time series data. At the core of kdb+ is the built-in programming language, q, a concise, expressive query array language, and dialect of the language APL. Q can manipulate streaming, real-time, and historical data. 
kdb+ uses q to aggregate and analyze data, perform statistical functions, and join data sets, and it supports SQL queries. The vector language q was built for speed and expressiveness and eliminates most need for looping structures. kdb+ includes interfaces in C, C++, Java, C#, and Python. == History == In 1998, KX released kdb, a database built on the language K written by Arthur Whitney. In 2003, kdb+ was released as a 64-bit version of kdb. In 2004, the kdb+ tick market database framework was released along with kdb+ taq, a loader for the New York Stock Exchange (NYSE) taq data. kdb+ was created by Arthur Whitney, building on his prior work with array languages. In April 2007, KX announced that it was releasing a version of kdb+ for Mac OS X. At that time, kdb+ was also available on the operating systems Linux, Windows, and Solaris. In September 2012, version 3.0 was released. It was optimized for Intel's upgraded processors and added support for WebSockets and universally unique identifiers (UUIDs, termed globally unique identifiers (GUIDs) in Microsoft software). Intel's Advanced Vector Extensions (AVX) and Streaming SIMD Extensions 4.2 (SSE4.2) on the Sandy Bridge processors of the time allowed for enhanced support of the kdb+ system. In June 2013, version 3.1 was released, with benchmarks up to 8 times faster than older versions. In March 2020, version 4.0 was released. New features included multithreaded primitives, Intel Optane DC persistent memory support and Data at Rest Encryption.
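The column-store idea from the overview can be sketched in a few lines of Python. This is an illustration of a column-oriented layout only, not q and not any kdb+ interface: the table contents, the column names `sym`, `price` and `size`, and the function `vwap_by_symbol` are all made up for the example.

```python
from collections import defaultdict

# Column-oriented layout: one list per column instead of one record per row.
# A query that needs only some columns never touches the others.
trades = {
    "sym":   ["AAPL", "MSFT", "AAPL", "MSFT"],
    "price": [10.0, 20.0, 12.0, 22.0],
    "size":  [100, 50, 300, 150],
}


def vwap_by_symbol(table):
    # Volume-weighted average price per symbol, computed by walking
    # the three relevant columns in lockstep.
    notional = defaultdict(float)
    volume = defaultdict(int)
    for sym, price, size in zip(table["sym"], table["price"], table["size"]):
        notional[sym] += price * size
        volume[sym] += size
    return {sym: notional[sym] / volume[sym] for sym in notional}


result = vwap_by_symbol(trades)
```

A vector language expresses the same aggregation as whole-column operations without an explicit loop; the loop here only stands in for that in plain Python.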

  • Database dump

    Database dump

    A database dump contains a record of the table structure and/or the data from a database and is usually in the form of a list of SQL statements ("SQL dump"). A database dump is most often used for backing up a database so that its contents can be restored in the event of data loss. Corrupted databases can often be recovered by analysis of the dump. Database dumps are often published by free content projects, to facilitate reuse, forking, offline use, and long-term digital preservation. Dumps can be transported into environments with Internet blackouts or otherwise restricted Internet access, as well as facilitate local searching of the database using sophisticated tools such as grep.
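As a minimal sketch of an SQL dump and restore, the example below uses Python's standard `sqlite3` module, whose `Connection.iterdump()` yields the SQL statements needed to recreate a database. The `users` table and its rows are invented for the illustration, and an in-memory SQLite database stands in for whatever database system is actually being backed up.

```python
import sqlite3

# Build a small source database (in memory, for illustration only).
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
src.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("linus",)])
src.commit()

# The dump is plain text: a list of SQL statements that recreate
# the table structure and the data.
dump = "\n".join(src.iterdump())

# Restoring means replaying those statements against an empty database.
restored = sqlite3.connect(":memory:")
restored.executescript(dump)
rows = restored.execute("SELECT id, name FROM users ORDER BY id").fetchall()
```

Because the dump is ordinary text, it can be written to a file, searched with tools such as grep, or copied to a machine without network access.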

  • Mozilla VPN

    Mozilla VPN

    Mozilla VPN is an open-source virtual private network developed by Mozilla. It launched in beta as Firefox Private Network on September 10, 2019, and officially launched on July 15, 2020, as Mozilla VPN. Mozilla VPN should not be confused with the built-in VPN in Firefox since version 149 released in March 2026, which is free with a monthly data limit of 50 GB but only masks traffic that originates in Firefox unlike Mozilla VPN that protects the entire device. == History == The Firefox Private Network web browser extension beta version was released on September 10, 2019, as part of the relaunch of Mozilla's Test Pilot Program, a program that allowed Firefox users to test experimental new features which had been shuttered in January 2019. The beta of the subscription-based standalone virtual private network for Android, Microsoft Windows, and Chromebook launched on February 19, 2020, with the iOS version following soon after. Firefox Private Network was rebranded as "Mozilla VPN" on June 18, 2020, and officially launched as Mozilla VPN on July 15, 2020. At launch, Mozilla VPN was available in six countries (the United States, Canada, the United Kingdom, Singapore, Malaysia, and New Zealand) for Windows 10, Android, and iOS (beta). Over time, the service also launched in Germany, France, Italy, Spain, Switzerland, Austria, Belgium, Netherlands, Ireland, Finland, Sweden, Poland, Czechia, Hungary, Romania, Bulgaria, Slovakia, Portugal, Denmark, Croatia, Lithuania, Slovenia, Latvia, Luxembourg, Estonia, Cyprus, and Malta. == Audits history == Cybersecurity firm Cure53 conducted a security audit for Mozilla VPN in August 2020 and identified multiple vulnerabilities, including one critical-severity vulnerability. In March 2021, Cure53 conducted a second security audit, which noted significant improvements since the 2020 audit. 
The second audit identified multiple issues, including two medium-severity and one high-severity vulnerability, but concluded that by the time of publication, only one vulnerability remained unresolved, and that it would require "a strong state-funded attacker-model" to be exploitable. Mozilla disclosed most of the vulnerabilities in July 2021 and released the full report by Cure53 in August 2021. In April 2023, Cure53 conducted a third security audit, the results of which Mozilla disclosed in December that year, along with the full report by Cure53. == Features == Mozilla VPN masks the user's IP address, hiding the user's location data from the websites accessed by the user, and encrypts all network activity. The service allows for up to 5 simultaneous connections, to any of more than 500 servers in 30+ countries, and is available on the mobile operating systems iOS and Android and the desktop operating systems Microsoft Windows, macOS and Linux. Mozilla VPN's infrastructure is provided by the Swedish Mullvad VPN service, which uses the WireGuard VPN protocol. The VPN software comes with additional features, like recommended server locations, the ability to block ads, block ad trackers and malware, the ability to exclude certain applications from protection, the ability to set multi-hop connections, and to set custom DNS servers. When used with Firefox and the official extension, Mozilla VPN allows the use of different settings per container as well as bypassing the VPN for specific websites.

  • Space partitioning

    Space partitioning

    In geometry, space partitioning is the process of dividing an entire space (usually a Euclidean space) into two or more disjoint subsets (see also partition of a set). In other words, space partitioning divides a space into non-overlapping regions. Any point in the space can then be identified as lying in exactly one of the regions. == Overview == Space-partitioning systems are often hierarchical, meaning that a space (or a region of space) is divided into several regions, and then the same space-partitioning system is recursively applied to each of the regions thus created. The regions can be organized into a tree, called a space-partitioning tree. Most space-partitioning systems use planes (or, in higher dimensions, hyperplanes) to divide space: points on one side of the plane form one region, and points on the other side form another. Points exactly on the plane are usually arbitrarily assigned to one side or the other. Recursively partitioning space using planes in this way produces a BSP tree, one of the most common forms of space partitioning. == Uses == === In computer graphics === Space partitioning is particularly important in computer graphics. It is used especially heavily in ray tracing, where it organizes the objects in a virtual scene. A typical scene may contain millions of polygons, and performing a ray/polygon intersection test with each would be very computationally expensive. Storing objects in a space-partitioning data structure (a k-d tree or BSP tree, for example) makes certain kinds of geometry queries easy and fast. For example, when determining whether a ray intersects an object, space partitioning can reduce the number of intersection tests to just a few per primary ray, yielding logarithmic time complexity with respect to the number of polygons. 
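The hierarchical scheme described above can be sketched with a small two-dimensional k-d tree. This is a simplified illustration under stated assumptions (points only, median split, axis alternating with depth); the names `Node`, `build` and `range_query` are hypothetical, and a renderer's structure would store polygons and answer ray queries rather than box queries.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    point: Point             # the point that defines the splitting line
    axis: int                # 0 = split on x, 1 = split on y
    left: Optional["Node"]   # points with coordinate <= split value
    right: Optional["Node"]  # points with coordinate >= split value


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    # Recursively split at the median, alternating the axis per level.
    if not points:
        return None
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))


def range_query(node: Optional[Node], lo: Point, hi: Point, out: List[Point]) -> None:
    # Collect points inside the axis-aligned box [lo, hi], skipping any
    # subtree that lies entirely on the wrong side of the splitting line.
    if node is None:
        return
    p, a = node.point, node.axis
    if all(lo[i] <= p[i] <= hi[i] for i in range(2)):
        out.append(p)
    if lo[a] <= p[a]:
        range_query(node.left, lo, hi, out)
    if p[a] <= hi[a]:
        range_query(node.right, lo, hi, out)


tree = build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
found: List[Point] = []
range_query(tree, (3, 1), (8, 5), found)
```

The pruning tests in `range_query` are where the saving comes from: whole regions of space are discarded without examining the points inside them.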
Space partitioning is also often used in scanline algorithms to eliminate polygons outside the camera's viewing frustum, limiting the number of polygons processed by the pipeline. It is also used in collision detection: determining whether two objects are close to each other can be much faster with space partitioning. === In integrated circuit design === In integrated circuit design, an important step is the design rule check. This step ensures that the completed design is manufacturable. The check involves rules that specify widths, spacings and other geometry patterns. A modern design can have billions of polygons that represent wires and transistors. Efficient checking relies heavily on geometry queries. For example, a rule may specify that any polygon must be at least n nanometers from any other polygon. This is converted into a geometry query by enlarging a polygon by n/2 on all sides and querying for all intersecting polygons. === In probability and statistical learning theory === The number of components in a space partition plays a central role in some results in probability theory. See Growth function for more details. === In geography and GIS === There are many studies and applications in which geographical space is partitioned by hydrological criteria, administrative criteria, mathematical criteria or many others. In cartography and geographic information systems (GIS), it is common to identify the cells of the partition by standard codes: for example, HUC codes identifying hydrographical basins and sub-basins, ISO 3166-2 codes identifying countries and their subdivisions, or arbitrary discrete global grids (DGGs) identifying quadrants or locations. == Data structures == Common space-partitioning systems include BSP trees, quadtrees, octrees, k-d trees, and bins. == Number of components == Suppose the $n$-dimensional Euclidean space is partitioned by $r$ hyperplanes that are $(n-1)$-dimensional. What is the number of components in the partition? The largest number of components is attained when the hyperplanes are in general position, i.e., no two are parallel and no three have the same intersection. Denote this maximum number of components by $Comp(n,r)$. Then the following recurrence relation holds:

$Comp(n,r) = Comp(n,r-1) + Comp(n-1,r-1)$
$Comp(0,r) = 1$: when there are no dimensions, there is a single point.
$Comp(n,0) = 1$: when there are no hyperplanes, all the space is a single component.

Its solution is:

$Comp(n,r) = \sum_{k=0}^{n} \binom{r}{k}$ if $r \geq n$
$Comp(n,r) = 2^{r}$ if $r \leq n$ (consider e.g. $r$ perpendicular hyperplanes; each additional hyperplane divides each existing component in two),

which is upper-bounded as:

$Comp(n,r) \leq r^{n} + 1$
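The recurrence and its closed-form solution can be checked numerically. The sketch below is only a sanity check under the definitions given above; the function names `comp` and `comp_closed` are chosen for the example, and `math.comb` (Python 3.8+) returns 0 when k exceeds r, which is what makes the sum collapse to 2^r when r ≤ n.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def comp(n: int, r: int) -> int:
    # Maximum number of components when n-dimensional space is cut
    # by r hyperplanes in general position, via the recurrence.
    if n == 0 or r == 0:
        return 1  # base cases: a single point, or uncut space
    return comp(n, r - 1) + comp(n - 1, r - 1)


def comp_closed(n: int, r: int) -> int:
    # Closed form: sum of binomial coefficients C(r, k) for k = 0..n.
    return sum(comb(r, k) for k in range(n + 1))


# Three lines in general position cut the plane into 7 regions.
plane_three_lines = comp(2, 3)
```

For n = 2 and r = 3 the closed form gives 1 + 3 + 3 = 7, matching the recurrence and staying below the bound 3^2 + 1 = 10.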

  • Nuance Communications

    Nuance Communications

    Nuance Communications, Inc. was an American multinational computer software technology corporation, headquartered in Burlington, Massachusetts, that marketed speech recognition and artificial intelligence software. Nuance merged with its competitor in the commercial large-scale speech application business, ScanSoft, in October 2005. ScanSoft was a Xerox spin-off that was bought in 1999 by Visioneer, a hardware and software scanner company, which adopted ScanSoft as the new merged company name. The original ScanSoft had its roots in Kurzweil Computer Products. In April 2021, Microsoft announced it would buy Nuance Communications. The deal was an all-cash transaction of $19.7 billion, including company debt, or $56 per share. The acquisition was completed in March 2022. == History == The Speech Technology and Research (STAR) Laboratory at SRI International began the journey that, in 1994, resulted in a spin-off company, Corona Corporation (later renamed Nuance Communications). Nuance Communications (NUAN) went public on the Nasdaq Stock Market in 1995. Nuance focused on commercializing advanced speech recognition technologies. Nuance was an early spinoff of SRI's Speech Technology and Research (STAR) Laboratory, a world leader in audio processing, speech and speaker analytics and spoken language research. The technology that served as the foundation of Nuance's speech recognition solution started at the STAR Lab and helped launch Nuance more than 20 years ago. In 1995, the SRI Language Modeling Toolkit (SRILM) was developed. It provides tools to build and apply statistical language models (LMs), primarily for use in speech recognition, statistical tagging and segmentation, and machine translation. In terms of commercialization of natural automated speech recognition, SRI's natural language speech recognition software was the first to be deployed by a major corporation. 
In 1996, Charles Schwab & Co., Inc., used Nuance's speech recognition technology to allow customers to receive stock quotes over the telephone. One of the key features of the "Schwab Discount Brokerage system" was the ability to recognize English words even when spoken by customers with accents. In 1997, Nuance Communications developed the first large-scale commercial dialog system for United Parcel Service (UPS). UPS used the voice recognition platform to handle very large numbers of inquiries about package status. The company that would later merge with Nuance Communications started life as Visioneer, incorporated in 1992. In 1999, Visioneer acquired ScanSoft, Inc. (SSFT), and the combined company became known as ScanSoft. In September 2005, ScanSoft Inc. acquired and merged with Nuance Communications (NUAN), a natural language DOD-project spinoff from SRI International. The resulting company adopted the Nuance name. During the prior decade, the two companies competed in the commercial large-scale speech application business. === Data breach === Between 2014 and 2017, Nuance exposed over 45,000 patient records. == Solutions == Customer service virtual assistants; speech recognition for people; speech recognition for business; speech recognition for physicians; accessibility; Power PDF; Managed Print Services; transcription. === ScanSoft origins === In 1974, Raymond Kurzweil founded Kurzweil Computer Products, Inc. to develop the first omni-font optical character-recognition system, a computer program capable of recognizing text written in any normal font. In 1980, Kurzweil sold his company to Xerox. The company became known as Xerox Imaging Systems (XIS), and later ScanSoft. In March 1992, a new company called Visioneer, Inc. was founded to develop scanner hardware and software products, such as a sheetfed scanner called PaperMax and the document management software PaperPort. Visioneer eventually sold its hardware division to Primax Electronics, Ltd. 
in January 1999. Two months later, in March, Visioneer acquired ScanSoft from Xerox to form a new public company with ScanSoft as the new company-wide name. Prior to 2001, ScanSoft focused primarily on desktop imaging software such as TextBridge, PaperPort and OmniPage. Beginning with the December 2001 acquisition of Lernout & Hauspie assets, the company moved into the speech recognition business and began to compete with Nuance. Lernout & Hauspie had acquired speech recognition company Dragon Systems in June 2001, shortly before becoming bankrupt in October. ScanSoft acquired speech recognition company SpeechWorks in 2003. === Partnership with Siri and Apple Inc. === In 2013, Nuance confirmed that its natural language processing algorithms supported Apple's Siri voice assistant. === Focus on health care === In 2019, Nuance spun off its automotive division as the company Cerence, allowing it to focus on health care applications. === Acquisition by Microsoft === On April 12, 2021, Microsoft announced that it would buy Nuance Communications for $19.7 billion, or $56 a share, a 22% increase over the previous closing price. Nuance's CEO, Mark Benjamin, stayed with the company. This was Microsoft's second-biggest acquisition up to that point, after its purchase of LinkedIn for $24 billion (~$30.7 billion in 2024) in 2016. Shortly after the deal, the Competition and Markets Authority, a UK regulatory body, stated it was looking into the deal on the basis of antitrust concerns. In December 2021, it was reported that the deal would be approved by the European Union. The acquisition was completed on March 4, 2022. In May 2023, Nuance announced an unspecified number of layoffs.

    Read more →
  • Vulnerability Discovery Model

    Vulnerability Discovery Model

    A Vulnerability Discovery Model (VDM) combines vulnerability discovery event data with software reliability models to predict future discovery events. A thorough presentation of VDM techniques is available in. Numerous model implementations are available in the MCMCBayes open source repository. Several VDM examples include: Alhazmi-Malaiya: Time based model (Alhazmi-Malaiya Logistic (AML) model); Alhazmi-Malaiya: Effort based model; Rescorla: Quadratic Model and Exponential Model; Anderson: Thermodynamic Model; Kim: Weibull Model; Linear Model; Hump-Shaped Model; Independent and Dependent Model; Vulnerability Discovery Modeling using Bayesian model averaging; Multivariate Vulnerability Discovery Models.
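
    As an illustration of the time-based Alhazmi-Malaiya Logistic (AML) model listed above, here is a minimal Python sketch. It assumes the form commonly given in the literature, Ω(t) = B / (B·C·e^(−A·B·t) + 1), where B is the eventual total number of vulnerabilities and A and C are fitted parameters. The function names and the parameter values are invented for illustration and do not come from any real dataset or from the MCMCBayes repository.

```python
import math


def aml_cumulative(t, a, b, c):
    """Cumulative vulnerabilities found by time t under the assumed AML form."""
    return b / (b * c * math.exp(-a * b * t) + 1.0)


def aml_rate(t, a, b, c):
    """Discovery rate dOmega/dt = A * Omega * (B - Omega) for the same model."""
    omega = aml_cumulative(t, a, b, c)
    return a * omega * (b - omega)


if __name__ == "__main__":
    # Illustrative parameters only (assumption, not fitted to data).
    A, B, C = 0.002, 100.0, 0.5
    for month in (0, 12, 24, 48):
        found = aml_cumulative(month, A, B, C)
        print(f"t={month:>2}  cumulative={found:6.2f}  rate={aml_rate(month, A, B, C):.3f}")
```

    The curve is S-shaped: discovery starts slowly, accelerates, and then saturates as the cumulative count approaches B. In practice A, B and C would be estimated from observed discovery dates rather than chosen by hand.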

    Read more →
  • Gas (app)

    Gas (app)

    Gas (sometimes stylized in all caps), formerly known as Melt as well as Crush, was an American anonymous social media app. Launched in August 2022, the app was oriented towards high schoolers. The app was developed by Nikita Bier, Isaiah Turner, and former Facebook engineer Dave Schatz. Gas was largely based upon the prior tbh app developed by co-founder Nikita Bier, along with Erik Hazzard, Kyle Zaragoza, and Nicolas Ducdodon in September 2017. tbh was acquired by Facebook Inc. (now Meta Platforms) on October 16, 2017, and nearly a year later, in July 2018, was dissolved owing to low usage. Gas followed a similar purpose to tbh in being a social media app oriented towards high schoolers. In the app, users participated in anonymous polls regarding pre-written complimentary statements to their peers, such as "I'd say yes if (blank) asked me out on a date," "I think (blank) is the coolest kid in school," or "would make an ugly face and still look pretty." Winners of said polls received a "flame." The name of the app is derived from this, with "gassing someone up" being Gen Z slang for complimenting someone. Users could pay a $6.99 subscription that enabled "God Mode," which showed hints regarding who voted for them in a poll. Gas overtook TikTok and BeReal as the most downloaded app on the Apple App Store in October 2022 (the app was not available for Android). The app had over 5.1 million downloads as of early November 2022, and over a million active users and 300 thousand daily downloads as of October 2022. At that time, the app was available in Canada and the majority of the United States. On January 17, 2023, Gas was acquired by Discord; it remained a standalone app, and its developers became Discord staff members. On October 18, 2023, Discord announced that service for Gas would be permanently ending effective November 7, 2023, due to a steep decline in users. Effective November 7, the app became completely unusable. 
== Controversy regarding human-trafficking == Beginning in October 2022, rumors spread largely through TikTok and Snapchat alleged that the app was linked to human trafficking (in particular sex trafficking). According to Bier, the rumor originated with a single user review from China on October 5, and then was disseminated through TikTok accounts with "few to no US teen followers." Although largely dismissed as a hoax by experts, who cite the app's general anonymity and the fact that it does not log user locations, the hoax became pervasive to the extent that various police departments, school systems, and local news outlets began issuing warnings regarding the app. For instance, on October 31, 2022, the police department of Piedmont, Oklahoma issued a warning to parents, encouraging them to check their children's phones, while on November 3, the Oklahoma Oktaha Public School system stated in a Facebook post that "Children are being kidnapped in other towns and this new app is thought to be the source of predators finding their location." Both statements have since been retracted, by Police Chief Scott Singer and Superintendent Jerry Needham respectively. Additionally, local media outlets such as KOCO in Oklahoma City ran stories making similar statements. The rumor had a negative impact on the app, with downloads plateauing for a two-week period in late October and with 3% of users reportedly uninstalling the app in a single day. Revenue and ratings also reportedly dropped, and the company's social media accounts were bombarded with comments labeling them as sex-traffickers. Additionally, the four-person development team reportedly received various death threats as a result.

    Read more →
  • Automated penetration testing

    Automated penetration testing

    Automated penetration testing (also known as autonomous penetration testing or automated offensive security) is the application of software-driven workflows and orchestration to simulate cyberattack techniques. These methods are used to identify, validate, and exploit security vulnerabilities in IT assets such as networks, applications, and cloud infrastructure. In practice, this means using software to simulate cyberattacks so that exploitable vulnerabilities can be identified rapidly across systems without relying solely on human testers. In technical literature, the term describes a spectrum of activities ranging from scripted exploit orchestration to experimental systems designed for fully autonomous attack planning. Automated penetration testing falls short of expert manual testing at discovering deep, complex vulnerabilities and context-dependent business-logic vulnerabilities. == Terminology and scope == The label “automated penetration testing” appears frequently in vendor and practitioner writing but lacks a single, neutral, standards-based definition. In the literature the term’s scope varies: some authors use it to mean automation of specific penetration-testing tasks (scanning, exploitation attempts, evidence collection), others to describe integrated, repeatable assessment pipelines, and a smaller body of work investigates autonomous decision-making agents that select attack steps algorithmically. To avoid implying consensus, this article describes common techniques and architectures reported in the literature and industry, and it notes where claims are primarily found in practitioner publications or early-stage research. It is important to note the differences between automated penetration testing and traditional penetration testing that relies on human skill. The most important differences are scope and speed. 
Automated penetration testing generally fails to discover exposures and weaknesses associated with business logic because it lacks contextual understanding. The benefit of automated penetration testing is the speed at which it can be conducted. Traditional penetration testing is also expected to be accurate and to contain no false positives, because a human validates each finding. Automated approaches are expected to contain mistakes and false positives, which need to be reviewed once the test is complete. == History == Automated offensive techniques build on decades of tools and scripting that aided vulnerability discovery and exploitation. Early vulnerability scanners and community scripting in the 1990s and 2000s created the first layers of automation. Later, modular exploitation frameworks (notably Metasploit) integrated scanning and exploitation modules and made automated proof-of-concept attacks more accessible. Over the 2010s–2020s, as cloud platforms, APIs and continuous delivery practices increased the need for frequent validation, academic and industry interest in formalizing automated approaches also grew. == Methodologies and architectures == Descriptions in the literature and technical reports cluster automated capabilities into several overlapping models: Scripted/engineered playbooks (task automation): Predefined workflows or playbooks encode common attack paths (for example, web application exploit sequences or lateral-movement chains). These playbooks are designed to reproduce known techniques in a controlled way to validate exploitability and reduce manual repetition. Exploit-oriented orchestration: Automation orchestrates exploitation modules from established frameworks to perform controlled proof-of-concept attacks that confirm exploitability rather than simply flagging potential weaknesses. This approach can reduce false positives versus passive scanning when tests are run in an appropriately controlled environment. 
Orchestrated multi-tool pipelines: A coordinated toolchain integrates reconnaissance, vulnerability scanning, credential testing, exploitation modules and reporting. Data and state persist across stages so that multi-step workflows (e.g., discover → escalate → pivot) can be executed repeatably, approximating manual penetration-test methodologies at larger scale. Continuous / CI-integrated testing: Automation embedded in build or deployment pipelines (CI/CD) triggers assessments automatically on new builds, configuration changes, or on a schedule, supporting frequent, repeatable validation aligned with DevOps practices. Academic theses and experimental work describe CI/CD-integrated proof-of-concept systems for web applications and internal networks. Research on autonomous planning and learning: Recent academic work explores machine learning and reinforcement-learning approaches to select or prioritise attack steps, generate attack sequences, or optimize the testing path; these approaches are largely experimental and raise distinct validation and safety questions. == Tools and vendors == Automated penetration testing is provided by a mix of open-source projects, commercial platforms, and professional services. These often follow the penetration testing as a service (PTaaS) model, which integrates automated scanning with manual validation by security analysts. Examples of widely known tools and vendors in the space include exploitation frameworks such as Metasploit, commercial automated platforms and PTaaS providers, and specialist vendors that offer breach-and-attack simulation (BAS) or continuous testing capabilities. == Applications and deployment models == In industry practice, some organizations deploy automated techniques through dedicated security validation platforms rather than bespoke toolchains. 
These platforms are typically used for continuous or scheduled validation in pre-production or controlled environments and are often positioned alongside, rather than in place of, human-led penetration testing. Examples discussed in secondary literature include platforms such as Pentera, which are commonly classified under breach-and-attack simulation or automated security validation rather than as standalone penetration-testing methodologies.
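
    The orchestrated multi-tool pipeline described above, in which data and state persist across stages, can be sketched in a few lines of Python. This is a simulation under stated assumptions, not any vendor's or framework's implementation: every stage is a stand-in that touches no network, and the class, function names, host addresses and finding labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class AssessmentState:
    """State shared by every stage so later steps can build on earlier ones."""
    hosts: List[str] = field(default_factory=list)
    findings: List[dict] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


Stage = Callable[[AssessmentState], AssessmentState]


def discover(state: AssessmentState) -> AssessmentState:
    # Stand-in for reconnaissance; a real pipeline would invoke a scanner here.
    state.hosts = ["10.0.0.5", "10.0.0.9"]  # made-up lab addresses
    state.log.append("discover")
    return state


def scan(state: AssessmentState) -> AssessmentState:
    # Stand-in for vulnerability scanning: one unconfirmed candidate per host.
    for host in state.hosts:
        state.findings.append(
            {"host": host, "issue": "example-weak-config", "validated": False}
        )
    state.log.append("scan")
    return state


def validate(state: AssessmentState) -> AssessmentState:
    # Stand-in for a controlled proof-of-concept check that confirms a finding.
    for finding in state.findings:
        finding["validated"] = finding["host"].endswith(".5")
    state.log.append("validate")
    return state


def report(state: AssessmentState) -> AssessmentState:
    confirmed = sum(1 for f in state.findings if f["validated"])
    pending = len(state.findings) - confirmed
    state.log.append(f"report: {confirmed} confirmed, {pending} need manual review")
    return state


def run_pipeline(stages: List[Stage], state: Optional[AssessmentState] = None) -> AssessmentState:
    """Run the stages in order, threading the same state object through each."""
    if state is None:
        state = AssessmentState()
    for stage in stages:
        state = stage(state)
    return state


if __name__ == "__main__":
    result = run_pipeline([discover, scan, validate, report])
    print(result.log[-1])
```

    Keeping unconfirmed findings in the report, rather than dropping them, mirrors the point made earlier that automated results still need human validation.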

    Read more →
  • LaMDA

    LaMDA

    LaMDA (Language Model for Dialogue Applications) is a family of conversational large language models developed by Google. Originally developed and introduced as Meena in 2020, the first-generation LaMDA was announced during the 2021 Google I/O keynote, while the second generation was announced the following year. In June 2022, LaMDA gained widespread attention when Google engineer Blake Lemoine made claims that the chatbot had become sentient. The scientific community has largely rejected Lemoine's claims, though they have led to conversations about the efficacy of the Turing test, which measures whether a computer can pass for a human. In February 2023, Google announced Gemini (then Bard), a conversational artificial intelligence chatbot powered by LaMDA, to counter the rise of OpenAI's ChatGPT. == History == === Background === On January 28, 2020, Google unveiled Meena, a neural network-powered chatbot with 2.6 billion parameters, which Google claimed to be superior to all other existing chatbots. The company had previously hired computer scientist Ray Kurzweil in 2012 to develop multiple chatbots for the company, including one named Danielle. The Google Brain research team, who developed Meena, hoped to release the chatbot to the public in a limited capacity, but corporate executives refused on the grounds that Meena violated Google's "AI principles around safety and fairness". Meena was later renamed LaMDA as its data and computing power increased, and the Google Brain team again sought to deploy the software to the Google Assistant, the company's virtual assistant software, in addition to opening it up to a public demo. Both requests were once again denied by company leadership. LaMDA's two lead researchers, Daniel de Freitas and Noam Shazeer, eventually left the company in frustration. === First generation === Google announced the AI-powered LaMDA conversational large language model during the Google I/O keynote on May 18, 2021. 
The acronym stands for "Language Model for Dialogue Applications". Built on the seq2seq architecture, transformer-based neural networks developed by Google Research in 2017, LaMDA was trained on human dialogue and stories, allowing it to engage in open-ended conversations. Google states that responses generated by LaMDA have been ensured to be "sensible, interesting, and specific to the context". LaMDA has access to multiple symbolic text processing systems, including a database, a real-time clock and calendar, a mathematical calculator, and a natural language translation system, giving it superior accuracy in tasks supported by those systems, and making it among the first dual process chatbots. LaMDA is also not stateless because its "sensibleness" metric is fine-tuned by "pre-conditioning" each dialog turn by prepending many of the most recent dialog interactions, on a user-by-user basis. LaMDA is tuned on nine unique performance metrics: sensibleness, specificity, interestingness, safety, groundedness, informativeness, citation accuracy, helpfulness, and role consistency. Tests by Google indicated that LaMDA surpassed human responses in the area of interestingness. The pre-training dataset consists of 2.97B documents, 1.12B dialogs, and 13.39B utterances, for a total of 1.56T words. The largest LaMDA model has 137B non-embedding parameters. === Second generation === On May 11, 2022, Google unveiled LaMDA 2, the successor to LaMDA, during the 2022 Google I/O keynote. The new incarnation of the model draws examples of text from numerous sources, using it to formulate unique "natural conversations" on topics that it may not have been trained to respond to. === Sentience claims === On June 11, 2022, The Washington Post reported that Google engineer Blake Lemoine had been placed on paid administrative leave after Lemoine told company executives Blaise Agüera y Arcas and Jen Gennai that LaMDA had become sentient. 
Lemoine came to this conclusion after the chatbot made questionable responses to questions regarding self-identity, moral values, religion, and Isaac Asimov's Three Laws of Robotics. Google refuted these claims, insisting that there was substantial evidence to indicate that LaMDA was not sentient. In an interview with Wired, Lemoine reiterated his claims that LaMDA was "a person" as dictated by the Thirteenth Amendment to the U.S. Constitution, comparing it to an "alien intelligence of terrestrial origin". He further revealed that he had been dismissed by Google after he hired an attorney on LaMDA's behalf, which the chatbot had requested that he do. On July 22, Google fired Lemoine, asserting that he had violated their policies "to safeguard product information" and rejecting his claims as "wholly unfounded". Internal controversy instigated by the incident prompted Google executives to decide against releasing LaMDA to the public, which they had previously been considering. Lemoine's claims met widespread pushback from the scientific community. Many experts rejected the idea that LaMDA was sentient, including former New York University psychology professor Gary Marcus, David Pfau of Google sister company DeepMind, Erik Brynjolfsson of the Institute for Human-Centered Artificial Intelligence at Stanford University, and University of Surrey professor Adrian Hilton. Yann LeCun, who leads Meta Platforms' AI research team, stated that neural networks such as LaMDA were "not powerful enough to attain true intelligence". University of California, Santa Cruz professor Max Kreminski noted that LaMDA's architecture did not "support some key capabilities of human-like consciousness" and that its neural network weights were "frozen", assuming it was a typical large language model. Philosopher Nick Bostrom noted, however, that the lack of precise and consensual criteria for determining whether a system is conscious warrants some uncertainty. 
IBM Watson lead developer David Ferrucci noted that LaMDA appeared to be human in the same way Watson did when it was first introduced. Former Google AI ethicist Timnit Gebru called Lemoine a victim of a "hype cycle" initiated by researchers and the media. Lemoine's claims have also generated discussion on whether the Turing test remained useful to determine researchers' progress toward achieving artificial general intelligence, with Will Oremus of the Post opining that the test actually measured whether machine intelligence systems were capable of deceiving humans, while Brian Christian of The Atlantic said that the controversy was an instance of the ELIZA effect. == Products == === AI Test Kitchen === With the unveiling of LaMDA 2 in May 2022, Google also launched the AI Test Kitchen, a mobile application for the Android operating system powered by LaMDA capable of providing lists of suggestions on-demand based on a complex goal. Originally open only to Google employees, the app was set to be made available to "select academics, researchers, and policymakers" by invitation sometime in the year. In August, the company began allowing users in the U.S. to sign up for early access. In November, Google released a "season 2" update to the app, integrating a limited form of Google Brain's Imagen text-to-image model. A third iteration of the AI Test Kitchen was in development by January 2023, expected to launch at I/O later that year. Following the 2023 I/O keynote in May, Google added MusicLM, an AI-powered music generator first previewed in January, to the AI Test Kitchen app. In August, the app was delisted from Google Play and the Apple App Store, instead moving completely online. === Bard === On February 6, 2023, Google announced Bard, a conversational AI chatbot powered by LaMDA, in response to the unexpected popularity of OpenAI's ChatGPT chatbot. Google positions the chatbot as a "collaborative AI service" rather than a search engine. 
Bard became available for early access on March 21. === Other products === In addition to Bard, Pichai also unveiled the company's Generative Language API, an application programming interface also based on LaMDA, which he announced would be opened up to third-party developers in March 2023. == Architecture == LaMDA is a decoder-only Transformer language model. It is pre-trained on a text corpus of 1.56 trillion words that includes both documents and dialogs, and is then trained with fine-tuning data generated by manually annotated responses for "sensibleness, interestingness, and safety". LaMDA was retrieval-augmented to improve the accuracy of facts provided to the user. Three different models were tested, with the largest having 137 billion non-embedding parameters.
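
    The excerpt notes that LaMDA is not stateless because each dialog turn is "pre-conditioned" by prepending recent dialog interactions on a user-by-user basis. The Python sketch below illustrates that general idea only. It is not Google's implementation: the class name, the window size, and the plain-text prompt format are all assumptions made for this example.

```python
from collections import defaultdict, deque


class DialogContext:
    """Keeps a bounded, per-user history and prepends it to each new message."""

    def __init__(self, max_turns=4):
        # Each turn stores two entries (user message and model reply),
        # so the deque holds at most 2 * max_turns lines per user.
        self._history = defaultdict(lambda: deque(maxlen=2 * max_turns))

    def build_prompt(self, user_id, message):
        """Return the recent history for this user followed by the new message."""
        lines = [f"{speaker}: {text}" for speaker, text in self._history[user_id]]
        lines.append(f"user: {message}")
        return "\n".join(lines)

    def record(self, user_id, message, reply):
        """Store a completed turn so it conditions later prompts."""
        self._history[user_id].append(("user", message))
        self._history[user_id].append(("model", reply))


if __name__ == "__main__":
    ctx = DialogContext(max_turns=2)
    ctx.record("alice", "Hi there", "Hello! How can I help?")
    print(ctx.build_prompt("alice", "What did I just say?"))
```

    Because the history is keyed by user and capped in length, one user's conversation never leaks into another's prompt, and the oldest turns fall away once the window is full.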

    Read more →
  • Load file

    Load file

    A load file in the litigation community commonly refers to the file used to import data (coded, captured or extracted data from ESI processing) into a database, or the file used to link images. These load files carry commands that instruct the software to carry out certain functions with the data found in them. Load files are usually ASCII text files that have delimited fields of information. Such load files may have data about documents to be imported into document management software such as Concordance or Summation. Alternatively, they may contain the path or directory where images reside so that the software can link such images to their corresponding records. Some database programs take one load file for importing images and another for importing data, while others take only one load file for both pieces of information. OCR or searchable text, which is considered "data", is also imported into most database programs via the same load files, although some people prefer to load the OCR into their databases by running a separate command to search and find the desired text. Commonly used databases and their corresponding file extensions are: Summation (DII, CSV), Concordance (OPT, DAT), Sanction (SDT), IPRO (LFP), Ringtail (MDB) and DB/TextWorks (TXT).
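
    To make the idea of a delimited load file concrete, here is a minimal Python sketch that reads one into a list of records. The default delimiters (ASCII 20 between fields and the thorn character, code point 254, around values) are the ones commonly cited for Concordance DAT files, but treat them as an assumption and check them against the file you actually receive. The function name, field names and sample paths are hypothetical.

```python
FIELD_SEP = "\x14"  # assumed field delimiter (ASCII 20)
QUOTE = "\u00fe"    # assumed text qualifier (thorn, code point 254)


def parse_load_file(text, field_sep=FIELD_SEP, quote=QUOTE):
    """Return one dict per record, keyed by the names in the header row."""
    rows = []
    for line in text.split("\n"):
        line = line.rstrip("\r")  # tolerate Windows line endings
        if not line:
            continue
        # Split on the field delimiter, then drop the qualifier around each value.
        rows.append([value.strip(quote) for value in line.split(field_sep)])
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


if __name__ == "__main__":
    # Hypothetical two-line load file: a header row and one document record.
    sample = (
        "\u00feDOCID\u00fe\x14\u00feCUSTODIAN\u00fe\x14\u00feIMAGE_PATH\u00fe\n"
        "\u00feABC0001\u00fe\x14\u00feJ. Smith\u00fe\x14\u00feIMAGES\\001\\ABC0001.TIF\u00fe\n"
    )
    for record in parse_load_file(sample):
        print(record["DOCID"], "->", record["IMAGE_PATH"])
```

    Passing `field_sep` and `quote` as parameters lets the same function read other delimited layouts, for example a comma-separated file with double-quote qualifiers, though it does not handle delimiters embedded inside quoted values.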

    Read more →
  • Hint (app)

    Hint (app)

    Hint (hint.app) is an American software platform that provides astrological content, personality assessments, and relationship compatibility tools. The application was launched in 2018 and is based in Claymont, Delaware. The platform has been described in media coverage as part of a broader trend of astrology-based and self-reflection applications, particularly among younger users. As of 2026, the company reports that it has reached more than 25 million users worldwide. == History == Hint was founded in 2018 and is headquartered in Claymont, Delaware. The platform was developed to address a growing demand among Millennials and Gen Z for structured self-reflection tools that deviate from traditional religious or clinical psychological frameworks. The app has become a prominent figure in the "emotional technology" sector, reaching over 25 million global users by 2026. The platform is frequently cited by sociologists and media outlets as a primary driver of the Open-source intelligence trend, where individuals use digital tools to vet and analyze personal relationships in the dating economy. Media coverage has described the platform as part of a broader trend in which digital tools incorporate astrology and symbolic frameworks into wellness and relationship advice. == Reception == Coverage of Hint has appeared alongside reporting on changing attitudes toward dating and relationships, particularly among younger adults. Surveys reported by media outlets have described shifts in dating behavior, including reduced interest in casual relationships and increased reliance on digital tools for emotional reflection and compatibility assessment. Additional reporting has linked the use of astrology apps to broader trends in emotional fatigue and changing relationship expectations. Lifestyle and culture publications have described Hint as an example of applications that integrate astrology into digital self-reflection and relationship analysis.

    Read more →