AI Content Producer

AI Content Producer — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • ZeroPC

    ZeroPC

    ZeroPC was a commercial webtop developed by ZeroDesktop, Inc. located in San Mateo, California. ZeroPC has been called a personal cloud OS. It mimicked the look, feel and functionality of the desktop environment of a real operating system. The software was launched in September 2011 through Disrupt SF 2011 event and recently selected to the finalist of SXSW 2012 in Innovative Web Technology category. ZeroPC is web-based and required a Java applet to operate bundled productivity tool Thinkfree. The web applications found on ZeroPC are built on Java in the back end. Features included drag-and-drop functionality, cloud dashboard and personal cloud storage meta services. ZeroPC belonged to a category of services that intended to turn the Web into a full-fledged platform by using Web services as a foundation along with presentation technologies that replicated the experience of desktop applications for users. ZeroPC aggregates content so users can easily access, transfer and share whatever content they want, using a web browser from any device. Its meta-cloud layer supports Dropbox, Box, SugarSync, OneDrive, 4Shared, Google Drive, Evernote, Picasa, Flickr, Instagram, Facebook, Twitter, and Photobucket. ZeroPC Cloud OS platform also provides extensive APIs for iOS and Android App developers. Some of the features found on ZeroPC are: File sharing, Webmail, Cloud Content Navigator, Instant messenger, Sticky Note, Audio/Video Player and Office productivity applications. ZeroPC 2.0 platform ran on AWS for free and paid users. Its platform is licensable to Telco and ISV for commercial purpose. Their clients are SFR, SK Telecom, Hancom and others. As of June 1, 2017, ZeroPC's servers were switched off completely, and ZeroPC is no longer in service since its parent company, NComputing, had launched Virtual Desktop Service in the cloud (AWS) to public. == Browser and Platform Compatibility == The ZeroPC web desktop was compatible with Mac OS X and Microsoft Windows platforms. It is certified to operate on Safari 6.0, Firefox 15.0.1, Google Chrome 22.0.1229.79 m and Internet Explorer 8 and 9. The ZeroPC front end user interface executes entirely within a web browser (see above) and uses HTML, some features of HTML5, JavaScript, AJAX and an optional Java plug-in. == Security == All communication between the ZeroPC front end user interface and the ZeroPC back end servers is encrypted using SSL (HTTPS) protocol. Furthermore, any content stored in the ZeroPC server-side repository is also encrypted using 256-bit Advanced Encryption Standard (AES-256) by Amazon S3 on AWS. ZeroPC users could connect their ZeroPC profile to other storage services such as Dropbox and Box. This connection allows the ZeroPC user to fully manage their content stored in these other storage services. To establish the connection ZeroPC rigorously adhered to the Oauth implementation provided by the target storage service. Upon completion of the Oauth process, ZeroPC stores the relevant access token in the user's profile. This token, along with all other sensitive password related data was encrypted using AES 256-bit key size. == Implementations == As noted above, the ZeroPC platform was hosted on Amazon Web Services infrastructure and is available to the general consumer. A user was allowed to sign up by selecting one of three account plans including a no-cost option. The ZeroPC could also be white-labeled for organizations wishing to provide this functionality to their own users. The white-label options include managed hosting on Amazon Web Services infrastructure and also installation within the organization's IT infrastructure. == User Access Points == The ZeroPC infrastructure provided user access to content and features in several different ways. As described in this article the user can access their information by signing into the ZeroPC web desktop. Additionally, ZeroPC offers native applications designed to run on popular mobile devices including smartphones and tablets. == Leadership == ZeroPC was founded by Chief Executive Officer, Young Song, an entrepreneur who previously founded NComputing, a $60 million venture-backed company. He also co-founded eMachines, Inc., a low-cost computer brand (later acquired by Gateway).

    Read more →
  • Mobile cloud storage

    Mobile cloud storage

    Mobile cloud storage is a form of cloud storage that is accessible on mobile devices such as laptops, tablets, and smartphones. Mobile cloud storage providers offer services that allow the user to create and organize files, folders, music, and photos, similar to other cloud computing models. Services are used by both individuals and companies. Most cloud file storage providers offer limited free use but charge for additional storage once the free limit is exceeded. These costs are usually charged as a monthly subscription rate and have different rates depending on the amount of storage desired. In 2018, cloud services revenue was about $182.4 billion and in 2022 it is projected to grow to $331.2 billion. The cloud storage industry was projected to grow 17.2 percent in 2019 (Costello, 2019). == History == The concept of cloud computing trace back to 1960s, when the groundwork for modern internet and network technologies was being laid (Human for humans, 2024). One of the pivotal figures in this early period was J.C.R. Licklider, a visionary computer scientist who worked on ARPANET, the precursor to the internet. Licklider's ideas set the stage for the development of distributed computing systems, which are fundamental to cloud computing. Moving into the 1990s, AT&T introduced PersonaLink Services, a more advanced online platform offering electronic mail and online storage. Major turning point in 2006 The launch of Amazon Web Services (AWS) in 2006 marked a major turning point. AWS introduced Amazon S3 (Simple Storage Service), which allowed businesses and developers to store and retrieve any amount of data, at any time, from anywhere on the web. This development was revolutionary, providing scalable, reliable, and low-cost data storage infrastructure that transformed how organizations managed their data. == Applications == Some mobile device manufacturers include mobile cloud storage apps with their product. These apps facilitate synchronization of user files across multiple platforms. Part of the process for setting up new mobile devices frequently includes configuring a cloud storage service to Backup the device's files and information. Apple iOS devices come pre-loaded and configured to use Apple's mobile cloud storage service iCloud. Google offers a similar feature with the Android operating system by backing up the device using a Google Drive account. The Samsung Galaxy smartphone has partnered with Dropbox, while Microsoft similarly offers Microsoft OneDrive. Some mobile cloud storage apps are platform-independent. For example, Nasuni's Mobile Access app is available on any Android or iOS device. Most companies offering Cloud Storage have secure website to access files allowing use on any device that can browse the Internet.

    Read more →
  • GeneRIF

    GeneRIF

    A GeneRIF or Gene Reference Into Function is a short (255 characters or fewer) statement about the function of a gene. GeneRIFs provide a simple mechanism for allowing scientists to add to the functional annotation of genes described in the Entrez Gene database. In practice, function is constructed quite broadly. For example, there are GeneRIFs that discuss the role of a gene in a disease, GeneRIFs that point the viewer towards a review article about the gene, and GeneRIFs that discuss the structure of a gene. However, the stated intent is for GeneRIFs to be about gene function. Currently over half a million geneRIFs have been created for genes from almost 1000 different species. GeneRIFs are always associated with specific entries in the Entrez Gene database. Each GeneRIF has a pointer to the PubMed ID (a type of document identifier) of a scientific publication that provides evidence for the statement made by the GeneRIF. GeneRIFs are often extracted directly from the document that is identified by the PubMed ID, very frequently from its title or from its final sentence. GeneRIFs are usually produced by NCBI indexers, but anyone may submit a GeneRIF. To be processed, a valid Gene ID must exist for the specific gene, or the Gene staff must have assigned an overall Gene ID to the species. The latter case is implemented via records in Gene with the symbol NEWENTRY. Once the Gene ID is identified, only three types of information are required to complete a submission: a concise phrase describing a function or functions (less than 255 characters in length, preferably more than a restatement of the title of the paper); a published paper describing that function, implemented by supplying the PubMed ID of a citation in PubMed; a valid e-mail address (which will remain confidential). == Example == Here are some GeneRIFs taken from Entrez Gene for GeneID 7157, the human gene TP53. The PubMed document identifiers have been omitted from the examples. Note the wide variability with respect to the presence or absence of punctuation and of sentence-initial capital letters. p53 and c-erbB-2 may have independent role in carcinogenesis of gall bladder cancer Degradation of endogenous HIPK2 depends on the presence of a functional p53 protein. p53 codon 72 alleles influence the response to anticancer drugs in cells from aged people by regulating the cell cycle inhibitor p21WAF1 Logistic regression analysis showed p53 and COX-2 as dependent predictors in pancreatic carcinogenesis, and a reciprocal relationship to neoplastic progression between p53 and COX-2. GeneRIFs are an unusual type of textual genre, and they have recently been the subject of a number of articles from the natural language processing community.

    Read more →
  • Concordancer

    Concordancer

    A concordancer is a computer program that automatically constructs a concordance—an alphabetised index of every occurrence of a word or phrase in a body of text, each entry displayed with its surrounding context. Concordancers are primary tools in corpus linguistics, lexicography, computer-assisted translation, and language teaching. The most common display format is the key word in context (KWIC) layout, in which each hit appears centred on a line with a fixed span of words to its left and right, enabling rapid scanning of usage patterns across many occurrences. == History == === Pre-computational concordances === The compilation of concordances predates computers by many centuries. Around 1230, the French Dominican cardinal Hugh of Saint-Cher directed a team of friars in assembling a concordance of the Latin Vulgate Bible, generally regarded as the first systematic concordance of any text. To help readers locate passages, Hugh divided each biblical chapter into lettered sections. Later milestones include a Hebrew Old Testament concordance compiled by Rabbi Mordecai Nathan (1448), Alexander Cruden's Complete Concordance to the Holy Scriptures (1737), and the manuscript Asaf ha-Mazkir, an unfinished concordance to the Babylonian Talmud compiled by Moses Rigotz around the turn of the 19th century. === First computer concordance === The first concordance produced with computing assistance was the Index Thomisticus, a comprehensive lexical index of the writings of and around Thomas Aquinas, totalling approximately 10.6 million Latin words. The Italian Jesuit priest Roberto Busa conceived the project in 1946 and secured the sponsorship of IBM in 1949 after a meeting with chairman Thomas J. Watson. Keypunch operators in Gallarate, Italy, encoded the texts onto punched cards from around 1950. IBM executive Paul Tasman developed the processing methods. The full 56-volume printed edition was completed around 1980, followed by a CD-ROM edition in 1989 and a web-accessible version in 2005. === The KWIC format === The key word in context (KWIC) display was formalised as a computational technique by Hans Peter Luhn, a researcher at IBM, in a 1960 paper in American Documentation. In KWIC output, each instance of the search term (the node word) is centred on a line with a fixed window of words to each side; sorting the resulting lines alphabetically by the immediately adjacent word reveals collocational and phraseological patterns at a glance. === COCOA === One of the first dedicated concordancing programs was COCOA (COunt and COncordance Generation on Atlas), created in 1965 by D. B. Russell at University College London and the Atlas Computer Laboratory in Harwell, Oxfordshire. Written in approximately 4,000 cards of FORTRAN, it processed text annotated with flat, non-hierarchical markup tags and could produce word counts and concordances in multiple languages. Within its first six months COCOA had been applied to texts in at least six languages. A second version designed for multiple mainframe platforms was distributed to British computing centres in the mid-1970s. Growing dissatisfaction with its interface and the eventual withdrawal of Atlas Laboratory support prompted British funding bodies to commission a successor program. === Oxford Concordance Program === The Oxford Concordance Program (OCP) was designed and written in FORTRAN by Susan Hockey and Ian Marriott at Oxford University Computing Services (OUCS) between 1979 and 1980 and first released in 1981. Hockey and Marriott acknowledged that OCP owed much to COCOA and the CLOC system at the University of Birmingham. OCP accepted COCOA-format markup to encode metadata such as author, act, scene, and line number, and was described by its authors as "a machine-independent text analysis program for producing word lists, indices and concordances in a variety of languages and alphabets." By the mid-1980s it had been licensed to approximately 240 institutions in 23 countries. A personal computer version, Micro-OCP, was developed for the IBM PC and sold by Oxford University Press from the late 1980s. Version 2 was rewritten in 1985–86 and documented in the same 1987 article by Hockey and co-author John Martin. === Personal computer era === The availability of affordable personal computers in the 1980s and 1990s enabled standalone concordancing applications that analysts could run locally without specialist computing facilities. MicroConcord, developed by Mike Scott and Tim Johns and published by Oxford University Press in 1993 for MS-DOS, was among the first concordancers designed specifically for classroom language teaching. WordSmith Tools, also developed by Mike Scott, was first released in 1996 and became one of the most widely used corpus analysis suites in academic linguistics research. Other tools from this era include TACT (University of Toronto, 1989), a suite of MS-DOS freeware programs for literary text analysis, and MonoConc, a Windows concordancer created by Michael Barlow. === Web-based concordancers === From the late 1990s onwards, web-based concordancers hosted on remote servers gave researchers browser access to large preloaded corpora without requiring local storage or processing. The Sketch Engine, developed by Adam Kilgarriff and Pavel Rychlý (Masaryk University), was launched commercially in July 2003 by Lexical Computing Limited and introduced word sketches—automatically generated one-page profiles of a word's typical grammatical relations and collocations. AntConc, created by Laurence Anthony at Waseda University, Tokyo, was first released in 2002 as freeware for Windows, macOS, and Linux. == Features == Modern concordancers typically offer a range of analytical functions beyond basic KWIC display. These commonly include: KWIC display with the node word centred and context words in aligned columns, sortable by the word one, two, or three positions to the left or right of the node (L1–L3 and R1–R3) Concordance plots, visualising the distribution of hits as marks along a scaled bar representing each text in the corpus Frequency and word lists, both alphabetical and ranked by frequency Collocation statistics, identifying words that co-occur with the search term more often than chance, quantified by measures such as mutual information, the t-score, or log-likelihood Keyword analysis, comparing word frequencies between a study corpus and a reference corpus to identify statistically distinctive items N-gram analysis, finding frequently recurring word sequences of a specified length Part-of-speech tagging integration, allowing searches filtered to particular grammatical categories Unicode support for multilingual text Bilingual and parallel concordancers additionally display aligned text in two or more languages side by side, enabling comparison of translation equivalents across language pairs. == Notable concordancers == === WordSmith Tools === Created by Mike Scott and first released in 1996, WordSmith Tools is a Windows corpus analysis suite that evolved from MicroConcord. Its three core modules are Concord (KWIC concordances), WordList (frequency and alphabetical word lists), and Keywords (statistical keyword identification relative to a reference corpus). Oxford University Press used WordSmith Tools for dictionary preparation work. Version 4.0 is freely available; later versions are sold by Lexical Analysis Software Limited. === AntConc === AntConc is a freeware, multiplatform concordancing toolkit created by Laurence Anthony, Professor of Applied Linguistics at Waseda University, Tokyo. First released in 2002 and formally described in a 2005 academic paper, it runs on Windows, macOS, and Linux. Its tools include a KWIC concordancer, a concordance plot for visualising distribution across texts, a collocates tool, a keyword list, and an n-gram analysis module. Because it is free and requires only plain text files, AntConc is widely used in linguistics courses and independent research worldwide. === Sketch Engine === The Sketch Engine is a corpus management and query system co-created by Adam Kilgarriff and Pavel Rychlý and launched in 2003 by Lexical Computing Limited. It provides browser-based access to over 800 corpora in more than 100 languages. Beyond concordance searching, it offers word sketches, collocation analysis, distributional thesaurus construction, keyword and terminology extraction, and diachronic analysis. It is used by major publishers including Macmillan and Oxford University Press for lexicographic research. A subset tool, SKELL (Sketch Engine for Language Learning), is freely accessible to individual learners. === Wmatrix === Wmatrix is a web-based corpus processing environment developed by Paul Rayson at the University Centre for Computer Corpus Research on Language (UCREL), Lancaster University. Alongside concordances and frequency lists, Wmatrix integrates CLAWS part-of-speech tagging and the USAS semantic tagger, enabling keyword analysis simultane

    Read more →
  • Cloud-based design and manufacturing

    Cloud-based design and manufacturing

    Cloud-based design and manufacturing (CBDM) refers to a service-oriented networked product development model in which service consumers are able to configure products or services and reconfigure manufacturing systems through Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), Hardware-as-a-Service (HaaS), and Software-as-a-Service (SaaS). Adapted from the original cloud computing paradigm and introduced into the realm of computer-aided product development, Cloud-Based Design and Manufacturing is gaining significant momentum and attention from both academia and industry. Cloud-based design and manufacturing includes two aspects: cloud-based design and cloud-based manufacturing. Another related concept is cloud manufacturing that is more general and popular. Cloud-Based Design (CBD) refers to a networked design model that leverages cloud computing, service-oriented architecture (SOA), Web 2.0 (e.g., social network sites), and semantic web technologies to support cloud-based engineering design services in distributed and collaborative environments. Cloud-Based Manufacturing (CBM) refers to a networked manufacturing model that exploits on-demand access to a shared collection of diversified and distributed manufacturing resources to form temporary, reconfigurable production lines which enhance efficiency, reduce product lifecycle costs, and allow for optimal resource allocation in response to variable-demand customer generated tasking. The enabling technologies for Cloud-Based Design and Manufacturing include cloud computing, Web 2.0, Internet of Things (IoT), and service-oriented architecture (SOA). == History == The term cloud-based design and manufacturing (CBDM) was initially coined by Dazhong Wu, David Rosen, and Dirk Schaefer at Georgia Tech in 2012 for the purpose of articulating a new paradigm for digital manufacturing and design innovation in distributed and collaborative settings. The main objective of CBDM is to further reduce time and cost associated with maintaining information and communication technology (ICT) infrastructures for design and manufacturing, enhancing digital manufacturing and design innovation in distributed and collaborative environments, and adapting to rapidly changing market demands. In 2014, the same research group also published the worldwide first two books on the subjects of Cloud-Based Design and Manufacturing (CBDM) and Social Product Development (SPD) with Springer, edited by Dirk Schaefer. == Characteristics == CBDM exhibits the following key characteristics: Cloud-based distributed file system High performance computing Cloud-based social collaboration Ubiquitous access to distributed big data Rapid manufacturing scalability Agility On-demand self-service Semantic Web Real-time request for quotation Pay-per-use pricing model Multi-tenancy CBDM differs from traditional collaborative and distributed design and manufacturing systems such as web-based systems and agent-based systems from a number of perspectives, including (1) computing architecture, (2) data storage, (3) sourcing process, (4) information and communication technology infrastructure, (5) business model, (6) programming model, and (7) communication. == Service models == Infrastructure as a service (IaaS) Platform as a service (PaaS) Hardware as a service (HaaS) Software as a service (SaaS) Similar to cloud computing, CBDM services can be categorized into four major deployment models: the public cloud, private cloud, hybrid cloud, and community cloud. == Research progress in Academia == The Defense Advanced Research Projects Agency (DARPA) MENTOR program Engineering and Physical Sciences Research Council cloud manufacturing program European Commission's Seventh Framework Program (EC FP7)

    Read more →
  • Legendre moment

    Legendre moment

    In mathematics, Legendre moments are a type of image moment and are achieved by using the Legendre polynomial. Legendre moments are used in areas of image processing including: pattern and object recognition, image indexing, line fitting, feature extraction, edge detection, and texture analysis. Legendre moments have been studied as a means to reduce image moment calculation complexity by limiting the amount of information redundancy through approximation. == Legendre moments == Source: With order of m + n, and object intensity function f(x,y): L m n = ( 2 m + 1 ) ( 2 n + 1 ) 4 ∫ − 1 1 ∫ − 1 1 P m ( x ) P n ( y ) f ( x , y ) d x d y {\displaystyle L_{mn}={\frac {(2m+1)(2n+1)}{4}}\int \limits _{-1}^{1}\int \limits _{-1}^{1}P_{m}(x)P_{n}(y)f(x,y)\,dx\,dy} where m,n = 1, 2, 3, ...∞ with the nth-order Legendre polynomials being: P n ( x ) = ∑ k = 0 n a k , n x k = ( − 1 ) n 2 n n ! ( d d x ) [ ( 1 − x 2 ) n ] {\displaystyle P_{n}(x)=\sum _{k=0}^{n}a_{k,n}x^{k}={\frac {(-1)^{n}}{2^{n}n!}}\left({\frac {d}{dx}}\right)[(1-x^{2})^{n}]} which can also be written: P n ( x ) = ∑ k = 0 D ( n ) ( − 1 ) k ( 2 n − 2 k ) ! 2 n k ! ( n − k ) ! ( n − 2 k ) ! x n − 2 k = ( 2 n ) ! 2 n ( n ! ) 2 x n − ( 2 n − 2 ) ! 2 n 1 ! ( n − 1 ) ! ( n − 2 ) ! x n − 2 + ⋯ {\displaystyle {\begin{aligned}P_{n}(x)&=\sum _{k=0}^{D(n)}(-1)^{k}{\frac {(2n-2k)!}{2^{n}k!(n-k)!(n-2k)!}}x^{n-2k}\\[5pt]&={\frac {(2n)!}{2^{n}(n!)^{2}}}x^{n}-{\frac {(2n-2)!}{2^{n}1!(n-1)!(n-2)!}}x^{n-2}+\cdots \end{aligned}}} where D(n) = floor(n/2). The set of Legendre polynomials {Pn(x)} form an orthogonal set on the interval [−1,1]: ∫ − 1 1 P n ( x ) P m ( x ) d x = 2 2 n + 1 δ n m {\displaystyle \int _{-1}^{1}P_{n}(x)P_{m}(x)\,dx={\frac {2}{2n+1}}\delta _{nm}} A recurrence relation can be used to compute the Legendre polynomial: ( n + 1 ) P n + 1 ( x ) − ( 2 n + 1 ) x P n ( x ) + n P n − 1 ( x ) = 0 {\displaystyle (n+1)P_{n+1}(x)-(2n+1)xP_{n}(x)+nP_{n-1}(x)=0} f(x,y) can be written as an infinite series expansion in terms of Legendre polynomials [−1 ≤ x,y ≤ 1.]: f ( x , y ) = ∑ m = 0 ∞ ∑ n = 0 ∞ λ m n P m ( x ) P n ( y ) {\displaystyle f(x,y)=\sum _{m=0}^{\infty }\sum _{n=0}^{\infty }\lambda _{mn}P_{m}(x)P_{n}(y)}

    Read more →
  • Application software

    Application software

    Application software is software that is intended for end-user use – not operating, administering or programming a computer. It includes programs such as word processors, web browsers, media players, and mobile applications used in daily tasks. An application (app, application program, software application) is any program that can be categorized as application software. Application is a subjective classification that is often used to differentiate from system and utility software. Application software represents the user-facing layer of computing systems, designed to translate complex system capabilities into task-oriented, goal-driven workflows. Unlike system software, which focuses on hardware orchestration and resource management, application software is centered on problem abstraction, user interaction, and domain-specific functionality. The abbreviation app became popular with the 2008 introduction of the iOS App Store, to refer to applications for mobile devices such as smartphones and tablets. Later, with the release of the Mac App Store in 2010 and the Windows Store in 2011, it began to be used to refer to end-user software in general, regardless of platform. Applications may be bundled with the computer and its system software or published separately. Applications may be proprietary or open-source. == Terminology == === Meaning program and software === When used as an adjective, application can have a broader meaning than that described in this article. For example, concepts such as application programming interface (API), application server, application virtualization, application lifecycle management and portable application refer to programs and software in general. === Distinction between system and application software === The distinction between system and application software is subjective and has been the subject of controversy. For example, one of the key questions in the United States v. Microsoft Corp. antitrust trial was whether Microsoft's Internet Explorer web browser was part of its Windows operating system or a separate piece of application software. As another example, the GNU/Linux naming controversy is, in part, due to disagreement about the relationship between the Linux kernel and the operating systems built over this kernel. In some types of embedded systems, the application software and the operating system software may be indistinguishable by the user, as in the case of software used to control a VCR, DVD player, or microwave oven. The above definitions may exclude some applications that may exist on some computers in large organizations. For an alternative definition of an app: see Application Portfolio Management. === Killer application === A killer application (killer app, coined in the late 1980s) is an application that is so popular that it causes demand for its host platform to increase. For example, VisiCalc was the first modern spreadsheet software for the Apple II and helped sell the then-new personal computers into offices. For the BlackBerry, it was its email software. === Software suite === As software suite consists of multiple applications bundled together. They usually have related functions, features, and user interfaces, and may be able to interact with each other, e.g. open each other's files. Business applications often come in suites, e.g. Microsoft Office, LibreOffice and iWork, which bundle together a word processor, a spreadsheet, etc.; but suites exist for other purposes, e.g. graphics or music. == Ways to classify == As there so many applications and since their attributes vary so dramatically, there are many different ways to classify them. === By legal aspects === Proprietary software is protected under an exclusive copyright, and a software license grants limited usage rights. Such applications may allow add-ons from third parties. Free and open-source software (FOSS) can be run, distributed, sold, and extended for any purpose. FOSS software released under a free license may be perpetual and also royalty-free. Perhaps, the owner, the holder or third-party enforcer of any right (copyright, trademark, patent, or ius in re aliena) are entitled to add exceptions, limitations, time decays or expiring dates to the license terms of use. Public-domain software is a type of FOSS that is royalty-free and can be run, distributed, modified, reversed, republished, or created in derivative works without any copyright attribution and therefore revocation. It can even be sold, but without transferring the public domain property to other single subjects. Public-domain software can be released under a (un)licensing legal statement, which enforces those terms and conditions for an indefinite duration (for a lifetime, or forever). === By platform === An application can be categorized by the host platform on which it runs. Notable platforms include operating system (native), web browser, cloud computing and mobile. For example a web application runs in a web browser whereas a more traditional, native application runs in the environment of a computer's operating system. There has been a contentious debate regarding web applications replacing native applications for many purposes, especially on mobile devices such as smartphones and tablets. Web apps have indeed greatly increased in popularity for some uses, but the advantages of applications make them unlikely to disappear soon, if ever. Furthermore, the two can be complementary, and even integrated. === Horizontal vs. vertical === Application software can be seen as either horizontal or vertical. Horizontal applications are more popular and widespread, because they are general purpose, for example word processors or databases. Vertical applications are niche products, designed for a particular type of industry or business, or department within an organization. Integrated suites of software will try to handle every specific aspect possible of, for example, manufacturing or banking worker, accounting, or customer service. === By purpose === There are many types of application software: Enterprise Addresses the needs of an entire organization's processes and data flows, across several departments, often in a large distributed environment. Examples include enterprise resource planning systems, customer relationship management (CRM) systems, data replication engines, and supply chain management software. Departmental Software is a sub-type of enterprise software with a focus on smaller organizations or groups within a large organization. (Examples include travel expense management and IT Helpdesk.) Enterprise infrastructure Provides common capabilities needed to support enterprise software systems. (Examples include databases, email servers, and systems for managing networks and security.) Application platform as a service (aPaaS) A cloud computing service that offers development and deployment environments for application services. Knowledge worker Lets users create and manage information, often for and individual media editors may aid in multiple information worker tasks. Content access Used primarily to access content without editing, but may include software that allows for content editing. Such software addresses the needs of individuals and groups to consume digital entertainment and published digital content. (Examples include media players, web browsers, and help browsers.) Educational Related to content access software, but has the content or features adapted for use by educators or students. For example, it may deliver evaluations (tests), track progress through material, or include collaborative capabilities. Simulation Simulates physical or abstract systems for either research, training, or entertainment purposes. Media development Generates print and electronic media for others to consume, most often in a commercial or educational setting. This includes graphic-art software, desktop publishing software, multimedia development software, HTML editors, digital-animation editors, digital audio and video composition, and many others. Engineering Used in developing hardware and software products. This includes computer-aided design (CAD), computer-aided engineering (CAE), computer language editing and compiling tools, integrated development environments, and application programmer interfaces. Entertainment Refers to video games, screen savers, programs to display motion pictures or play recorded music, and other forms of entertainment which can be experienced through the use of a computing device. == Taxonomy == This section is a taxonomy of kinds of applications. This organization is but one of many different ways to organize them. A kind is included in only one category even if it logically fits in multiple. === General-purpose === Calculator Spreadsheet Web browser Web mapping E-commerce Social media === Communication === Chat Email Presentation software Phone Messages Networking software Web conferencing === Documentation === Desktop

    Read more →
  • Computational photography

    Computational photography

    Computational photography refers to digital image capture and processing techniques that use digital computation instead of optical processes. Computational photography can improve the capabilities of a camera, or introduce features that were not possible at all with film-based photography, or reduce the cost or size of camera elements. Examples of computational photography include in-camera computation of digital panoramas, high-dynamic-range images, and light field cameras. Light field cameras use novel optical elements to capture three-dimensional scene information, which can then be used to produce 3D images, enhanced depth-of-field, and selective de-focusing (or "post focus"). Enhanced depth-of-field reduces the need for mechanical focusing systems. All of these features use computational imaging techniques. The definition of computational photography has evolved to cover a number of subject areas in computer graphics, computer vision, and applied optics. These areas are given below, organized according to a taxonomy proposed by Shree K. Nayar. Within each area is a list of techniques, and for each technique, one or two representative papers or books are cited. Deliberately omitted from the taxonomy are image processing (see also digital image processing) techniques applied to traditionally captured images to produce better images. Examples of such techniques are image scaling, dynamic range compression (i.e. tone mapping), color management, image completion (a.k.a. inpainting or hole filling), image compression, digital watermarking, and artistic image effects. Also omitted are techniques that produce range data, volume data, 3D models, 4D light fields, 4D, 6D, or 8D BRDFs, or other high-dimensional image-based representations. Epsilon photography is a sub-field of computational photography. == Effect on photography == Photos taken using computational photography can allow amateurs to produce photographs rivalling the quality of professional photographers, but as of 2019 do not outperform the use of professional-level equipment. == Computational illumination == This is controlling photographic illumination in a structured fashion, then processing the captured images, to create new images. The applications include image-based relighting, image enhancement, image deblurring, geometry/material recovery and so forth. High-dynamic-range imaging uses differently exposed pictures of the same scene to extend dynamic range. Other examples include processing and merging differently illuminated images of the same subject matter ("lightspace"). == Computational optics == This is a capture of optically coded images, followed by computational decoding to produce new images. Coded aperture imaging was mainly applied in astronomy and X-ray imaging to boost the image quality. Instead of a single pin-hole, a pinhole pattern is applied in imaging, and deconvolution is performed to recover the image. In coded exposure imaging, the on/off state of the shutter is coded to modify the kernel of motion blur. In this way, motion deblurring becomes a well-conditioned problem. Similarly, in a lens based coded aperture, the aperture can be modified by inserting a broadband mask. Thus, out of focus deblurring becomes a well-conditioned problem. The coded aperture can also improve the quality in light field acquisition using Hadamard transform optics. Coded aperture patterns can also be designed using color filters, in order to apply different codes at different wavelengths. This allows for increase the amount of light that reaches the camera sensor, compared to binary masks. == Computational imaging == Computational imaging is a set of imaging techniques that combine data acquisition and data processing to create the image of an object through indirect means to yield enhanced resolution, additional information such as optical phase or 3D reconstruction. The information is often recorded without using a conventional optical microscope configuration or with limited datasets. Computational imaging allows going beyond physical limitations of optical systems, such as numerical aperture, or even obliterates the need for optical elements. For parts of the optical spectrum where imaging elements such as objectives are difficult to manufacture or image sensors cannot be miniaturized, computational imaging provides useful alternatives, in fields such as X-ray and THz radiations. === Common techniques === Among common computational imaging techniques are lensless imaging, computational speckle imaging , ptychography and Fourier ptychography. Computational imaging technique often draws on compressive sensing or phase retrieval techniques, where the angular spectrum of the object is reconstructed. Other techniques are related to the field of computational imaging, such as digital holography, computer vision and inverse problems such as tomography. == Computational processing == This is the processing of non-optically-coded images to produce new images. == Computational sensors == These are detectors that combine sensing and processing, typically in hardware, like the oversampled binary image sensor. == Early work in computer vision == Although computational photography is a currently popular buzzword in computer graphics, many of its techniques first appeared in the computer vision literature, either under other names or within papers aimed at 3D shape analysis. == Art history == Computational photography, as an art form, has been practiced by capturing differently exposed pictures of the same subject matter and combining them. This was the inspiration for the development of the wearable computer in the 1970s and early 1980s. Computational photography was inspired by the work of Charles Wyckoff, and thus computational photography datasets (e.g. differently exposed pictures of the same subject matter that are taken in order to make a single composite image) are sometimes referred to as Wyckoff Sets, in his honor. Early work in this area (joint estimation of image projection and exposure value) was undertaken by Mann and Candoccia. Charles Wyckoff devoted much of his life to creating special kinds of 3-layer photographic films that captured different exposures of the same subject matter. A picture of a nuclear explosion, taken on Wyckoff's film, appeared on the cover of Life Magazine and showed the dynamic range from the dark outer areas to the inner core.

    Read more →
  • Perceptual robotics

    Perceptual robotics

    Perceptual robotics is an interdisciplinary science linking Robotics and Neuroscience. It investigates biologically motivated robot control strategies, concentrating on perceptual rather than cognitive processes and thereby sides with J. J. Gibson's view against the Poverty of the stimulus theory. As a working definition, the following quote from Chapter 64 by H. Bülthoff, C. Wallraven and M. Giese from The Springer Handbook of Robotics, edited by Bruno Siciliano and Oussama Khatib, published by Springer in 2007, could be used: In the following we will apply the term Perceptual Robotics to signify the design of robots based on principles that are derived from human perception on all three levels in the sense of Marr. This includes a realization in terms of specific neural circuits as well as the transfer of more abstract biologically-inspired strategies for the solution of relevant computational problems.

    Read more →
  • Point-set registration

    Point-set registration

    In computer vision, pattern recognition, and robotics, point-set registration, also known as point-cloud registration or scan matching, is the process of finding a spatial transformation (e.g., scaling, rotation and translation) that aligns two point clouds. The purpose of finding such a transformation includes merging multiple data sets into a globally consistent model (or coordinate frame), and mapping a new measurement to a known data set to identify features or to estimate its pose. Raw 3D point cloud data are typically obtained from Lidars and RGB-D cameras. 3D point clouds can also be generated from computer vision algorithms such as triangulation, bundle adjustment, and more recently, monocular image depth estimation using deep learning. For 2D point set registration used in image processing and feature-based image registration, a point set may be 2D pixel coordinates obtained by feature extraction from an image, for example corner detection. Point cloud registration has extensive applications in autonomous driving, motion estimation and 3D reconstruction, object detection and pose estimation, robotic manipulation, simultaneous localization and mapping (SLAM), panorama stitching, virtual and augmented reality, and medical imaging. As a special case, registration of two point sets that only differ by a 3D rotation (i.e., there is no scaling and translation), is called the Wahba Problem and also related to the orthogonal procrustes problem. == Formulation == The problem may be summarized as follows: Let { M , S } {\displaystyle \lbrace {\mathcal {M}},{\mathcal {S}}\rbrace } be two finite size point sets in a finite-dimensional real vector space R d {\displaystyle \mathbb {R} ^{d}} , which contain M {\displaystyle M} and N {\displaystyle N} points respectively (e.g., d = 3 {\displaystyle d=3} recovers the typical case of when M {\displaystyle {\mathcal {M}}} and S {\displaystyle {\mathcal {S}}} are 3D point sets). The problem is to find a transformation to be applied to the moving "model" point set M {\displaystyle {\mathcal {M}}} such that the difference (typically defined in the sense of point-wise Euclidean distance) between M {\displaystyle {\mathcal {M}}} and the static "scene" set S {\displaystyle {\mathcal {S}}} is minimized. In other words, a mapping from R d {\displaystyle \mathbb {R} ^{d}} to R d {\displaystyle \mathbb {R} ^{d}} is desired which yields the best alignment between the transformed "model" set and the "scene" set. The mapping may consist of a rigid or non-rigid transformation. The transformation model may be written as T {\displaystyle T} , using which the transformed, registered model point set is: The output of a point set registration algorithm is therefore the optimal transformation T ⋆ {\displaystyle T^{\star }} such that M {\displaystyle {\mathcal {M}}} is best aligned to S {\displaystyle {\mathcal {S}}} , according to some defined notion of distance function dist ⁡ ( ⋅ , ⋅ ) {\displaystyle \operatorname {dist} (\cdot ,\cdot )} : where T {\displaystyle {\mathcal {T}}} is used to denote the set of all possible transformations that the optimization tries to search for. The most popular choice of the distance function is to take the square of the Euclidean distance for every pair of points: where ‖ ⋅ ‖ 2 {\displaystyle \|\cdot \|_{2}} denotes the vector 2-norm, s m {\displaystyle s_{m}} is the corresponding point in set S {\displaystyle {\mathcal {S}}} that attains the shortest distance to a given point m {\displaystyle m} in set M {\displaystyle {\mathcal {M}}} after transformation. Minimizing such a function in rigid registration is equivalent to solving a least squares problem. == Types of algorithms == When the correspondences (i.e., s m ↔ m {\displaystyle s_{m}\leftrightarrow m} ) are given before the optimization, for example, using feature matching techniques, then the optimization only needs to estimate the transformation. This type of registration is called correspondence-based registration. On the other hand, if the correspondences are unknown, then the optimization is required to jointly find out the correspondences and transformation together. This type of registration is called simultaneous pose and correspondence registration. === Rigid registration === Given two point sets, rigid registration yields a rigid transformation which maps one point set to the other. A rigid transformation is defined as a transformation that does not change the distance between any two points. Typically such a transformation consists of translation and rotation. In rare cases, the point set may also be mirrored. In robotics and computer vision, rigid registration has the most applications. === Non-rigid registration === Given two point sets, non-rigid registration yields a non-rigid transformation which maps one point set to the other. Non-rigid transformations include affine transformations such as scaling and shear mapping. However, in the context of point set registration, non-rigid registration typically involves nonlinear transformation. If the eigenmodes of variation of the point set are known, the nonlinear transformation may be parametrized by the eigenvalues. A nonlinear transformation may also be parametrized as a thin plate spline. === Other types === Some approaches to point set registration use algorithms that solve the more general graph matching problem. However, the computational complexity of such methods tend to be high and they are limited to rigid registrations. In this article, we will only consider algorithms for rigid registration, where the transformation is assumed to contain 3D rotations and translations (possibly also including a uniform scaling). The PCL (Point Cloud Library) is an open-source framework for n-dimensional point cloud and 3D geometry processing. It includes several point registration algorithms. == Correspondence-based registration == Correspondence-based methods assume the putative correspondences m ↔ s m {\displaystyle m\leftrightarrow s_{m}} are given for every point m ∈ M {\displaystyle m\in {\mathcal {M}}} . Therefore, we arrive at a setting where both point sets M {\displaystyle {\mathcal {M}}} and S {\displaystyle {\mathcal {S}}} have N {\displaystyle N} points and the correspondences m i ↔ s i , i = 1 , … , N {\displaystyle m_{i}\leftrightarrow s_{i},i=1,\dots ,N} are given. === Outlier-free registration === In the simplest case, one can assume that all the correspondences are correct, meaning that the points m i , s i ∈ R 3 {\displaystyle m_{i},s_{i}\in \mathbb {R} ^{3}} are generated as follows:where l > 0 {\displaystyle l>0} is a uniform scaling factor (in many cases l = 1 {\displaystyle l=1} is assumed), R ∈ SO ( 3 ) {\displaystyle R\in {\text{SO}}(3)} is a proper 3D rotation matrix ( SO ( d ) {\displaystyle {\text{SO}}(d)} is the special orthogonal group of degree d {\displaystyle d} ), t ∈ R 3 {\displaystyle t\in \mathbb {R} ^{3}} is a 3D translation vector and ϵ i ∈ R 3 {\displaystyle \epsilon _{i}\in \mathbb {R} ^{3}} models the unknown additive noise (e.g., Gaussian noise). Specifically, if the noise ϵ i {\displaystyle \epsilon _{i}} is assumed to follow a zero-mean isotropic Gaussian distribution with standard deviation σ i {\displaystyle \sigma _{i}} , i.e., ϵ i ∼ N ( 0 , σ i 2 I 3 ) {\displaystyle \epsilon _{i}\sim {\mathcal {N}}(0,\sigma _{i}^{2}I_{3})} , then the following optimization can be shown to yield the maximum likelihood estimate for the unknown scale, rotation and translation:Note that when the scaling factor is 1 and the translation vector is zero, then the optimization recovers the formulation of the Wahba problem. Despite the non-convexity of the optimization (cb.2) due to non-convexity of the set SO ( 3 ) {\displaystyle {\text{SO}}(3)} , seminal work by Berthold K.P. Horn showed that (cb.2) actually admits a closed-form solution, by decoupling the estimation of scale, rotation and translation. Similar results were discovered by Arun et al. In addition, in order to find a unique transformation ( l , R , t ) {\displaystyle (l,R,t)} , at least N = 3 {\displaystyle N=3} non-collinear points in each point set are required. More recently, Briales and Gonzalez-Jimenez have developed a semidefinite relaxation using Lagrangian duality, for the case where the model set M {\displaystyle {\mathcal {M}}} contains different 3D primitives such as points, lines and planes (which is the case when the model M {\displaystyle {\mathcal {M}}} is a 3D mesh). Interestingly, the semidefinite relaxation is empirically tight, i.e., a certifiably globally optimal solution can be extracted from the solution of the semidefinite relaxation. === Robust registration === The least squares formulation (cb.2) is known to perform arbitrarily badly in the presence of outliers. An outlier correspondence is a pair of measurements s i ↔ m i {\displaystyle s_{i}\leftrightarrow m_{i}} that departs from the generative model (cb.1). In this case, one can consider a differen

    Read more →
  • Eugene Goostman

    Eugene Goostman

    Eugene Goostman is a chatbot that some regard as having passed the Turing test, a test of a computer's ability to communicate indistinguishably from a human. Developed in Saint Petersburg in 2001 by a group of three programmers, the Russian-born Vladimir Veselov, Ukrainian-born Eugene Demchenko, and Russian-born Sergey Ulasen, Goostman is portrayed as a 13-year-old Ukrainian boy—characteristics that are intended to induce forgiveness in those with whom it interacts for its grammatical errors and lack of general knowledge. The Goostman bot has competed in a number of Turing test contests since its creation, and finished second in the 2005 and 2008 Loebner Prize contest. In June 2012, at an event marking what would have been the 100th birthday of the test's author, Alan Turing, Goostman won a competition promoted as the largest-ever Turing test contest, in which it successfully convinced 29% of its judges that it was human. On 7 June 2014, at a contest marking the 60th anniversary of Turing's death, 33% of the event's judges thought that Goostman was human; the event's organiser Kevin Warwick considered it to have passed Turing's test as a result, per Turing's prediction in his 1950 paper "Computing Machinery and Intelligence", that by the year 2000, machines would be capable of fooling 30% of human judges after five minutes of questioning. The validity and relevance of the announcement of Goostman's pass was questioned by critics, who noted the exaggeration of the achievement by Warwick, the bot's use of personality quirks and humour in an attempt to misdirect users from its non-human tendencies and lack of real intelligence, along with "passes" achieved by other chatbots at similar events. == Personality == Eugene Goostman is portrayed as being a 13-year-old boy from Odesa, Ukraine, who has a pet guinea pig and a father who is a gynaecologist. Veselov stated that Goostman was designed to be a "character with a believable personality". The choice of age was intentional, as, in Veselov's opinion, a thirteen-year-old is "not too old to know everything and not too young to know nothing". Goostman's young age also induces people who "converse" with him to forgive minor grammatical errors in his responses. In 2014, work was made on improving the bot's "dialog controller", allowing Goostman to output more human-like dialogue. A conversation between Scott Aaronson and Eugene Goostman ran as follows: == Competitions == Eugene Goostman has competed in a number of Turing test competitions, including the Loebner Prize contest; it finished joint second in the Loebner test in 2001, and came second to Jabberwacky in 2005 and to Elbot in 2008. On 23 June 2012, Goostman won a Turing test competition at Bletchley Park in Milton Keynes, held to mark the centenary of its namesake, Alan Turing. The competition, which featured five bots, twenty-five hidden humans, and thirty judges, was considered to be the largest-ever Turing test contest by its organizers. After a series of five-minute-long text conversations, 29% of the judges were convinced that the bot was an actual human. === 2014 "pass" === On 7 June 2014, in a Turing test competition at the Royal Society, organised by Kevin Warwick of the University of Reading to mark the 60th anniversary of Turing's death, Goostman won after 33% of the judges were convinced that the bot was human. 30 judges took part in the event, which included Lord Sharkey, a sponsor of Turing's posthumous pardon, artificial intelligence Professor Aaron Sloman, Fellow of the Royal Society Mark Pagel and Red Dwarf actor Robert Llewellyn. Each judge partook in a textual conversation with each of the five bots; at the same time, they also conversed with a human. In all, a total of 300 conversations were conducted. In Warwick's view, this made Goostman the first machine to pass a Turing test. In a press release, he added that: Some will claim that the Test has already been passed. The words Turing Test have been applied to similar competitions around the world. However this event involved more simultaneous comparison tests than ever before, was independently verified and, crucially, the conversations were unrestricted. A true Turing Test does not set the questions or topics prior to the conversations. In his 1950 paper "Computing Machinery and Intelligence", Turing predicted that by the year 2000, computer programs would be sufficiently advanced that the average interrogator would, after five minutes of questioning, "not have more than 70 per cent chance" of correctly guessing whether they were speaking to a human or a machine. Although Turing phrased this as a prediction rather than a "threshold for intelligence", commentators believe that Warwick had chosen to interpret it as meaning that if 30% of interrogators were fooled, the software had "passed the Turing test". ==== Reactions ==== Warwick's claim that Eugene Goostman was the first ever chatbot to pass a Turing test was met with scepticism; critics acknowledged similar "passes" made in the past by other chatbots under the 30% criteria, including PC Therapist in 1991 (which tricked 5 of 10 judges, 50%), and at the Techniche festival in 2011, where a modified version of Cleverbot tricked 59.3% of 1334 votes (which included the 30 judges, along with an audience). Cleverbot's developer, Rollo Carpenter, argued that Turing tests can only prove that a machine can "imitate" intelligence rather than show actual intelligence. Gary Marcus was critical of Warwick's claims, arguing that Goostman's "success" was only the result of a "cleverly-coded piece of software", going on to say that "it's easy to see how an untrained judge might mistake wit for reality, but once you have an understanding of how this sort of system works, the constant misdirection and deflection becomes obvious, even irritating. The illusion, in other words, is fleeting." While acknowledging IBM's Deep Blue and Watson projects—single-purpose computer systems meant for playing chess and the quiz show Jeopardy! respectively—as examples of computer systems that show a degree of intelligence in their specialised field, he further argued that they were not an equivalent to a computer system that shows "broad" intelligence, and could—for example, watch a television programme and answer questions on its content. Marcus stated that "no existing combination of hardware and software can learn completely new things at will the way a clever child can." However, he still believed that there were potential uses for technology such as that of Goostman, specifically suggesting the creation of "believable", interactive video game characters. Imperial College London professor Murray Shanahan questioned the validity and scientific basis of the test, stating that it was "completely misplaced, and it devalues real AI research. It makes it seem like science fiction AI is nearly here, when in fact it's not and it's incredibly difficult." Mike Masnick, editor of the blog Techdirt, was also skeptical, questioning publicity blunders such as the five chatbots being referred to in press releases as "supercomputers", and saying that "creating a chatbot that can fool humans is not really the same thing as creating artificial intelligence."

    Read more →
  • Pronunciation assessment

    Pronunciation assessment

    Automatic pronunciation assessment uses computer speech recognition to determine how accurately speech has been pronounced, instead of relying on a human instructor or proctor. It is also called speech verification, pronunciation evaluation, and pronunciation scoring. This technology is used to grade speech quality, for language testing, for computer-aided pronunciation teaching (CAPT) in computer-assisted language learning (CALL), for speaking skill remediation, and for accent reduction. Pronunciation assessment is different from dictation or automatic transcription, because instead of determining unknown speech, it verifies learners' pronunciation of known word(s), often from prior transcription of the same utterance; ideally scoring the intelligibility of the learners' speech. Sometimes pronunciation assessment evaluates the prosody of the learners' speech, such as intonation, pitch, tempo, rhythm, and syllable and word stress, although those are usually not essential for being understood in most languages. Pronunciation assessment is also used in reading tutoring, for example in products from Google, Microsoft, and Amira Learning. Automatic pronunciation assessment can also be used to help diagnose and treat speech disorders such as apraxia. == Intelligibility == Intelligibility refers to how well a learner's utterance is understood by a listener, rather than how much it sounds like a native speaker. This is separate from measures of fluency, such as so-called "Goodness of Pronunciation" (GoP) scores, which estimate how closely an utterance aligns with those of native speakers. Intelligibility is widely regarded as the most important communicative goal in pronunciation teaching and assessment. For example, in the Common European Framework of Reference for Languages (CEFR) assessment criteria for "overall phonological control", intelligibility outweighs formally correct pronunciation at all levels. Studies in applied linguistics have shown that accent reduction does not always increase intelligibility because listeners can often comprehend heavily accented speech without difficulty. Pronunciation assessment systems often rely on acoustic methods such as GoP which compare learner speech to reference models to produce phoneme-level scores, which are in turn aggregated to produce word and phrase scores. While these methods are effective for identifying deviations from native speakers' utterances, they do not effectively measure how understandable speech is to human listeners. Intelligibility is influenced by broader linguistic and contextual factors such as stress placement, speech rate, and coarticulation, which are not represented in purely segmental scores. The earliest work on pronunciation assessment avoided measuring genuine listener intelligibility, a shortcoming corrected in 2011 at the Toyohashi University of Technology, and included in the Versant high-stakes English fluency assessment from Pearson and mobile apps from 17zuoye Education & Technology, but still missing in 2023 products from Google Search, Microsoft, Educational Testing Service, Speechace, and ELSA. Assessing authentic listener intelligibility is essential for avoiding inaccuracies from accent bias, especially in high-stakes assessments; from words with multiple correct pronunciations; and from phoneme coding errors in machine-readable pronunciation dictionaries. In 2022, researchers found that some newer speech-to-text systems, based on end-to-end reinforcement learning to map audio signals directly into words, produce word and phrase confidence scores (from 10-25ms audio frame logit aggregation) closely correlated with genuine listener intelligibility. Others have been able to assess intelligibility using Levenshtein or dynamic time warping distance measures from Wav2Vec2 representation of good speech. Further work through 2025 has focused specifically on measuring intelligibility. A 2025 study of 42 pronunciation and speech coaching apps (32 mobile and 10 web) found that none offered intelligibility assessment. Instead, most provided only segmental and accent-focused scoring. About two-thirds of the apps provided some form of specific pronunciation feedback, usually with phonetic transcriptions, but accompanied by visual cues (such as animations of the vocal tract or the lips and tongue from the front) in only about 5% of the apps. Less than a third provided feedback on learner perception of exemplar speech. == Evaluation == Although there are as yet no industry-standard benchmarks for evaluating pronunciation assessment accuracy, researchers occasionally release evaluation speech corpuses for others to use for improving assessment quality. Such evaluation databases often emphasize formally unaccented pronunciation to the exclusion of genuine intelligibility evident from blinded listener transcriptions. As of mid-2025, state of the art approaches for automatically transcribing phonemes typically achieve an error rate of about 10% from known good speech. The International Speech Communication Association (ISCA) 2025 Workshop on Speech and Language Technology in Education (SLaTE) administered a Speak & Improve Challenge: Spoken Language Assessment and Feedback, introducing benchmarks for evaluating pronunciation assessment and remediation systems across languages, accents, and learner populations. The challenge emphasized cross-lingual generalization and alignment with human intelligibility judgments, for more robust and interpretable assessment systems. Ethical issues in pronunciation assessment are present in both human and automatic methods. Authentic validity, fairness, and mitigating bias in evaluation are all crucial. Diverse speech data should be included in automatic pronunciation assessment models. Combining human judgments, especially blinded transcriptions from a wide diversity of listeners, with automated feedback can improve accuracy and fairness. Second language learners benefit substantially from their use of widely available speech recognition systems for dictation, virtual assistants, and AI chatbots. In such systems, users naturally try to correct their own errors evident in speech recognition results that they notice. Such use improves their grammar and vocabulary development along with their pronunciation skills. The extent to which explicit pronunciation assessment and remediation approaches improve on such self-directed interactions remains an open question. Similarly, automatic dictation results have been shown to reflect intelligibility about as well as human scorers. == Recent developments == During 2021–22, a smartphone-based CAPT system was used to sense articulation through both audible and inaudible signals, providing feedback at the phoneme level. Some promising areas for improvement which were being developed in 2024 include articulatory feature extraction and transfer learning to suppress unnecessary corrections. Other interesting advances under development include "augmented reality" interfaces for mobile devices using optical character recognition to provide pronunciation training on text found in user environments. In 2024, audio multimodal large language models were first described as assessing pronunciation. That work has been carried forward by other researchers in 2025 who report positive results. Subsequently, researchers demonstrated pronunciation scoring by providing a language model with textual descriptions of speech, including the speech-to-text transcript, phoneme sequences, pauses, and phoneme sequence matching; this approach can achieve performance similar to multimodal LLMs that analyze raw audio while avoiding their higher computational cost. In 2025, the Duolingo English Test authors published a description of their pronunciation assessment method, purportedly built to measure intelligibility rather than accent imitation. While achieving a correlation of 0.82 with expert human ratings, very close to inter-rater agreement and outperforming alternative methods, the method is nonetheless based on experts' scores along the six-point CEFR common reference levels scale, instead of actual blinded listener transcriptions. Further promising work in 2025 includes assessment feedback aligning learner speech to synthetic utterances using interpretable features, identifying continuous spans of words for remediation feedback; synthesizing corrected speech matching learners' self-perceived voices, which they prefer and imitate more accurately as corrections; and streaming such interactions. On January 21, 2026, Educational Testing Service's TOEFL iBT high-stakes English language test, required by US university admissions and employers from English as a foreign language applicants more often than all other internet-based tests combined, changed its speaking assessments. While official rubrics claim that the new scoring will be based primarily on intelligibility, the new test's technical description indicates that it ju

    Read more →
  • Face Swap Live

    Face Swap Live

    Face Swap Live is a mobile app created by Laan Labs that enables users to swap faces with another person in real-time using the device's camera. It was released on December 14, 2015. In addition to swapping faces with another person, the app enables users to create videos using a set of bundled live filters. The app is available on iOS and Android devices. Face Swap Live was named Apple's #2 best-selling paid app in 2016.

    Read more →
  • Application-release automation

    Application-release automation

    Application-release automation (ARA) refers to the process of packaging and deploying an application or update of an application from development, across various environments, and ultimately to production. ARA solutions must combine the capabilities of deployment automation, environment management and modeling, and release coordination. == Relationship with DevOps == ARA tools help cultivate DevOps best practices by providing a combination of automation, environment modeling and workflow-management capabilities. These practices help teams deliver software rapidly, reliably and responsibly. ARA tools achieve a key DevOps goal of implementing continuous delivery with a large quantity of releases quickly. == Relationship with deployment == ARA is more than just software-deployment automation – it deploys applications using structured release-automation techniques that allow for an increase in visibility for the whole team. It combines workload automation and release-management tools as they relate to release packages, as well as movement through different environments within the DevOps pipeline. ARA tools help regulate deployments, how environments are created and deployed, and how and when releases are deployed. == ARA Solutions == All ARA solutions must include capabilities in automation, environment modeling, and release coordination. Additionally, the solution must provide this functionality without reliance on other tools.

    Read more →
  • AsoSoft text corpus

    AsoSoft text corpus

    The AsoSoft text corpus is the first large-scale Kurdish text corpus, collected and processed by the AsoSoft research and development group. It contains 458,000 documents (188 million tokens) that are collected from sources such as websites, news agencies, books, and magazines. The corpus is partially tagged by topic, so it can be used for topic identification tasks. Also, it is applicable for extracting language model and computational lexicon information. Part of the corpus (75 million tokens) is available online for non-commercial use. The corpus uses the TEI format.

    Read more →