AI Data Trainer

AI Data Trainer — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Sketch Engine

    Sketch Engine

    Sketch Engine is a corpus manager and text analysis software developed by Lexical Computing since 2003. Its purpose is to enable people studying language behaviour (lexicographers, researchers in corpus linguistics, translators or language learners) to search large text collections according to complex and linguistically motivated queries. Sketch Engine gained its name after one of the key features, word sketches: one-page, automatic, corpus-derived summaries of a word's grammatical and collocational behaviour. Currently, it supports and provides corpora in over 100 languages. == History of development == Sketch Engine is a product of Lexical Computing, a company founded in 2003 by the lexicographer and research scientist Adam Kilgarriff. He started a collaboration with Pavel Rychlý, a computer scientist working at the Natural Language Processing Centre, Masaryk University, and the developer of Manatee and Bonito (two major parts of the software suite). Kilgarriff also introduced the concept of word sketches. Since then, Sketch Engine has been commercial software, however, all the core features of Manatee and Bonito that were developed by 2003 (and extended since then) are freely available under the GPL license within the NoSketch Engine suite. == Features == A list of tools available in Sketch Engine: Word sketches – a one-page automatic derived summary of a word's grammatical and collocational behaviour Word sketch difference – compares and contrasts two words by analysing their collocations Distributional thesaurus – automated thesaurus for finding words with similar meaning or appearing in the same/similar context Concordance search – finds occurrences of a word form, lemma, phrase, tag or complex structure Collocation search – word co-occurrence analysis displaying the most frequent words (for a search word) which can be regarded as collocation candidates Word lists – generates frequency lists which can be filtered with complex criteria n-grams – generates frequency lists of multi-word expressions Terminology / Keyword extraction (both monolingual and bilingual) – automatic extraction of key words and multi-word terms from texts (based on frequency count and linguistic criteria) Diachronic analysis (Trends) – detecting words which undergo changes in the frequency of use in time (show trending words) Corpus building and management – create corpora from the Web or uploaded texts including part-of-speech tagging and lemmatization which can be used as data mining software Parallel corpus (bilingual) facilities – looking up translation examples (EUR-Lex corpus, Europarl corpus, OPUS corpus, etc.) or building a parallel corpus from own aligned texts Text type analysis – statistics of metadata in the corpus === Keywords and terminology extraction === Sketch Engine can perform automatic term extraction by identifying words typical of a particular corpus, document, or text. Single words and multi-word units can be extracted from monolingual or bilingual texts. The terminology extraction feature provides a list of relevant terms based on comparison with a large corpus of general language. This functionality is also available as a separate service called OneClick Terms with a dedicated interface. === SKELL === A free web service based on Sketch Engine and aimed at language learners and teachers is SKELL (formerly SkELL). It exploits Sketch Engine's proprietary GDEX (Good Dictionary Examples) scoring function to provide authentic example sentences for specific target words. Results are drawn from a special corpus of high-quality texts covering everyday, standard, formal, and professional language and displayed as a concordance. SKELL also includes simplified versions of Sketch Engine's word sketch and thesaurus functions. It has been suggested that SKELL can be used, for instance, to help students understand the meaning and/or usage of a word or phrase; to help teachers wanting to use example sentences in a class; to discover and explore collocates; to create gap-fill exercises; to teach various kinds of homonyms and polysemous words. SKELL was first presented in 2014, when only English was supported. Later, support was added for Russian, Czech, German, Italian and Estonian. == List of text corpora == Sketch Engine provides access to more than 800 text corpora. There are monolingual as well as multilingual corpora of different sizes (from one thousand words up to 85 billion words) and various sources (e.g. web, books, subtitles, legal documents). The list of corpora includes British National Corpus, Brown Corpus, Cambridge Academic English Corpus and Cambridge Learner Corpus, CHILDES corpora of child language, OpenSubtitles (a set of 60 parallel corpora), 24 multilingual corpora of EUR-Lex documents, the TenTen Corpus Family (multi-billion web corpora), and Trends corpora (monitor corpora with daily updates). == Architecture == Sketch Engine consists of three main components: an underlying database management system called Manatee, a web interface search front-end called Bonito, and a web interface for corpus building and management called Corpus Architect. === Manatee === Manatee is a database management system specifically devised for effective indexing of large text corpora. It is based on the idea of inverted indexing (keeping an index of all positions of a given word in the text). It has been used to index text corpora comprising tens of billions of words. Searching corpora indexed by Manatee is performed by formulating queries in the Corpus Query Language (CQL). Manatee is written in C++ and offers an API for a number of other programming languages including Python, Java, Perl and Ruby. Recently, it was rewritten into Go for faster processing of corpus queries. === Bonito === Bonito is a web interface for Manatee providing access to corpus search. In the client–server model, Manatee is the server and Bonito plays the client part. It is written in Python. === Corpus Architect === Corpus Architect is a web interface providing corpus building and management features. It is also written in Python. == Applications == Sketch Engine has been used by major British and other publishing houses for producing dictionaries such as Macmillan English Dictionary, Dictionnaires Le Robert, Oxford University Press or Shogakukan. Four of United Kingdom's five biggest dictionary publishers use Sketch Engine.

    Read more →
  • Cups (app)

    Cups (app)

    Cups (stylized as CUPS) was a mobile app launched in New York City in April 2014. It was a mobile payment and discovery platform for independent coffee shops nearby. The app was active in more than 400 cafes in New York, San Francisco, Philadelphia, Nashville, Minneapolis and Saint Paul, and other U.S. cities. == History == Cups was founded in Israel in 2012 by Gilad Rotem and four other co-founders, who were all high school friends. The company ran a limited beta pilot in Tel Aviv and Jerusalem, featuring 80 locations, from September 2012 until September 2014. Customers received all-you-can-drink coffee at certain coffee shops in Tel Aviv for approximately $45 a month. In October 2013, the founders relocated to New York. Cups participated in the Entrepreneur's Roundtable Accelerator program and went live in New York in 2014, initially working with 50 small coffee shops in Manhattan and Brooklyn. In early 2016, the company launched 30 locations in Philadelphia in February, followed by 40 more locations in San Francisco in March. == Functionality == The Cups app gave the user a list of the nearest participating coffee shops to their current location. The app user can order a drink using the app and pay the cashier with their phone. The cashier would enter a code that entered the purchase into the app's system. The app also allowed for onboard tipping and food purchases. The company reimbursed the coffee shop and kept a portion of their sales. In early 2016, the Cups Café Network was launched, using bulk purchasing power to land discounts with service providers which would normally be reserved for larger chains. In this way, the company aimed to help its café partners compete with the larger coffee chains.

    Read more →
  • Apps to analyse COVID-19 sounds

    Apps to analyse COVID-19 sounds

    Apps to analyse COVID-19 sounds are mobile software applications designed to collect respiratory sounds and aid diagnosis in response to the COVID-19 pandemic. Numerous applications are in development, with different institutions and companies taking various approaches to privacy and data collection. Current efforts are aimed at gathering data. In a later stage, it is possible that sound apps will have the capacity (and ethical approvals) to provide information back to users. In order to develop and train signal analysis approaches, large datasets are required. == History == The COVID-19 outbreak was announced as a global pandemic by the World Health Organization in March 2020 and has affected a growing number of people globally. In this context, advanced artificial intelligence techniques are being considered as tools in aiding our response to global health crisis. Other COVID-19 apps which offer solutions for user tracking have been developed. At the same time a number of approaches which tries to use respiratory sounds and artificial intelligence to understand if the disease can be diagnosed have been proposed. A few studies are available as preprints (i.e. not yet peer-reviewed) documents. == Methodologies == The potential for using speech and sound analysis by artificial intelligence to help in this scenario, by surveying which types of related or contextually significant phenomena can be automatically assessed from speech or sound has been recently overviewed. These include the automatic recognition and monitoring of breathing, dry and wet coughing or sneezing sounds, speech under cold, eating behaviour, sleepiness, or pain. Additionally, the potential use-cases of intelligent speech analysis for COVID-19 diagnosed patients has also been presented. In particular, by analysing speech recordings from these patients, an audio-only-based model to automatically categorise the health state of patients from four aspects, including the severity of illness, sleep quality, fatigue, and anxiety, is constructed. This work shows promise in estimating the severity of illness. Machine learning methods have been explored to recognize and diagnose coughs from different diseases. These included a low complexity, automated recognition and diagnostic tool for screening respiratory infections that utilizes convolutional neural networks (CNNs) to detect cough within environment audio and diagnose three potential illnesses (i.e. bronchitis, bronchiolitis and pertussis) based on their unique cough audio features. A large-scale crowdsourced dataset of respiratory sounds has been collected to aid diagnosis of COVID-19: coughs and breathing sounds are sufficient to distinguish users affected by COVID-19 versus those affected by asthma or healthy controls. Behind these studies is the ambition that automated systems to screen for respiratory diseases based on voice, raw cough or other sound data would have positive medical applications in both clinical and public health arenas. == List of apps to analyse COVID-19 sounds ==

    Read more →
  • Minimum resolvable contrast

    Minimum resolvable contrast

    Minimum resolvable contrast (MRC) is a subjective measure of a visible spectrum sensor’s or camera's sensitivity and ability to resolve data. A snapshot image of a series of three bar targets of selected spatial frequencies and various contrast coatings captured by the unit under test (UUT) is used to determine the MRC of the UUT, i.e., the visible spectrum camera or sensor. A trained observer selects the smallest target resolvable at each contrast level. Typically, specialized computer software collects the inputted data of the observer and provides a graph of contrast vs. spatial frequency at a given luminance level. A first order polynomial is fitted to the data and an MRC curve of spatial frequency versus contrast is generated.

    Read more →
  • Artificial intelligence in hiring

    Artificial intelligence in hiring

    Artificial intelligence can be used to automate aspects of the job recruitment process. Advances in artificial intelligence, such as the advent of machine learning and the growth of big data, enable AI to be utilized to recruit, screen, and predict the success of applicants. Proponents of artificial intelligence in hiring claim it reduces bias, assists with finding qualified candidates, and frees up human resource workers' time for other tasks, while opponents worry that AI perpetuates inequalities in the workplace and will eliminate jobs. Despite the potential benefits, the ethical implications of AI in hiring remain a subject of debate, with concerns about algorithmic transparency, accountability, and the need for ongoing oversight to ensure fair and unbiased decision-making throughout the recruitment process. == Background == It is common for companies to use AI to automate aspects of their hiring process, especially the hospitality, finance, and tech industries. == Uses == === Screeners === Screeners are tests that allow companies to sift through a large applicant pool and extract applicants that have desirable features. What factors are used to screen applicants is a concern to ethicists and civil rights activists. A screener that favors people who have similar characteristics to those already employed at a company may perpetuate inequalities. For example, if a company that is predominantly white and male uses its employees' data to train its screener it may accidentally create a screening process that favors white, male applicants. The automation of screeners also has the potential to reduce biases. Biases against applicants with African American sounding names have been shown in multiple studies. An AI screener has the potential to limit human bias and error in the hiring process, allowing more minority applicants to be successful. === Recruitment === Recruitment involves the identification of potential applicants and the marketing of positions. AI is commonly utilized in the recruitment process because it can help boost the number of qualified applicants for positions. Companies are able to use AI to target their marketing to applicants who are likely to be good fits for a position. This often involves the use of social media sites advertising tools, which rely on AI. Facebook allows advertisers to target ads based on demographics, location, interests, behavior, and connections. Facebook also allows companies to target a "look-a-like" audience, that is the company supplies Facebook with a data set, typically the company's current employees, and Facebook will target the ad to profiles that are similar to the profiles in the data set. Additionally, job sites like Indeed, Glassdoor, and ZipRecruiter target job listings to applicants that have certain characteristics employers are looking for. Targeted advertising has many advantages for companies trying to recruit such being a more efficient use of resources, reaching a desired audience, and boosting qualified applicants. This has helped make it a mainstay in modern hiring. Who receives a targeted ad can be controversial. In hiring, the implications of targeted ads have to do with who is able to find out about and then apply to a position. Most targeted ad algorithms are proprietary information. Some platforms, like Facebook and Google, allow users to see why they were shown a specific ad, but users who do not receive the ad likely never know of its existence and also have no way of knowing why they were not shown the ad. === Interviews === Chatbots were one of the first applications of AI and are commonly used in the hiring process. Interviewees interact with chatbots to answer interview questions, and an analysis of their responses can be generated by AI. HireVue has created technology that analyzes interviewees' responses and gestures during recorded video interviews. Over 12 million interviewees have been screened by the more than 700 companies that utilize the service. == Controversies == Artificial intelligence in hiring confers many benefits, but it also has some challenges that have concerned experts. AI is only as good as the data it is using. Biases can inadvertently be baked into the data used in AI. Often companies will use data from their employees to decide what people to recruit or hire. This can perpetuate bias and lead to more homogenous workforces. Facebook Ads was an example of a platform that created such controversy for allowing business owners to specify what type of employee they are looking for. For example, job advertisements for nursing and teach could be set such that only women of a specific age group would see the advertisements. Facebook Ads has since then removed this function from its platform, citing the potential problems with the function in perpetuating biases and stereotypes against minorities. The growing use of Artificial Intelligence-enabled hiring systems has become an important component of modern talent hiring, particularly through social networks such as LinkedIn and Facebook. However, data overflow embedded in the hiring systems, based on Natural Language Processing (NLP) methods, may result in unconscious gender bias. Utilizing data driven methods may mitigate some bias generated from these systems It can also be hard to quantify what makes a good employee. This poses a challenge for training AI to predict which employees will be best. Commonly used metrics like performance reviews can be subjective and have been shown to favor white employees over black employees and men over women. Another challenge is the limited amount of available data. Employers only collect certain details about candidates during the initial stages of the hiring process. This requires AI to make determinations about candidates with very limited information to go off of. Additionally, many employers do not hire employees frequently and so have limited firm specific data to go off. To combat this, many firms will use algorithms and data from other firms in their industry. AI's reliance on applicant and current employees personal data raises privacy issues. These issues effect both the applicants and current employees, but also may have implications for third parties who are linked through social media to applicants or current employees. For example, a sweep of someone's social media will also show their friends and people they have tagged in photos or posts. == AI and the future of hiring == Artificial intelligence along with other technological advances such as improvements in robotics have placed 47% of jobs at risk of being eliminated in the near future. In 2016 the founder of the World Economic Forum, Klaus Schwab, called AI and related technology the "Fourth Industrial Revolution". According to some scholars, however, the transformative impact of AI on labor has been overstated. The "no-real-change" theory holds that an IT revolution has already occurred, but that the benefits of implementing new technologies does not outweigh the costs associated with adopting them. This theory claims that the result of the IT revolution is thus much less impactful than had originally been forecasted. Other scholars refute this theory claiming that AI has already led to significant job loss for unskilled labor and that it will eliminate middle skill and high skill jobs in the future. This position is based around the idea that AI is not yet a technology of general use and that any potential 4th industrial revolution has not fully occurred. A third theory holds that the effect of AI and other technological advances is too complicated to yet be understood. This theory is centered around the idea that while AI will likely eliminate jobs in the short term it will also likely increase the demand for other jobs. The question then becomes will the new jobs be accessible to people and will they emerge near when jobs are eliminated. == AI use in hiring for candidates == Job seekers now commonly encounter AI-driven tools at multiple stages, including automated resume parsing, video interview analysis, chatbots for frequently asked questions, and real‑time application updates. Some candidates also employ AI career agents, designed to optimize job searches, tailor applications, and interface with hiring teams. A 2025 Australian study found that AI-driven video interviews exhibited transcription error rates of up to 22% for non‑native speakers and those with speech-related disabilities, raising concerns of discrimination. A 2017 study in the Journal of Sociology found persistent gender and racial disparities in AI screening tools, even when fairness interventions are applied. Industry observers describe a growing “AI arms race” in recruitment, where both employers and candidates increasingly rely on automated agents. Employers use recruiting systems to source and filter applicants, while candidates deploy AI agents to prepare and submit applications. == Regulations == The Artifici

    Read more →
  • VueScan

    VueScan

    VueScan is a computer program for image scanning, especially of photographs, including negatives. It supports optical character recognition (OCR) of text documents. The software can be downloaded and used free of charge, but adds a watermark on scans until a license is purchased. == Purpose == VueScan is intended to work with a large number of image scanners, excluding specialised professional scanners such as drum scanners, on many computer operating systems (OS), even if drivers for the scanner are not available for the OS. These scanners are supplied with device drivers and software to operate them, included in their price. A 2014 review considered that the reasons to purchase VueScan are to allow older scanners not supported by drivers for newer operating systems to be used in more up-to-date systems and for better scanning and processing of photographs (prints; also slides and negatives when supported by scanners) than is afforded by manufacturers' software. The review did not report any advantages to VueScan's processing of documents over other software. The reviewer considered VueScan comparable to SilverFast, a similar program, with support for some specific scanners better in one or the other. Vuescan supports more scanners, with a single purchase giving access to the full range of both film and flatbed scanners, and costs less. The VueScan program can be used with its own drivers or with drivers supplied by the scanner manufacturer, if supported by the operating system. VueScan drivers can also be used without the VueScan program by application software that supports scanning directly, such as Adobe Photoshop, again enabling the use of scanners without current manufacturers' drivers. In 2019 when Apple released macOS Catalina, they removed support for running 32-bit programs, including 32-bit drivers for scanning equipment. In response, Hamrick released VueScan 9.7, effectively saving thousands of scanners from being rendered obsolete. == Overview == VueScan enables the user to modify and fine-tune the scanning parameters. The program uses its own independent method to interface with scanner hardware, and can support many older scanners under computer operating systems for which drivers are not available, allowing old scanners to be used with newer platforms that do not otherwise support them. VueScan supports an increasing number of scanners and digital cameras; 2,400 on Windows, 2,100 on Mac OS X and 1,900 on Linux in 2018. VueScan is supplied as one downloadable file for each operating system, which supports the full range of scanners. Without the purchase of a license, the program runs in fully functional demonstration mode, identical to Professional mode, except that watermarks are superimposed on saved and printed images. Purchase of a license removes the watermark. A standard license allows updates for one year; a professional license allows unlimited updates and provides some additional features. VueScan supports optical character recognition (OCR), with English included, and 32 additional language packages available on its website. In September 2011, VueScan co-developer Ed Hamrick said that he was selling US$3 million per year of VueScan licenses.

    Read more →
  • Phase stretch transform

    Phase stretch transform

    Phase stretch transform (PST) is a computational approach to signal and image processing. One of its utilities is for feature detection and classification. PST is related to time stretch dispersive Fourier transform. It transforms the image by emulating propagation through a diffractive medium with engineered 3D dispersive property (refractive index). The operation relies on symmetry of the dispersion profile and can be understood in terms of dispersive eigenfunctions or stretch modes. PST performs similar functionality as phase-contrast microscopy, but on digital images. PST can be applied to digital images and temporal (time series) data. It is a physics-based feature engineering algorithm. == Operation principle == Here the principle is described in the context of feature enhancement in digital images. The image is first filtered with a spatial kernel followed by application of a nonlinear frequency-dependent phase. The output of the transform is the phase in the spatial domain. The main step is the 2-D phase function which is typically applied in the frequency domain. The amount of phase applied to the image is frequency dependent, with higher amount of phase applied to higher frequency features of the image. Since sharp transitions, such as edges and corners, contain higher frequencies, PST emphasizes the edge information. Features can be further enhanced by applying thresholding and morphological operations. PST is a pure phase operation whereas conventional edge detection algorithms operate on amplitude. == Physical and mathematical foundations of phase stretch transform == Photonic time stretch technique can be understood by considering the propagation of an optical pulse through a dispersive fiber. By disregarding the loss and non-linearity in fiber, the non-linear Schrödinger equation governing the optical pulse propagation in fiber upon integration reduces to: E o ( z , t ) = 1 2 π ∫ − ∞ ∞ E ~ i ( 0 , ω ) ⋅ e − i β 2 z ω 2 2 ⋅ e i ω t d ω {\displaystyle E_{o}(z,t)={\frac {1}{2\pi }}\int _{-\infty }^{\infty }{\tilde {E}}_{i}(0,\omega )\cdot e^{\frac {-i\beta _{2}z\omega ^{2}}{2}}\cdot e^{i\omega {t}}\,d\omega } (1) where β 2 {\displaystyle \beta _{2}} = GVD parameter, z is propagation distance, E o ( z , t ) {\displaystyle E_{o}(z,t)} is the reshaped output pulse at distance z and time t. The response of this dispersive element in the time-stretch system can be approximated as a phase propagator as presented in H ( ω ) = e i φ ( ω ) = e i ∑ m = 0 ∞ φ m ( ω ) = ∏ m = 0 ∞ H m ( ω ) {\displaystyle H(\omega )=e^{i\varphi (\omega )}=e^{i\sum _{m=0}^{\infty }\varphi _{m}(\omega )}=\prod _{m=0}^{\infty }H_{m}(\omega )} (2) Therefore, Eq. 1 can be written as following for a pulse that propagates through the time-stretch system and is reshaped into a temporal signal with a complex envelope given by E o ( t ) = 1 2 π ∫ − ∞ ∞ E ~ i ( ω ) ⋅ H ( ω ) ⋅ e i ω t d ω {\displaystyle E_{o}(t)={\frac {1}{2\pi }}\int _{-\infty }^{\infty }{\tilde {E}}_{i}(\omega )\cdot H(\omega )\cdot e^{i\omega t}\,d\omega } (3) The time stretch operation is formulated as generalized phase and amplitude operations, S { E i ( t ) } = ∫ − ∞ + ∞ F { E i ( t ) } ⋅ e i φ ( ω ) ⋅ L ~ ( ω ) ⋅ e i ω t d ω {\displaystyle \mathbb {S} \{E_{i}(t)\}=\int _{-\infty }^{+\infty }{\mathcal {F}}\{E_{i}(t)\}\cdot e^{i\varphi (\omega )}\cdot {\tilde {L}}(\omega )\cdot e^{i\omega {t}}d\omega } (4) where e i φ ( ω ) {\displaystyle e^{i\varphi (\omega )}} is the phase filter and L ~ ( ω ) {\displaystyle {\tilde {L}}(\omega )} is the amplitude filter. Next the operator is converted to discrete domain, S { E i [ n ] } = 1 N ∑ u = 0 N − 1 F F T { E i ( n ) } ⋅ K ~ ( u ) ⋅ L ~ ( u ) ⋅ e i 2 π N u n {\displaystyle \mathbb {S} \{E_{i}[n]\}={\frac {1}{N}}\sum _{u=0}^{N-1}FFT\{E_{i}(n)\}\cdot {\tilde {K}}(u)\cdot {\tilde {L}}(u)\cdot e^{i{\frac {2\pi }{N}}un}} (5) where u {\displaystyle u} is the discrete frequency, K ~ ( u ) {\displaystyle {\tilde {K}}(u)} is the phase filter, L ~ ( u ) {\displaystyle {\tilde {L}}(u)} is the amplitude filter and FFT is fast Fourier transform. The stretch operator S { } {\displaystyle \mathbb {S} \{\}} for a digital image is then S { E i [ n , m ] } = 1 M N ∑ v = 0 N − 1 ∑ u = 0 M − 1 F F T 2 { E i ( n , m ) } ⋅ K ~ ( u , v ) ⋅ L ~ ( u , v ) ⋅ e i 2 π M u m ⋅ e i 2 π N v n {\displaystyle \mathbb {S} \{E_{i}[n,m]\}={\frac {1}{MN}}\sum _{v=0}^{N-1}\sum _{u=0}^{M-1}FFT^{2}\{E_{i}(n,m)\}\cdot {\tilde {K}}(u,v)\cdot {\tilde {L}}(u,v)\cdot e^{i{\frac {2\pi }{M}}um}\cdot e^{i{\frac {2\pi }{N}}vn}} (6) In the above equations, E i [ n , m ] {\displaystyle E_{i}[n,m]} is the input image, n {\displaystyle n} and m {\displaystyle m} are the spatial variables, F F T 2 {\displaystyle FFT^{2}} is the two-dimensional fast Fourier transform, and u {\displaystyle u} and v {\displaystyle v} are spatial frequency variables. The function K ~ ( u , v ) {\displaystyle {\tilde {K}}(u,v)} is the warped phase kernel and the function L ~ ( u , v ) {\displaystyle {\tilde {L}}(u,v)} is a localization kernel implemented in frequency domain. PST operator is defined as the phase of the Warped Stretch Transform output as follows P S T { E i [ n , m ] } ≜ ∡ { S { E i [ x , y ] } } {\displaystyle PST\{E_{i}[n,m]\}\triangleq \measuredangle \{\mathbb {S} \{E_{i}[x,y]\}\}} (7) where ∡ { } {\displaystyle \measuredangle \{\}} is the angle operator. == PST kernel implementation == The warped phase kernel K ~ ( u , v ) {\displaystyle {\tilde {K}}(u,v)} can be described by a nonlinear frequency dependent phase K ~ ( u , v ) = e i φ ( u , v ) {\displaystyle {\tilde {K}}(u,v)=e^{i\varphi (u,v)}} While arbitrary phase kernels can be considered for PST operation, here we study the phase kernels for which the kernel phase derivative is a linear or sublinear function with respect to frequency variables. A simple example for such phase derivative profiles is the inverse tangent function. Consider the phase profile in the polar coordinate system φ ( u , v ) = φ polar ( r , θ ) = φ polar ( r ) {\displaystyle \varphi (u,v)=\varphi _{\text{polar}}(r,\theta )=\varphi _{\text{polar}}(r)} From d φ ( r ) d r = tan − 1 ⁡ ( r ) {\displaystyle {\frac {d\varphi (r)}{dr}}=\tan ^{-1}(r)} we have φ ( r ) = r tan − 1 ⁡ ( r ) − 1 2 log ⁡ ( r 2 + 1 ) {\displaystyle \varphi (r)=r\tan ^{-1}(r)-{\frac {1}{2}}\log(r^{2}+1)} Therefore, the PST kernel is implemented as φ ( r ) = S ⋅ ( W r ) ⋅ tan − 1 ⁡ ( W r ) − 1 2 log ⁡ ( 1 + ( W r ) 2 ) ( W r max ) ⋅ tan − 1 ⁡ ( W r max ) − 1 2 log ⁡ ( 1 + ( W r max ) 2 ) {\displaystyle \varphi (r)=S\cdot {\frac {(Wr)\cdot \tan ^{-1}(Wr)-{\frac {1}{2}}\log(1+(Wr)^{2})}{(Wr_{\max })\cdot \tan ^{-1}(Wr_{\max })-{\frac {1}{2}}\log(1+(Wr_{\max })^{2})}}} where S {\displaystyle S} and W {\displaystyle W} are real-valued numbers related to the strength and warp of the phase profile == Applications == PST has been used for edge detection in biological and biomedical images as well as synthetic-aperture radar (SAR) image processing, as well as detail and feature enhancement for digital images. PST has also been applied to improve the point spread function for single molecule imaging in order to achieve super-resolution. The transform exhibits intrinsic superior properties compared to conventional edge detectors for feature detection in low contrast visually impaired images. The PST function can also be performed on 1-D temporal waveforms in the analog domain to reveal transitions and anomalies in real time. == Open source code release == On February 9, 2016, a UCLA Engineering research group has made public the computer code for PST algorithm that helps computers process images at high speeds and "see" them in ways that human eyes cannot. The researchers say the code could eventually be used in face, fingerprint, and iris recognition systems for high-tech security, as well as in self-driving cars' navigation systems or for inspecting industrial products. The Matlab implementation for PST can also be downloaded from Matlab Files Exchange. However, it is provided for research purposes only, and a license must be obtained for any commercial applications. The software is protected under a US patent. The code was then significantly refactored and improved to support GPU acceleration. In May 2022, it became one algorithm in PhyCV: the first physics-inspired computer vision library.

    Read more →
  • VistaCreate

    VistaCreate

    VistaCreate (formerly Crello) is an online graphic design platform for non-designers, launched in 2016. As of 2022, it has more than 10 million users in 192 countries. == Overview == VistaCreate (then known as Crello) was launched in 2016 as a part of Depositphotos. In 2019, the product hit a milestone of 1 million registered users and also launched mobile apps. In 2020, the library of templates and objects became free. A music library and a background remover tool were added to the platform. In May 2021, Moufflons Basketball, in collaboration with VistaCreate, organized a poster design competition in support of gender equality in sports. In October 2021, Vistaprint acquired Crello and its parent company, Depositphotos, for a total price of $85 million. After the acquisition, Crello was rebranded to VistaCreate. Along with Vistaprint and 99designs, it became part of the new Vista parent brand. After Russia started a full-scale war on the territory of Ukraine in February 2022, VistaCreate suspended all business in Russia and Belarus. VistaCreate's team and Depositphotos gathered collections of images and templates dedicated to the war in Ukraine.

    Read more →
  • MobileNet

    MobileNet

    MobileNet is a family of convolutional neural network (CNN) architectures designed for image classification, object detection, and other computer vision tasks. They are designed for small size, low latency, and low power consumption, making them suitable for on-device inference and edge computing on resource-constrained devices like mobile phones and embedded systems. They were originally designed to be run efficiently on mobile devices with TensorFlow Lite. The need for efficient deep learning models on mobile devices led researchers at Google to develop MobileNet. As of June 2025, the family has five versions, each improving upon the previous one in terms of performance and efficiency. == Features == === V1 === MobileNetV1 was published in April 2017. Its main architectural innovation was incorporation of depthwise separable convolutions. It was first developed by Laurent Sifre during an internship at Google Brain in 2013 as an architectural variation on AlexNet to improve convergence speed and model size. The depthwise separable convolution decomposes a single standard convolution into two convolutions: a depthwise convolution that filters each input channel independently and a pointwise convolution ( 1 × 1 {\displaystyle 1\times 1} convolution) that combines the outputs of the depthwise convolution. This factorization significantly reduces computational cost. The MobileNetV1 has two hyperparameters: a width multiplier α {\displaystyle \alpha } that controls the number of channels in each layer. Smaller values of α {\displaystyle \alpha } lead to smaller and faster models, but at the cost of reduced accuracy, and a resolution multiplier ρ {\displaystyle \rho } , which controls the input resolution of the images. Lower resolutions result in faster processing but potentially lower accuracy. === V2 === MobileNetV2 was published in March 2019. It uses inverted residual layers and linear bottlenecks. Inverted residuals modify the traditional residual block structure. Instead of compressing the input channels before the depthwise convolution, they expand them. This expansion is followed by a 1 × 1 {\displaystyle 1\times 1} depthwise convolution and then a 1 × 1 {\displaystyle 1\times 1} projection layer that reduces the number of channels back down. This inverted structure helps to maintain representational capacity by allowing the depthwise convolution to operate on a higher-dimensional feature space, thus preserving more information flow during the convolutional process. Linear bottlenecks removes the typical ReLU activation function in the projection layers. This was rationalized by arguing that that nonlinear activation loses information in lower-dimensional spaces, which is problematic when the number of channels is already small. === V3 === MobileNetV3 was published in 2019. The publication included MobileNetV3-Small, MobileNetV3-Large, and MobileNetEdgeTPU (optimized for Pixel 4). They were found by a form of neural architecture search (NAS) that takes mobile latency into account, to achieve good trade-off between accuracy and latency. It used piecewise-linear approximations of swish and sigmoid activation functions (which they called "h-swish" and "h-sigmoid"), squeeze-and-excitation modules, and the inverted bottlenecks of MobileNetV2. === V4 === MobileNetV4 was published in September 2024. The publication included a large number of architectures found by NAS. Inspired by Vision Transformers, the V4 series included multi-query attention. It also unified both inverted residual and inverted bottleneck from the V3 series with the "universal inverted bottleneck", which includes these two as special cases. === V5 === MobileNetV5's architecture was published shortly after the release of Gemma 3n in June 2025. While the announcement stated a technical report on MobileNetV5 would be available soon, this has not yet materialised. The network is 10 times larger than the largest V4 variant.

    Read more →
  • Automated dispensing cabinet

    Automated dispensing cabinet

    An automated dispensing cabinet (ADC), also called a unit-based cabinet (UBC), automated dispensing device (ADD), or automated dispensing machine (ADM)[1], is a computerized medicine cabinet for hospitals and healthcare settings. ADCs allow medications to be stored and dispensed near the point of care while controlling and tracking drug distribution. == Overview == Hospital pharmacies have provided medications for patients by filling patient-specific cassettes of unit-dose medications that were then delivered to the nursing unit and stored in medication cabinets or carts. ADCs, originally designed for hospital use, were introduced in hospitals in the 1980s and have facilitated the transition to alternative delivery models and more decentralized medication distribution systems.[2] Implementing automated dispensing cabinets as part of a decentralized or hybrid medication distribution system can improve patient safety and the accountability of the inventory, streamline certain billing processes. However, in the 2000s, the technology began to be deployed into other care settings where medication doses were stored onsite, and higher security methods were needed to control inventory, access, and dispensing of each patient dose. Settings that now deploy ADCs include long-term care facilities, hospice, critical access hospitals, surgery centers, group homes, residential care facilities, rehab and psych environments, animal health, dental clinics, and nursing education simulation. These diverse care settings share a common need to safely store, account for, and dispense individual doses of medications, especially narcotics and high-value medications, at the point of care.[3] ADCs track user access and dispensed medications, and their use can improve control over medication inventory. The real-time inventory reports generated by many cabinets can simplify the filling process and help the pharmacy track expired drugs. Furthermore, by restricting individual drugs – such as high-risk medications and controlled substances – to unique drawers within the cabinet, overall inventory management, patient safety, and medication security can be improved. Automated dispensing cabinets allow the pharmacy department to profile physician orders before they are dispensed.[4] ADCs can also enable providers to record medication charges upon dispensing, reducing the billing paperwork the pharmacy is responsible for. In addition, nurses can note returned medications using the cabinets' computers, enabling direct credits to patients' accounts. Since automated cabinets can be located on the nursing unit floor, nursing have speedier access to a patient's medications. Also, shorter waiting time ensures improved patient comfort and care.[5] == Role of automated dispensing in healthcare == Automated dispensing is a pharmacy practice in which a device dispenses medications and fills prescriptions. ADCs, which can handle many different medications, are available from a number of manufacturers such as BD, ARxIUM, and Omnicell. Though members of the pharmacy community have been utilizing automation technology since the 1980s, companies are constantly improving ADCs to meet changing needs and health standards in the industry. Several goals can be met by implementing an automated product in a healthcare facility. Patient safety can be ensured with the use of ADC technology such as barcoding. Anesthesia ADCs in operating rooms and perioperative areas may include label printing to prevent mix-ups such as errors between morphine and hydromorphone, two different opioid analgesics that frequently get confused. These systems also communicate with the pharmacy and its information management system to track medications removed and support inventory replenishment. == Key features == ADCs are like automated teller machines whose specific technologies such as barcode scanning and clinical decision support can improve medication safety. Some have metal locking drawers for added security and some have automated single-dose dispensing to prevent the need for a blind count each time a controlled substance is accessed. Over the years, ADCs have been adapted to facilitate compliance with emerging regulatory requirements such as pharmacy review of medication orders and safe practice recommendations. ADCs incorporate advanced software and electronic interfaces to synthesize high-risk steps in the medication use process. These unit-based medication repositories provide computer-controlled storage, dispensation, tracking, and documentation of medication distribution in the resident care unit. Since automated dispensing cabinets are not located in the pharmacy, they are considered "decentralized" medication distribution systems. Instead, they can be found at the point of care on the resident care unit. Tracking of the stocking and distribution process can occur by interfacing the unit with a central pharmacy computer. These cabinets can also be interfaced with other external databases such as resident profiles, the facility's admission/discharge/transfer system, and billing systems. Most ADC providers offer scalable systems since several important factors vary widely by facility such as budget, physical room size, patient population/demographics, type of healthcare facility, etc.

    Read more →
  • Projection-slice theorem

    Projection-slice theorem

    In mathematics, the projection-slice theorem, central slice theorem or Fourier slice theorem in two dimensions states that the results of the following two calculations are equal: Take a two-dimensional function f(r), project (e.g. using the Radon transform) it onto a (one-dimensional) line, and do a Fourier transform of that projection. Take that same function, but do a two-dimensional Fourier transform first, and then slice the function through its origin, parallel to the projection line. In operator terms, if F1 and F2 are the 1- and 2-dimensional Fourier transform operators mentioned above, P1 is the projection operator (which projects a 2-D function onto a 1-D line), S1 is a slice operator (which extracts a 1-D central slice from a function), then F 1 P 1 = S 1 F 2 . {\displaystyle F_{1}P_{1}=S_{1}F_{2}.} This idea can be extended to higher dimensions. This theorem is used, for example, in the analysis of medical CT scans where a "projection" is an x-ray image of an internal organ. The Fourier transforms of these images are seen to be slices through the Fourier transform of the 3-dimensional density of the internal organ, and these slices can be interpolated to build up a complete Fourier transform of that density. The inverse Fourier transform is then used to recover the 3-dimensional density of the object. This technique was first derived by Ronald N. Bracewell in 1956 for a radio-astronomy problem. == The projection-slice theorem in N dimensions == In N dimensions, the projection-slice theorem states that the Fourier transform of the projection of an N-dimensional function f(r) onto an m-dimensional linear submanifold is equal to an m-dimensional slice of the N-dimensional Fourier transform of that function consisting of an m-dimensional linear submanifold through the origin in the Fourier space which is parallel to the projection submanifold. In operator terms: F m P m = S m F N . {\displaystyle F_{m}P_{m}=S_{m}F_{N}.\,} == The generalized Fourier-slice theorem == In addition to generalizing to N dimensions, the projection-slice theorem can be further generalized with an arbitrary change of basis. For convenience of notation, we consider the change of basis to be represented as B, an N-by-N invertible matrix operating on N-dimensional column vectors. Then the generalized Fourier-slice theorem can be stated as F m P m B = S m B − T | B − T | F N {\displaystyle F_{m}P_{m}B=S_{m}{\frac {B^{-T}}{|B^{-T}|}}F_{N}} where B − T = ( B − 1 ) T {\displaystyle B^{-T}=(B^{-1})^{T}} is the transpose of the inverse of the change of basis transform. == Proof in two dimensions == The projection-slice theorem is easily proven for the case of two dimensions. Without loss of generality, we can take the projection line to be the x-axis. There is no loss of generality because if we use a shifted and rotated line, the law still applies. Using a shifted line (in y) gives the same projection and therefore the same 1D Fourier transform results. The rotated function is the Fourier pair of the rotated Fourier transform, for which the theorem again holds. If f(x, y) is a two-dimensional function, then the projection of f(x, y) onto the x axis is p(x) where p ( x ) = ∫ − ∞ ∞ f ( x , y ) d y . {\displaystyle p(x)=\int _{-\infty }^{\infty }f(x,y)\,dy.} The Fourier transform of f ( x , y ) {\displaystyle f(x,y)} is F ( k x , k y ) = ∫ − ∞ ∞ ∫ − ∞ ∞ f ( x , y ) e − 2 π i ( x k x + y k y ) d x d y . {\displaystyle F(k_{x},k_{y})=\int _{-\infty }^{\infty }\int _{-\infty }^{\infty }f(x,y)\,e^{-2\pi i(xk_{x}+yk_{y})}\,dxdy.} The slice is then s ( k x ) {\displaystyle s(k_{x})} s ( k x ) = F ( k x , 0 ) = ∫ − ∞ ∞ ∫ − ∞ ∞ f ( x , y ) e − 2 π i x k x d x d y {\displaystyle s(k_{x})=F(k_{x},0)=\int _{-\infty }^{\infty }\int _{-\infty }^{\infty }f(x,y)\,e^{-2\pi ixk_{x}}\,dxdy} = ∫ − ∞ ∞ [ ∫ − ∞ ∞ f ( x , y ) d y ] e − 2 π i x k x d x {\displaystyle =\int _{-\infty }^{\infty }\left[\int _{-\infty }^{\infty }f(x,y)\,dy\right]\,e^{-2\pi ixk_{x}}dx} = ∫ − ∞ ∞ p ( x ) e − 2 π i x k x d x {\displaystyle =\int _{-\infty }^{\infty }p(x)\,e^{-2\pi ixk_{x}}dx} which is just the Fourier transform of p(x). The proof for higher dimensions is easily generalized from the above example. == The FHA cycle == If the two-dimensional function f(r) is circularly symmetric, it may be represented as f(r), where r = |r|. In this case the projection onto any projection line will be the Abel transform of f(r). The two-dimensional Fourier transform of f(r) will be a circularly symmetric function given by the zeroth-order Hankel transform of f(r), which will therefore also represent any slice through the origin. The projection-slice theorem then states that the Fourier transform of the projection equals the slice or F 1 A 1 = H , {\displaystyle F_{1}A_{1}=H,} where A1 represents the Abel-transform operator, projecting a two-dimensional circularly symmetric function onto a one-dimensional line, F1 represents the 1-D Fourier-transform operator, and H represents the zeroth-order Hankel-transform operator. == Extension to fan beam or cone-beam CT == The projection-slice theorem is suitable for CT image reconstruction with parallel beam projections. It does not directly apply to fanbeam or conebeam CT. The theorem was extended to fan-beam and conebeam CT image reconstruction by Shuang-ren Zhao in 1995.

    Read more →
  • ARD Sounds

    ARD Sounds

    ARD Sounds (until March 2026: ARD Audiothek) is the joint audio portal of the state broadcasting stations of the ARD and Deutschlandradio on the Internet. The service was officially launched as a mobile app on November 8, 2017, on the occasion of the ARD Radio Play Days in Karlsruhe. A beta web version has also been available since November 2018; it replaces the radio features in the ARD Mediathek, which has since offered only video content. Editorial support for the ARD Audiothek is provided by the ARD, the online editorial team in Mainz. In April 2018, the ARD Audiothek won the German Digital Award in silver in the category "Mobile Apps - User Experience / Usability". Within a year, the mobile app version had been installed more than 510,000 times and had around 21 million audio views. The Android app recorded more than 100,000 downloads in October 2019, according to the Google Play Store.

    Read more →
  • Toggl Track

    Toggl Track

    Toggl Track (formerly Toggl) is a time tracking software developed by Toggl OÜ which is headquartered in Tallinn, Estonia. The company offers online time tracking and reporting services through their website along with mobile and desktop applications. Time can be tracked through a start/stop button, manual entry, or dragging and resizing time blocks in a calendar view. == History == According to Alari Aho, Toggl's CEO and founder, the application has been fully self-funded from the start. The name was created using a random name generator.

    Read more →
  • Progressive Graphics File

    Progressive Graphics File

    PGF (Progressive Graphics File) is a wavelet-based bitmapped image format that employs lossless and lossy data compression. PGF was created to improve upon and replace the JPEG format. It was developed at the same time as JPEG 2000 but with a focus on speed over compression ratio. PGF can operate at higher compression ratios without taking more encoding/decoding time and without generating the characteristic "blocky and blurry" artifacts of the original DCT-based JPEG standard. It also allows more sophisticated progressive downloads. == Color models == PGF supports a wide variety of color models: Grayscale with 1, 8, 16, or 31 bits per pixel Indexed color with palette size of 256 RGB color image with 12, 16 (red: 5 bits, green: 6 bits, blue: 5 bits), 24, or 48 bits per pixel ARGB color image with 32 bits per pixel Lab color image with 24 or 48 bits per pixel CMYK color image with 32 or 64 bits per pixel == Technical discussion == PGF claims to achieve an improved compression quality over JPEG adding or improving features such as scalability. Its compression performance is similar to the original JPEG standard. Very low and very high compression rates (including lossless compression) are also supported in PGF. The ability of the design to handle a very large range of effective bit rates is one of the strengths of PGF. For example, to reduce the number of bits for a picture below a certain amount, the advisable thing to do with the first JPEG standard is to reduce the resolution of the input image before encoding it — something that is ordinarily not necessary for that purpose when using PGF because of its wavelet scalability properties. The PGF process chain contains the following four steps: Color space transform (in case of color images) Discrete Wavelet Transform Quantization (in case of lossy data compression) Hierarchical bit-plane run-length encoding === Color components transformation === Initially, images have to be transformed from the RGB color space to another color space, leading to three components that are handled separately. PGF uses a fully reversible modified YUV color transform. The transformation matrices are: [ Y r U r V r ] = [ 1 4 1 2 1 4 1 − 1 0 0 − 1 1 ] [ R G B ] ; [ R G B ] = [ 1 3 4 − 1 4 1 − 1 4 − 1 4 1 − 1 4 3 4 ] [ Y r U r V r ] {\displaystyle {\begin{bmatrix}Y_{r}\\U_{r}\\V_{r}\end{bmatrix}}={\begin{bmatrix}{\frac {1}{4}}&{\frac {1}{2}}&{\frac {1}{4}}\\1&-1&0\\0&-1&1\end{bmatrix}}{\begin{bmatrix}R\\G\\B\end{bmatrix}};\qquad \qquad {\begin{bmatrix}R\\G\\B\end{bmatrix}}={\begin{bmatrix}1&{\frac {3}{4}}&-{\frac {1}{4}}\\1&-{\frac {1}{4}}&-{\frac {1}{4}}\\1&-{\frac {1}{4}}&{\frac {3}{4}}\end{bmatrix}}{\begin{bmatrix}Y_{r}\\U_{r}\\V_{r}\end{bmatrix}}} The chrominance components can be, but do not necessarily have to be, down-scaled in resolution. === Wavelet transform === The color components are then wavelet transformed to an arbitrary depth. In contrast to JPEG 1992 which uses an 8x8 block-size discrete cosine transform, PGF uses one reversible wavelet transform: a rounded version of the biorthogonal CDF 5/3 wavelet transform. This wavelet filter bank is exactly the same as the reversible wavelet used in JPEG 2000. It uses only integer coefficients, so the output does not require rounding (quantization) and so it does not introduce any quantization noise. === Quantization === After the wavelet transform, the coefficients are scalar-quantized to reduce the amount of bits to represent them, at the expense of a loss of quality. The output is a set of integer numbers which have to be encoded bit-by-bit. The parameter that can be changed to set the final quality is the quantization step: the greater the step, the greater is the compression and the loss of quality. With a quantization step that equals 1, no quantization is performed (it is used in lossless compression). In contrast to JPEG 2000, PGF uses only powers of two, therefore the parameter value i represents a quantization step of 2i. Just using powers of two makes no need of integer multiplication and division operations. === Coding === The result of the previous process is a collection of sub-bands which represent several approximation scales. A sub-band is a set of coefficients — integer numbers which represent aspects of the image associated with a certain frequency range as well as a spatial area of the image. The quantized sub-bands are split further into blocks, rectangular regions in the wavelet domain. They are typically selected in a way that the coefficients within them across the sub-bands form approximately spatial blocks in the (reconstructed) image domain and collected in a fixed size macroblock. The encoder has to encode the bits of all quantized coefficients of a macroblock, starting with the most significant bits and progressing to less significant bits. In this encoding process, each bit-plane of the macroblock gets encoded in two so-called coding passes, first encoding bits of significant coefficients, then refinement bits of significant coefficients. Clearly, in lossless mode all bit-planes have to be encoded, and no bit-planes can be dropped. Only significant coefficients are compressed with an adaptive run-length/Rice (RLR) coder, because they contain long runs of zeros. The RLR coder with parameter k (logarithmic length of a run of zeros) is also known as the elementary Golomb code of order 2k. === Comparison with other file formats === JPEG 2000 is slightly more space-efficient in handling natural images. The PSNR for the same compression ratio is on average 3% better than the PSNR of PGF. It has a small advantage in compression ratio but longer encoding and decoding times. PNG (Portable Network Graphics) is more space-efficient in handling images with many pixels of the same color. There are several self-proclaimed advantages of PGF over the ordinary JPEG standard: Superior compression performance: The image quality (measured in PSNR) for the same compression ratio is on average 3% better than the PSNR of JPEG. At lower bit rates (e.g. less than 0.25 bits/pixel for gray-scale images), PGF has a much more significant advantage over certain modes of JPEG: artifacts are less visible and there is almost no blocking. The compression gains over JPEG are attributed to the use of DWT. Multiple resolution representation: PGF provides seamless compression of multiple image components, with each component carrying from 1 to 31 bits per component sample. With this feature there is no need for separately stored preview images (thumbnails). Progressive transmission by resolution accuracy, commonly referred to as progressive decoding: PGF provides efficient code-stream organizations which are progressive by resolution. This way, after a smaller part of the whole file has been received, it is possible to see a lower quality of the final picture, the quality can be improved monotonically getting more data from the source. Lossless and lossy compression: PGF provides both lossless and lossy compression in a single compression architecture. Both lossy and lossless compression are provided by the use of a reversible (integer) wavelet transform. Side channel spatial information: Transparency and alpha planes are fully supported ROI extraction: Since version 5, PGF supports extraction of regions of interest (ROI) without decoding the whole image. == Available software == The author published libPGF via a SourceForge, under the GNU Lesser General Public License version 2.0. Xeraina offers a free Windows console encoder and decoder, and PGF viewers based on WIC for 32bit and 64bit Windows platforms. Other WIC applications including File Explorer are able to display PGF images after installing this viewer. Digikam is a popular open-source image editing and cataloging software that uses libPGF for its thumbnails. It makes use of the progressive decoding feature of PGF images to store a single version of each thumbnail, which can then be decoded to different resolutions without loss, thus allowing users to dynamically change the size of the thumbnails without having to recalculate them again.

    Read more →
  • Diia

    Diia

    Diia (Ukrainian: Дія [ˈd⁽ʲ⁾ijɐ] , lit. 'Action'; also an acronym for Держава і Я, Derzhava i Ya, IPA: [derˈʒɑwɐ i ˈjɑ], lit. 'State and Me') is a mobile app, a web portal and a brand of e-governance in Ukraine. Launched in 2020, the Diia app allows Ukrainian citizens to use digital documents on their smartphones instead of physical ones for identification and sharing purposes. The Diia portal allows access to over 130 government services. Eventually, the government plans to make all kinds of state-person interactions available through Diia. Diia was built in partnership with the United States and is poised to be shared with other countries. On the sidelines of the 2023 World Economic Forum in Davos, USAID Administrator Samantha Power said the US hopes to replicate the success of Diia in other countries. == History == Diia was first presented on September 27, 2019, by the Ministry of Digital Transformation of Ukraine as a brand of the State in a Smartphone project. Vice Prime Minister and Minister of Digital Transformation Mykhailo Fedorov announced the creation of a mobile app and a web portal that would unite in a single place all the services provided by the state to citizens and businesses. On February 6, 2020, the mobile app Diia was officially launched. During the presentation, Ukrainian President Volodymyr Zelensky said that 9 million Ukrainians now have access to their driver's license and car registration documents on their phones, while Prime Minister Oleksiy Honcharuk called the implementation of the State in a Smartphone project a priority for the government. In April 2020, the Ukrainian government approved a resolution for experimental usage of digital ID-cards and passports which would be issued to all Ukrainians via the Diia. On October 5, 2020, during the Diia Summit, the government presented a first major update of the app and web portal branded "Diia 2.0". More types of documents were added to the app as well as the ability to share documents with others via a single tap on a push-message. The web portal in turn expanded the number of available services to 27, including the ability to register a private limited company in half an hour. President Zelensky who opened the summit, announced that in 2021 Ukraine will enter the "paper less" mode by prohibiting civil servants from demanding paper documents. By the end of 2020, the app had more than six million users, while the portal had 50 available services. In March 2021, the Ukrainian parliament adopted a bill equating digital identity documents with their physical analogues. Starting on August 23, Ukrainian citizens can use digital ID-cards and passports for all purposes while in Ukraine. According to Minister of Digital Transformation Mykhailo Fedorov, Ukraine will become the first country in the world where digital identity documents are considered legally equivalent to ordinary ones. In September 2024, Diia launched an online marriage registration service, which can be beneficial especially for military personnel who spend much time on the frontline separated from their partners. In October 2024, Diia's online marriage service appeared in Time's Inventions of the 2024 list. In the first month of its operations over 1.1 million Ukrainians tried to make proposals using the technology, and 435 couples got married. == Benefits and challenges == The first and most obvious benefit is the convenience of such a platform. Citizens can have many documents on their smartphones at once, without concern about losing or damaging them. Whenever needed, they can just open an app on their smartphones and show/check the document they need. The idea is that Diia will help cut the bureaucracy associated with public services, which in turn will help fight corruption and increase government savings. Fewer people are needed to be employed in the public sector and fewer human to human interactions are supposed to happen. With the start of the program, already 10% of government employees were reduced, which contributes to hundreds of millions of dollars in savings, but besides this, the initiative also improves the speed, efficiency, and transparency of government services. In addition, the digitalization of the government sector helps to develop the whole IT industry in the country, people become more digitally aware and educated, this affects other sectors as well, increasing the spread of digital infrastructure and expediting the speed of overall digitalization. The UN E-government Development Index, which assesses the capabilities of governments to integrate its functions electronically, such as the use of internet and mobile devices, ranked Ukraine 69th in 193 countries surveyed in 2020. Despite its low ranking in the e-government development index, Ukraine made a big jump on the e-participation index, which they ranked 43rd out of 193 countries from 0.66 in 2018 to 0.81 in 2020 (un.org, 2020), suggesting that the government and its citizens are adapting the IT-based government functions. The main goal of e-government according to Perez-Morote et.al. (2020) is to have accountability and transparency among the countries involved. But to do so, there are several challenges that a country should assess first prior to implementing e-government. In the research written by Heeks (2001), the author identified 2 main challenges that countries face in the development of e-government, first is the strategic challenge which involves the preparedness (e-readiness) of the entire government system for electronic transformation, and second challenge is the tactical challenge where the government must design (e-governance design) a system where it can be understood by every user, it's important that the information that needs to be communicated to the consumers is received clearly. For the first challenge (e-readiness), Ukraine had an internet penetration rate of 76% in 2020 and is expected to grow to 82%, it is important that consumers have the internet access for it to enable the consumers to utilize the service. Another factor is the readiness of its institutional infrastructure, which means that the government has its own organization which is solely focused on implementing the e-government project. In the case of Ukraine, the e-governance team is led by Oleksandr Ryzhenko, and the country's e-governance initiative is even further strengthened by ensuring that the data and legal infrastructure are already prepared. Ukraine has done this by modernizing their legislation that is more appropriate in the digital service, and the data exchange solution used by Ukraine is called Trembita. The human infrastructure is also being updated, as competent individuals must be the one doing the task, hence, EGOV4UKRAINE was launched, this aims to get IT developers for developing a system for administrative services. These efforts by the Ukrainian government did not go unnoticed, and they received an award from the e-Governance Academy as "partner of the year 2017". For the second challenge, which deals with the system design, the success of Ukraine can be seen on the latest data of UNDP, where it shows a high increase in the E-participation index. In 2018, Ukraine ranked 75th it ranked 46th in 2020 (un.org, 2020). Despite visible success, the implementation of the e-government was accompanied by problems. Data leakage became the main one. In May 2020, the data of 26 million driver's licenses appeared in the public domain on the Internet. The Ukrainian government said the Diia app was not linked to a data breach, but it is impossible to say for certain. Any storage of official documents in electronic format is associated with the risk of their leakage. In addition, the Diia application still has data protection issues, as the required protection system has not been implemented. This is also compounded by the country's weak data protection legal regime. In addition, since 2023, Ukrainians are able to register their cars with this app. Issued license plates are not using regional codes, but they are using special codes starting with DI or PD. == Diia City == In May 2020, the government presented Diia City headed by Oleksandr Borniakov, a large-scale project which would establish a virtual model of a free economic zone for representatives of the creative economy. It would provide for special digital residency with a particular taxation regime, intellectual property protection and simplified regulations. Diia City concurrently imposes certain constraints on contracts involving individual entrepreneurs (FOPs). It also offers the benefit of tax rebates. Diia City garners endorsement from the Ukrainian government, believing it will support the country's position in the IT market. As of July 30, 2023, the program had more than 600 residents, including companies like iGama, Avenga, SBRobotiks, and Intellectsoft.

    Read more →