AI Video Tools

Explore the best AI Video Tools — independent reviews, comparisons, pricing and step-by-step how-to guides, curated by Aizhi.

  • Empirical risk minimization

    In statistical learning theory, the principle of empirical risk minimization defines a family of learning algorithms based on evaluating performance over a known and fixed dataset. The core idea is an application of the law of large numbers: we cannot know exactly how well a predictive algorithm will work in practice (the "true risk") because we do not know the true distribution of the data, but we can estimate and optimize the performance of the algorithm on a known set of training data. The performance over the known set of training data is referred to as the "empirical risk".

    == Background ==

    The following situation is a general setting of many supervised learning problems. There are two spaces of objects $X$ and $Y$, and we would like to learn a function $h : X \to Y$ (often called a hypothesis) which outputs an object $y \in Y$ given $x \in X$. To do so, we have a training set of $n$ examples $(x_1, y_1), \ldots, (x_n, y_n)$, where $x_i \in X$ is an input and $y_i \in Y$ is the corresponding response desired from $h(x_i)$. More formally, we assume that there is a joint probability distribution $P(x, y)$ over $X$ and $Y$, and that the training set consists of $n$ instances $(x_1, y_1), \ldots, (x_n, y_n)$ drawn i.i.d. from $P(x, y)$. The assumption of a joint probability distribution allows for the modelling of uncertainty in predictions (e.g.
from noise in data) because $y$ is not a deterministic function of $x$, but rather a random variable with conditional distribution $P(y \mid x)$ for a fixed $x$. It is also assumed that there is a non-negative real-valued loss function $L(\hat{y}, y)$ which measures how different the prediction $\hat{y}$ of a hypothesis is from the true outcome $y$. For classification tasks, these loss functions can be scoring rules. The risk associated with hypothesis $h(x)$ is then defined as the expectation of the loss function:

    $$R(h) = \mathbf{E}[L(h(x), y)] = \int L(h(x), y)\, dP(x, y).$$

    A loss function commonly used in theory is the 0-1 loss function:

    $$L(\hat{y}, y) = \begin{cases} 1 & \text{if } \hat{y} \neq y \\ 0 & \text{if } \hat{y} = y \end{cases}$$

    The ultimate goal of a learning algorithm is to find a hypothesis $h^{*}$ among a fixed class of functions $\mathcal{H}$ for which the risk $R(h)$ is minimal:

    $$h^{*} = \underset{h \in \mathcal{H}}{\operatorname{arg\,min}}\, R(h).$$

    For classification problems, the Bayes classifier is defined to be the classifier minimizing the risk defined with the 0-1 loss function.

    == Formal definition ==

    In general, the risk $R(h)$ cannot be computed because the distribution $P(x, y)$ is unknown to the learning algorithm.
However, given a sample of i.i.d. training data points, we can compute an estimate, called the empirical risk, by averaging the loss function over the training set; more formally, by computing the expectation with respect to the empirical measure:

    $$R_{\text{emp}}(h) = \frac{1}{n} \sum_{i=1}^{n} L(h(x_i), y_i).$$

    The empirical risk minimization principle states that the learning algorithm should choose a hypothesis $\hat{h}$ which minimizes the empirical risk over the hypothesis class $\mathcal{H}$:

    $$\hat{h} = \underset{h \in \mathcal{H}}{\operatorname{arg\,min}}\, R_{\text{emp}}(h).$$

    Thus, the learning algorithm defined by the empirical risk minimization principle consists of solving the above optimization problem.

    == Properties ==

    Guarantees for the performance of empirical risk minimization depend strongly on the function class selected as well as the distributional assumptions made. In general, distribution-free methods are too coarse, and do not lead to practical bounds. However, they are still useful in deriving asymptotic properties of learning algorithms, such as consistency. In particular, distribution-free bounds on the performance of empirical risk minimization given a fixed function class can be derived using bounds on the VC complexity of the function class. For simplicity, consider the case of binary classification tasks, where it is possible to bound the probability of the selected classifier $\phi_n$ being much worse than the best possible classifier $\phi^{*}$. Consider the risk $L$ defined over the hypothesis class $\mathcal{C}$ with growth function $\mathcal{S}(\mathcal{C}, n)$ given a dataset of size $n$.
Then, for every $\epsilon > 0$:

    $$\mathbb{P}\left(L(\phi_n) - L(\phi^{*}) > \epsilon\right) \leq 8\,\mathcal{S}(\mathcal{C}, n) \exp\{-n\epsilon^{2}/32\}$$

    Similar results hold for regression tasks. These results are often based on uniform laws of large numbers, which control the deviation of the empirical risk from the true risk, uniformly over the hypothesis class.

    === Impossibility results ===

    It is also possible to show lower bounds on algorithm performance if no distributional assumptions are made. This is sometimes referred to as the No free lunch theorem. Even though a specific learning algorithm may provide the asymptotically optimal performance for any distribution, the finite sample performance is always poor for at least one data distribution. This means that no classifier can improve on the error for a given sample size for all distributions. Specifically, let $\epsilon > 0$ and consider a sample size $n$ and classification rule $\phi_n$. Then there exists a distribution of $(X, Y)$ with risk $L^{*} = 0$ (meaning that perfect prediction is possible) such that:

    $$\mathbb{E} L_n \geq 1/2 - \epsilon.$$

    It is further possible to show that the convergence rate of a learning algorithm is poor for some distributions. Specifically, given a sequence of decreasing positive numbers $a_n$ converging to zero, it is possible to find a distribution such that:

    $$\mathbb{E} L_n \geq a_n \quad \text{for all } n.$$

    This result shows that universally good classification rules do not exist, in the sense that the rule must be low quality for at least one distribution.
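
To get a feel for how coarse the distribution-free bound in the Properties section is, the short Python sketch below evaluates its right-hand side, 8 S(C, n) exp(-n ε²/32), numerically. The hypothesis class is an assumption made purely for illustration: one-sided threshold classifiers on the real line, whose growth function is n + 1. The function names are hypothetical and not part of any library.

```python
import math


def vc_bound(growth: float, n: int, eps: float) -> float:
    """Right-hand side of the bound: 8 * S(C, n) * exp(-n * eps^2 / 32)."""
    return 8.0 * growth * math.exp(-n * eps ** 2 / 32.0)


def threshold_growth(n: int) -> int:
    """Assumed class: thresholds h_t(x) = 1[x >= t] realize n + 1 labelings of n points."""
    return n + 1


def samples_needed(eps: float, delta: float) -> int:
    """Smallest n for which the bound drops to delta or below (plain linear scan)."""
    n = 1
    while vc_bound(threshold_growth(n), n, eps) > delta:
        n += 1
    return n


if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 100_000):
        print(n, vc_bound(threshold_growth(n), n, eps=0.1))
    print("n needed for eps=0.1, delta=0.05:", samples_needed(0.1, 0.05))
```

For small n the computed value exceeds 1, so the bound says nothing about a probability; it becomes informative only once the exponential term dominates the growth function, which matches the remark that such bounds are mainly useful for asymptotic statements.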

    === Computational complexity ===

    Empirical risk minimization for a classification problem with a 0-1 loss function is known to be an NP-hard problem even for a relatively simple class of functions such as linear classifiers. Nevertheless, it can be solved efficiently when the minimal empirical risk is zero, i.e., when the data is linearly separable. In practice, machine learning algorithms cope with this issue either by employing a convex approximation to the 0-1 loss function (like hinge loss for SVM), which is easier to optimize, or by imposing assumptions on the distribution $P(x, y)$ (and thus stop being agnostic learning algorithms to which the above result applies). In the case of convexification, Zhang's lemma bounds the excess risk of the original problem in terms of the excess risk of the convexified problem. Minimizing the latter using convex optimization therefore also controls the former.

    == Tilted empirical risk minimization ==

    Tilted empirical risk minimization is a machine learning technique used to modify standard loss functions, such as squared error, by introducing a tilt parameter. This parameter dynamically adjusts the weight of data points during training, allowing the algorithm to focus on specific regions or characteristics of the data distribution. Tilted empirical risk minimization is particularly useful in scenarios with imbalanced data or when there is a need to emphasize errors in certain parts of the prediction space.
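
The definitions above can be made concrete with a small Python sketch. It computes the empirical risk under the 0-1 loss and picks the minimizer by exhaustive search over a tiny finite hypothesis class, which sidesteps the hardness of the general problem. The threshold class, the toy dataset (with one deliberately noisy label) and all function names are assumptions for illustration only. The `tilted_risk` function uses one commonly cited form of the tilted objective, (1/t) log((1/n) Σ exp(t · loss_i)); the text does not spell out a formula, so treat that form as an assumption as well.

```python
import math
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, int]
Hypothesis = Callable[[float], int]


def zero_one_loss(y_hat: int, y: int) -> float:
    """0-1 loss: 1 if the prediction is wrong, 0 otherwise."""
    return 0.0 if y_hat == y else 1.0


def empirical_risk(h: Hypothesis, data: Sequence[Example]) -> float:
    """Average loss of h over the training set (R_emp)."""
    return sum(zero_one_loss(h(x), y) for x, y in data) / len(data)


def threshold_class(thresholds: Sequence[float]) -> List[Tuple[float, Hypothesis]]:
    """Hypothetical finite class: h_t(x) = 1 if x >= t else 0."""
    return [(t, (lambda x, t=t: 1 if x >= t else 0)) for t in thresholds]


def erm(hypotheses: Sequence[Tuple[float, Hypothesis]],
        data: Sequence[Example]) -> Tuple[float, Hypothesis]:
    """Return the (threshold, hypothesis) pair with the smallest empirical risk."""
    return min(hypotheses, key=lambda th: empirical_risk(th[1], data))


def tilted_risk(losses: Sequence[float], t: float) -> float:
    """Assumed tilted objective (1/t) * log(mean(exp(t * loss))), computed stably."""
    if t == 0.0:
        return sum(losses) / len(losses)  # the limit t -> 0 is the plain average
    m = max(t * l for l in losses)
    mean_exp = sum(math.exp(t * l - m) for l in losses) / len(losses)
    return (m + math.log(mean_exp)) / t


# Toy data: the label at x = 0.9 is noisy, so no threshold reaches zero risk.
DATA: List[Example] = [(0.1, 0), (0.2, 0), (0.4, 1), (0.6, 1), (0.8, 1), (0.9, 0)]

if __name__ == "__main__":
    best_t, best_h = erm(threshold_class([0.0, 0.3, 0.5, 0.7]), DATA)
    losses = [zero_one_loss(best_h(x), y) for x, y in DATA]
    print("chosen threshold:", best_t, "empirical risk:", empirical_risk(best_h, DATA))
    for t in (-5.0, 0.0, 5.0):
        print("tilt", t, "->", tilted_risk(losses, t))
```

In this form a positive tilt pushes the objective towards the largest individual loss and a negative tilt towards the smallest, which is one way to read the statement that the tilt parameter reweights data points.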

  • Marti Hearst

    Marti Alice Hearst is a professor in the School of Information at the University of California, Berkeley. She did early work in corpus-based computational linguistics, including some of the first work in automating sentiment analysis, and word sense disambiguation. She invented an algorithm that became known as "Hearst patterns" which applies lexico-syntactic patterns to recognize hyponymy (ISA) relations with high accuracy in large text collections, including an early application of it to WordNet; this algorithm is widely used in commercial text mining applications including ontology learning. Hearst also developed early work in automatic segmentation of text into topical discourse boundaries, inventing a now well-known approach called TextTiling. Hearst's research is on user interfaces for search engine technology and big data analytics. She did early work in user interfaces and information visualization for search user interfaces, inventing the TileBars query term visualization. Her Flamenco research project investigated and developed the now widely used faceted navigation approach for searching and browsing web sites and information collections. She wrote the first academic book on the topic of Search User Interfaces (Cambridge University Press, 2009). Hearst is an Edge Foundation contributing author and a member of the Usage panel of the American Heritage Dictionary of the English Language. Hearst received her B.A., M.S., and Ph.D. in computer science, all from Berkeley. In 2013 she became a fellow of the Association for Computing Machinery. She became a member of the CHI Academy in 2017, and has previously served as president of the Association for Computational Linguistics and on the advisory council of NSF's CISE Directorate. 
Additionally, she has been a member of the Web Board for CACM, the Usage Panel for the American Heritage Dictionary, the Edge.org panel of experts, the research staff at Xerox PARC, and the boards of ACM Transactions on the Web, Computational Linguistics, ACM Transactions on Information Systems, and IEEE Intelligent Systems. Hearst has received an NSF CAREER award, an IBM Faculty Award, and an Okawa Foundation Fellowship. Her work on user interfaces has also earned her two Google Research Awards, and she has received four Excellence in Teaching Awards. She has led projects worth over $3.5M in research grants. Hearst’s publications date back to 1990, when ‘A Hybrid Approach to Restricted Text Interpretation’ was published in Stanford University’s AAAI Spring Symposium on Text Based Intelligent Systems in March of that year.

  • Nathalie Japkowicz

    Nathalie Japkowicz is a Canadian computer scientist specializing in machine learning. She is a professor and department chair of computer science at the American University College of Arts and Sciences.

    == Life ==

    Nathalie Japkowicz completed a B.Sc. at McGill University in 1988. She earned an M.Sc. from the University of Toronto in 1990. She completed a Ph.D. at Rutgers University in 1999. Her dissertation was titled Concept-learning in the absence of counter-examples: an autoassociation-based approach to classification. Stephen José Hanson and Casimir Alexander Kulikowski were her doctoral advisors. Japkowicz worked at the University of Ottawa in the school of electrical engineering and computer science. She was the lead of its laboratory for research on machine learning for defense security. From 2003 to 2005, Japkowicz was the secretary of the Canadian Artificial Intelligence Association (CAIAC). She was CAIAC vice president from 2009 to 2014, president from 2013 to 2015, and past-president from 2015 to 2017. Japkowicz is a professor and department chair of computer science at the American University College of Arts and Sciences. She researches artificial intelligence, machine learning, data mining, and big data analysis.

    == Selected works ==

    Gao, Yong; Japkowicz, Nathalie, eds. (2009). Advances in Artificial Intelligence: 22nd Canadian Conference on Artificial Intelligence, Canadian AI 2009 Kelowna, Canada, May 25–27, 2009 Proceedings. Lecture Notes in Computer Science. Vol. 5549. Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-642-01818-3. ISBN 978-3-642-01817-6. S2CID 27083226.

    Japkowicz, Nathalie; Shah, Mohak (2011). Evaluating Learning Algorithms: A Classification Perspective (1 ed.). Cambridge University Press. doi:10.1017/cbo9780511921803. ISBN 978-0-511-92180-3.

    Japkowicz, Nathalie; Matwin, Stan, eds. (2015). Discovery Science: 18th International Conference, DS 2015, Banff, AB, Canada, October 4–6, 2015. Proceedings. Lecture Notes in Computer Science. Vol. 9356. Cham: Springer International Publishing. doi:10.1007/978-3-319-24282-8. ISBN 978-3-319-24281-1. S2CID 1302223.

    Japkowicz, Nathalie; Stefanowski, Jerzy, eds. (2016). Big Data Analysis: New Algorithms for a New Society. Studies in Big Data. Vol. 16. Cham: Springer International Publishing. doi:10.1007/978-3-319-26989-4. ISBN 978-3-319-26987-0.

    Ceci, Michelangelo; Japkowicz, Nathalie; Liu, Jiming; Papadopoulos, George A.; Raś, Zbigniew W., eds. (2018). Foundations of Intelligent Systems: 24th International Symposium, ISMIS 2018, Limassol, Cyprus, October 29–31, 2018, Proceedings. Lecture Notes in Computer Science. Vol. 11177. Cham: Springer International Publishing. doi:10.1007/978-3-030-01851-1. ISBN 978-3-030-01850-4. S2CID 53038780.

  • The Best Free AI Video Generator for Beginners

    Trying to pick the best AI video generator? An AI video generator is software that uses machine learning to create or edit video, and it can scale from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI video generator should slot into your workflow and pay for itself quickly. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

  • Lexxe

    Lexxe is an internet search engine that applies natural language processing in its semantic search technology. Founded in 2005 by Dr. Hong Liang Qiao, Lexxe is based in Sydney, Australia. Today, Lexxe's key focus is on sentiment search, following the launch of a news sentiment search site, News & Moods (www.newsandmoods.com). Lexxe's focus in search technology has shifted through several stages. Lexxe launched its Alpha version in 2005, featuring natural language question answering: users could ask the search engine questions in English in addition to running keyword searches (this feature has been suspended for redevelopment since 2010). It used only algorithms to extract answers from web pages, with no question-answer pair databases prepared in advance. In 2011, Lexxe launched a beta version with a new search technology called Semantic Key. Semantic Keys enable users to query with a conceptual keyword (a keyword with a special meaning, hence the term Semantic Key) in order to find instances under the concept, e.g. price → $5.95 or €200, color → red, yellow, white. Example queries are “price: a pound of apples” and “color: ferrari”. With an initial 500 Semantic Keys at the beta launch, Lexxe became the first search engine in the world to offer this search technology. However, the cost of building Semantic Keys proved too high. In 2017, Lexxe launched News & Moods (www.newsandmoods.com), an open platform for news sentiment search and a first step towards a sentiment search feature for Internet-wide search in the Lexxe search engine. News & Moods also comes with smartphone apps for Android and iOS.

  • Kaiming He

    Kaiming He (Chinese: 何恺明; pinyin: Hé Kǎimíng) is a Chinese computer scientist who primarily researches computer vision and deep learning. He is an associate professor at Massachusetts Institute of Technology and works part-time as a Distinguished Scientist at Google DeepMind. He is known as one of the creators of the residual neural network (ResNet) architecture.

    == Early life and education ==

    He attended the public Guangzhou Zhixin High School in Guangzhou, Guangdong, China. He placed first by total score in the 2003 Guangdong provincial undergraduate admissions exam. He went to Tsinghua University for undergraduate education and received a Bachelor of Science degree in 2007. From 2007 to 2011, he pursued doctoral studies in information engineering at the Chinese University of Hong Kong at its Multimedia Laboratory, receiving a PhD degree in 2011. His doctoral dissertation was titled Single image haze removal using dark channel prior (2011), and his doctoral adviser was Tang Xiao'ou.

    == Career ==

    He worked at Microsoft Research Asia from 2011 to 2016 and at Facebook Artificial Intelligence Research from 2016 to 2024. In 2024, he became an associate professor at the Department of Electrical Engineering and Computer Science of the Massachusetts Institute of Technology. His 2016 paper Deep Residual Learning for Image Recognition was the most cited research paper over a five-year period according to Google Scholar's reports in 2020 and 2021.

    == Awards and recognitions ==

    He won ICCV's best paper award (Marr Prize) in 2017 and CVPR's best paper award in 2009 and 2016. He was awarded the 2023 Future Science Prize along with 3 collaborators for "fundamental contribution to artificial intelligence by introducing deep residual learning".

  • Android Auto

    Android Auto is a mobile app developed by Google to mirror features of a smartphone (or other Android device) on a car's dashboard information and entertainment head unit. Once an Android device is paired with the car's head unit, the system can mirror some apps on the vehicle's display. Supported apps include GPS mapping and navigation, music playback, SMS, telephone, and Web search. The system supports both touchscreen and button-controlled head units. Hands-free operation through voice commands is available and recommended to reduce driver distraction. Android Auto is part of the Open Automotive Alliance, a joint effort of 28 automobile manufacturers, with Nvidia as tech supplier, available in 36 countries. == History == Android Auto was revealed at Google I/O 2014. The app was released to the public on March 19, 2015. In November 2016, Google implemented an app that would run the Android Auto UI on the mobile device. In July 2019, Android Auto received its first major UI rework, which among other changes, brought an app drawer to Android Auto for the first time. Google also announced that the app's ability to be used on a phone would be discontinued in favor of Google Assistant's drive mode. In December 2020, Google announced the expansion of Android Auto to 36 additional countries in Europe, Indonesia, and more. In April 2021, Android Auto launched in Belgium, Denmark, Netherlands, Norway, Portugal, and Sweden. Google announced in May 2022 a user interface redesign for Android Auto, codenamed CoolWalk, which aims to simplify the app's usage, and make it more adaptable to screens of different orientations and aspect ratios. The redesign incorporates a new split-screen layout, where Google Maps can be displayed alongside a music player. CoolWalk was originally slated to launch in Q3 2022. 
In June 2022, Android Auto no longer ran directly on a mobile device; the app permitting this was decommissioned, in favor of a Driving Mode built into the Google Assistant app for a similar purpose. In November 2022, the CoolWalk user interface was released in Android Auto's beta program. == Functionality == Android Auto is software that can be utilized from an Android mobile device, acting as a vehicle's dashboard head unit. Once the user's Android device is connected to the vehicle, the head unit will serve as an external display for the Android device, presenting supported software in a car-specific user interface provided by the Android Auto app. In Android Auto's first iterations, the device was required to be connected via USB to the car. For some time, starting in November 2016, Google added the option to run Android Auto as a regular app on an Android device, allowing users to choose whether to use Android Auto on a personal phone or tablet, rather than on a compatible automotive head unit. This app was decommissioned in June 2022 in favor of a Driving Mode built into the Google Assistant app. At CES 2018, Google confirmed that the Google Assistant would be coming to Android Auto later in the year. An Android Auto SDK has been released, allowing third parties to modify their apps to work with Android Auto; initially, only APIs for music and messaging apps were available. == Head unit support == In May 2015, Hyundai became the first manufacturer to offer Android Auto support, making it first available in the 2015 Hyundai Sonata. 
Automobile manufacturers that will offer Android Auto support in their cars include Abarth, Acura, Alfa Romeo, Aston Martin, Audi, Bentley, Buick, BMW, BYD, Cadillac, Chevrolet, Chrysler, Citroën, Dodge, Ferrari, Fiat, Ford, GMC, Genesis, Holden, Honda, Hyundai, Infiniti, Jaguar Land Rover, Jeep, Kia, Lamborghini, Lexus, Lincoln, Mahindra and Mahindra, Maserati, Maybach, Mazda, Mercedes-Benz, Mitsubishi, Nissan, Opel, Peugeot, Porsche, RAM, Renault, SEAT, Škoda, SsangYong, Subaru, Suzuki, Tata Motors Cars, Toyota, Volkswagen and Volvo. Additionally, aftermarket car-audio systems supporting Android Auto add the technology into host vehicles, including Pioneer, Kenwood, Panasonic, and Sony. == Criticism == In May 2019, Italy filed an antitrust complaint targeting Android Auto, citing a Google policy of allowing third-parties to only offer media and messaging apps on the platform, preventing Enel from offering an app for locating vehicle charging stations. Google announced a new SDK, to be released to select partners in August 2020 and made generally available by the end of the year. == Availability == As of December 2025, Android Auto is available in 46 countries:

  • OCR-A

    OCR-A is a font issued in 1966 and first implemented in 1968. In the early days of computer optical character recognition, there was a need for a font that could be recognized not only by the computers of that day, but also by humans. OCR-A uses simple, thick strokes to form recognizable characters. The font is monospaced (fixed-width), with the printer required to place glyphs 0.254 cm (0.10 inch) apart, and the reader required to accept any spacing between 0.2286 cm (0.09 inch) and 0.4572 cm (0.18 inch).

    == Standardization ==

    The OCR-A font was standardized by the American National Standards Institute (ANSI) as ANSI X3.17-1981. X3.4 has since become the INCITS and the OCR-A standard is now called ISO 1073-1:1976.

    == Implementations ==

    In 1968, American Type Founders produced OCR-A, one of the first optical character recognition typefaces to meet the criteria set by the U.S. Bureau of Standards. The design is simple so that it can be easily read by a machine, but it is more difficult for the human eye to read. As metal type gave way to computer-based typesetting, Tor Lillqvist used Metafont to describe the OCR-A font. That definition was subsequently improved by Richard B. Wales. Their work is available from CTAN. To make the free version of the font more accessible to users of Microsoft Windows, John Sauter converted the Metafont definitions to TrueType using potrace and FontForge in 2004. In 2007, Gürkan Sengün created a Debian package from this implementation. In 2008, Luc Devroye corrected the vertical positioning in John Sauter's implementation, and fixed the name of lower case z. Independently, Matthew Skala used mftrace to convert the Metafont definitions to TrueType format in 2006. In 2011 he released a new version created by rewriting the Metafont definitions to work with METATYPE1, generating outlines directly without an intermediate tracing step. On September 27, 2012, he updated his implementation to version 0.2.
In addition to these free implementations of OCR-A, there are also implementations sold by several vendors. As a joke, Tobias Frere-Jones in 1995 created Estupido-Espezial, a redesign with swashes and a long s. It was used in a "technology"-themed section of Rolling Stone. Maxitype designed the OCR-X typeface, which is based on the OCR-A typeface, has OpenType features and alien/technology-themed dingbats, and is available in six weights (Thin, Light, Regular, Medium, Bold, Black). Japanese typeface foundry Visual Design Laboratory (VDL) designed two typefaces based on the OCR-A typeface: one for Simplified Chinese characters named Jieyouti and one for Japanese characters named Yota G (ヨタG), both available in five weights (Light, Regular, Medium, Semi Bold, Bold).

    == Use ==

    Although optical character recognition technology has advanced to the point where such simple fonts are no longer necessary, the OCR-A font has remained in use. Its usage remains widespread in the encoding of checks around the world. Some lock box companies still insist that the account number and amount owed on a bill return form be printed in OCR-A. Also, because of its unusual look, it is sometimes used in advertising and display graphics. Notably, it is used for the subtitles in films and television series such as Blacklist and for the main titles in The Pretender. Additionally, OCR-A is used in the titles and subtitles for the films 13 Hours: The Secret Soldiers of Benghazi and Hoppers. It was also used for the logo, branding, and marketing material of the children's toy line Hexbug.

    == Code points ==

    A font is a set of character shapes, or glyphs. For a computer to use a font, each glyph must be assigned a code point in a character set. When OCR-A was being standardized, the usual character coding was the American Standard Code for Information Interchange, or ASCII.
Not all of the glyphs of OCR-A fit into ASCII, and for five of the characters there were alternate glyphs, which might have suggested the need for a second font. However, for convenience and efficiency all of the glyphs were expected to be accessible in a single font using ASCII coding, with the additional characters placed at coding points that would otherwise have been unused. The modern descendant of ASCII is Unicode, also known as ISO 10646. Unicode contains ASCII and has special provisions for OCR characters, so some implementations of OCR-A have looked to Unicode for guidance on character code assignments.

    === Pre-Unicode standard representation ===

    The ISO standard ISO 2033:1983, and the corresponding Japanese Industrial Standard JIS X 9010:1984 (originally JIS C 6229–1984), define character encodings for OCR-A, OCR-B and E-13B. For OCR-A, they define a modified 7-bit ASCII set (also known by its ISO-IR number ISO-IR-91) including only uppercase letters, digits, a subset of the punctuation and symbols, and some additional symbols. Codes which are redefined relative to ASCII, as opposed to simply omitted, are listed below: Additionally, the long vertical mark is encoded at 0x7C, corresponding to the ASCII vertical bar (|).

    === Dedicated OCR-A characters in Unicode ===

    The following characters have been defined for control purposes and are now in the "Optical Character Recognition" Unicode range 2440–245F:

    === Space, digits, and unaccented letters ===

    All implementations of OCR-A use U+0020 for space, U+0030 through U+0039 for the decimal digits, U+0041 through U+005A for the unaccented upper case letters, and U+0061 through U+007A for the unaccented lower case letters.

    === Regular characters ===

    In addition to the digits and unaccented letters, many of the characters of OCR-A have obvious code points in ASCII. Of those that do not, most, including all of OCR-A's accented letters, have obvious code points in Unicode.

    === Remaining characters ===

    Linotype coded the remaining characters of OCR-A as follows:

    === Additional characters ===

    The fonts that descend from the work of Tor Lillqvist and Richard B. Wales define four characters not in OCR-A to fill out the ASCII character set. These shapes use the same style as the OCR-A character shapes. They are: Linotype also defines additional characters.

    === Exceptions ===

    Some implementations do not use the above code point assignments for some characters.

    ==== PrecisionID ====

    The PrecisionID implementation of OCR-A has the following non-standard code points:
      OCR Hook at U+007E
      OCR Chair at U+00C1
      OCR Fork at U+00C2
      Euro Sign at U+0080

    ==== Barcodesoft ====

    The Barcodesoft implementation of OCR-A has the following non-standard code points:
      OCR Hook at U+0060
      OCR Chair at U+007E
      OCR Fork at U+005F
      Long Vertical Mark at U+007C (agrees with Linotype)
      Character Erase at U+0008

    ==== Morovia ====

    The Morovia implementation of OCR-A has the following non-standard code points:
      OCR Hook at U+007E (agrees with PrecisionID)
      OCR Chair at U+00F0
      OCR Fork at U+005F (agrees with Barcodesoft)
      Long Vertical Mark at U+007C (agrees with Linotype)

    ==== IDAutomation ====

    The IDAutomation implementation of OCR-A has the following non-standard code points:
      OCR Hook at U+007E (agrees with PrecisionID)
      OCR Chair at U+00C1 (agrees with PrecisionID)
      OCR Fork at U+00C2 (agrees with PrecisionID)
      OCR Belt Buckle at U+00C3

    == Sellers of font standards ==

    Hardcopy of ISO 1073-1:1976, distributed through ANSI, from Amazon.com
    ISO 1073-1 is also available from Techstreet, who distributes standards for ANSI and ISO
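
Because the vendor fonts listed under Exceptions place the same glyphs at different code points, text prepared for one font will not display correctly in another. The Python sketch below shows one possible way to normalize such text to the dedicated Unicode OCR characters (U+2440 OCR HOOK, U+2441 OCR CHAIR, U+2442 OCR FORK, U+2444 OCR BELT BUCKLE). The table is transcribed from the lists above; the `normalize` function and the decision to leave the long vertical mark and character erase entries unmapped are assumptions for illustration, not part of any vendor's software.

```python
import unicodedata

# Dedicated characters in the Unicode "Optical Character Recognition" block.
OCR_HOOK = "\u2440"
OCR_CHAIR = "\u2441"
OCR_FORK = "\u2442"
OCR_BELT_BUCKLE = "\u2444"
EURO_SIGN = "\u20ac"

# Vendor-specific code points taken from the Exceptions lists.
# Long Vertical Mark and Character Erase are intentionally left unmapped here.
VENDOR_MAPS = {
    "PrecisionID": {0x7E: OCR_HOOK, 0xC1: OCR_CHAIR, 0xC2: OCR_FORK, 0x80: EURO_SIGN},
    "Barcodesoft": {0x60: OCR_HOOK, 0x7E: OCR_CHAIR, 0x5F: OCR_FORK},
    "Morovia": {0x7E: OCR_HOOK, 0xF0: OCR_CHAIR, 0x5F: OCR_FORK},
    "IDAutomation": {0x7E: OCR_HOOK, 0xC1: OCR_CHAIR, 0xC2: OCR_FORK,
                     0xC3: OCR_BELT_BUCKLE},
}


def normalize(text: str, vendor: str) -> str:
    """Rewrite vendor-specific code points to the dedicated Unicode OCR characters."""
    return text.translate(VENDOR_MAPS[vendor])


if __name__ == "__main__":
    sample = "12~34"  # '~' is U+007E, which means different glyphs per vendor
    for vendor in VENDOR_MAPS:
        converted = normalize(sample, vendor)
        print(vendor, [unicodedata.name(ch) for ch in converted if ord(ch) > 0x7F])
```

The same input character U+007E becomes OCR HOOK for three of the vendors but OCR CHAIR for Barcodesoft, which is why the source font has to be known before converting.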

  • Metigo

    metigo is a software application that performs image-based modelling and close range photogrammetry. It produces rectified imagery plans, true ortho-projections on planar, cylindrical and conical surfaces, 3D photorealistic models, measurements from photography, and mappings on a photographic base for uses in the cultural heritage sector, mainly conservation.

    == Products ==

    The metigo product line currently consists of the mapping software metigo MAP, the stereo-photogrammetry modeling software metigo 3D, and the free viewer metigo VIEW. These products are all standalone and do not depend on other software, such as AutoCAD.

    === metigo MAP ===

    metigo MAP is mainly used to map findings and conservation measures on a uniform metric photographic base. To this end, photos of planar surfaces can be rectified based on geometrical information, e.g. the height and width of a rectangle, or Cartesian coordinates measured by total station. Besides rectified imagery, several other metric mapping bases can be imported and used: true ortho-projections; scaled scans of plans and plots; CAD files; and 3D models, such as digital surface models (DSM) produced by stereo-photogrammetry, SfM or 3D scanning. metigo MAP's strong point is that rectified imagery taken with different techniques (visible light, side light, IR, UV, UV-fluorescence, X-ray), historic images, and photos taken at various stages of the conservation process can be superimposed and compared with one another. The user can assign several attributes, such as different conservation measures and damage classes, to the mapped geometries. The mappings can be analysed by geometries as well as by user-defined attributes at any stage of the project. metigo MAP mainly targets conservators in different cultural heritage fields. No specialist knowledge of surveying or photogrammetric techniques is needed to use it.
=== metigo 3D === metigo 3D is a stereo-photogrammetric kit that allows users to calculate bundle adjustments (axios3D), create high-quality 3D point clouds using multiple stereo photo pairs combined with metric survey data, mesh these point clouds, texture the meshes with high-resolution image data to create photo-realistic models, ortho-project oriented images on digital surface models (DSM) onto planes and best-fit cylinders and cones, create unwrappings and developed views of curved surfaces, and analyse deformations of 3D surfaces. metigo 3D targets metric survey specialists working in the cultural heritage sector. == Supported file formats == metigo can read the following formats: images: JPEG (.jpg), TIFF (.tif), Bitmap (.bmp), CompuServe GIF (.gif), Encapsulated PostScript (.eps), PCX (.pcx), Photo-CD (.pcd), PICT (.pct), PNG (.png), Targa (.tga), and the RAW formats of several camera brands. CAD: DBX, DXF, DWG. 3D: many ASCII formats (.stl, .wrl, etc.). Point data: format editor for ASCII files. == Supported languages == English and German versions of the software are currently supported. For metigo MAP, French and Polish GUIs are also offered for sale. == Applications == The main applications of metigo are: conservation in the cultural heritage context (e.g. stone conservation, paintings, tapestry); architecture; and archaeology. Many others are possible, e.g. forensics. == History == The first public release of metigo was in 2000.

    Read more →
  • AI Text-to-image Tools Reviews: What Actually Works in 2026

    AI Text-to-image Tools Reviews: What Actually Works in 2026

    In search of the best AI text-to-image tool? An AI text-to-image tool is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI text-to-image tool slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Yasuo Matsuyama

    Yasuo Matsuyama

    Yasuo Matsuyama (born March 23, 1947) is a Japanese researcher in machine learning and human-aware information processing. Matsuyama is a Professor Emeritus and an Honorary Researcher of the Research Institute of Science and Engineering of Waseda University. == Early life and education == Matsuyama received his bachelor’s, master’s and doctoral degrees in electrical engineering from Waseda University in 1969, 1971, and 1974 respectively. His Doctor of Engineering dissertation was titled Studies on Stochastic Modeling of Neurons. In it, he contributed to the study of spiking neurons with stochastic pulse-frequency modulation. His advisors were Jun’ichi Takagi, Kageo Akizuki, and Katsuhiko Shirai. Upon completing his doctoral work at Waseda University, he was sent to the United States as a Japan-U.S. exchange fellow under the joint program of the Japan Society for the Promotion of Science, the Fulbright Program, and the Institute of International Education. Through this exchange program, he completed his Ph.D. program at Stanford University in 1978. His dissertation was titled Process Distortion Measures and Signal Processing. In it, he contributed to the theory of probabilistic distortion measures and its applications to speech encoding with spectral clustering or vector quantization. His advisor was Robert M. Gray. == Career == From 1977 to 1978, Matsuyama was a research assistant at the Information Systems Laboratory of Stanford University. From 1979 to 1996, he was a faculty member at Ibaraki University, Japan (his final position was professor and chairperson of the Information and System Sciences Major). From 1996, he was a professor at Waseda University, in the Department of Computer Science and Engineering. From 2011 to 2013, he was the director of the Media Network Center of Waseda University. 
During the Tōhoku earthquake and tsunami of March 11, 2011, he was in charge of the safety inquiry for 65,000 students, staff and faculty. Since 2017, Matsuyama has been a Professor Emeritus and an Honorary Researcher of the Research Institute of Science and Engineering of Waseda University. Since 2018, he has served as acting president of the Waseda Electrical Engineering Society. == Work == Matsuyama’s works on machine learning and human-aware information processing have dual foundations. His studies on competitive learning (vector quantization) for his Ph.D. at Stanford University led to his subsequent contributions to machine learning. His studies on stochastic spiking neurons for his Doctor of Engineering degree at Waseda University led to applications of biological signals to machine learning. Thus, his works can be grouped according to these dual foundations. Statistical machine learning algorithms: The use of the alpha-logarithmic likelihood ratio in learning cycles generated the alpha-EM algorithm (alpha-expectation maximization algorithm). Because the alpha-logarithm includes the usual logarithm as a special case, the alpha-EM algorithm contains the EM algorithm (more precisely, the log-EM algorithm). The speedup of the alpha-EM algorithm over the log-EM algorithm comes from its ability to use past information. This use of messages from the past led to the alpha-HMM estimation algorithm (alpha-hidden Markov model estimation algorithm), a generalized and faster version of the hidden Markov model estimation algorithm (HMM estimation algorithm). Competitive learning on empirical data: Starting from the speech compression studies at Stanford, Matsuyama developed generalized competitive learning algorithms: the harmonic competition and the multiple descent cost competition. The former realizes multiple-object optimization. The latter admits deformable centroids. 
Both algorithms generalize batch-mode vector quantization (simply called vector quantization) and successive-mode vector quantization (also called learning vector quantization). A hierarchy from the alpha-EM to vector quantization: Matsuyama contributed to generating and identifying the hierarchy of the above algorithms. Alpha-EM ⊃ log-EM ⊃ basic competitive learning (vector quantization, VQ; or clustering). Within the class of vector quantization and competitive learning, he contributed to generating and identifying the hierarchy of VQs. VQ ⇔ {batch mode VQ, and learning VQ} ⊂ {harmonic competition} ⊂ {multiple descent cost competition}. Applications to human-aware information processing: His dual foundations led to the following applications to human-aware information processing. Retrieval systems for similar images and videos. Bipedal humanoid operations via invasive and noninvasive brain signals as well as gestures. Continuous authentication of users by brain signals. Self-organization and emotional feature injection based on competitive learning. Decomposition of DNA sequences by independent component analysis (US Patent: US 8,244,474 B2). Data compression of speech signals by competitive learning. The above theories and applications serve as contributions to IoCT (Internet of Collaborative Things) and IoXT (http://www.asc-events.org/ASC17/Workshop.php). == Awards and honors == 2016: e-Teaching Award of Waseda University 2015: Best Textbook Award by the Japanese Society of Information Processing 2014: Fellow of the Japanese Society of Information Processing 2013: IEEE Life Fellow 2008: Y. Dote Memorial Best Paper Award of CSTST 2008 from ACM and IEEE 2006: LSI Intellectual Property Design Award from the LSI IP Committee 2004: Best Paper Award for Application Oriented Research from Asia Pacific Neural Network Assembly 2002: Fellow Award from the Institute of Electronics, Information and Communication Engineers. 
2001: Telecommunication System Major Award of the Telecommunications Advancement Foundation 2001: Outstanding Paper Award of IEEE Transactions on Neural Networks 1998: Fellow Award from IEEE for contributions to learning algorithms with competition. 1992: Best Paper Award from the Institute of Electronics, Information and Communication Engineers 1989: Telecommunication System Promotion Award of the Telecommunications Advancement Foundation

    Read more →
  • Janyce Wiebe

    Janyce Wiebe

    Janyce Marbury Wiebe (1959–2018) was an American computer scientist specializing in natural language processing and known for her work on subjectivity, sentiment analysis, opinion mining, discourse processing, and word-sense disambiguation. == Early life and education == Wiebe was born in 1959, in Albany, New York. She majored in English at Binghamton University, graduating in 1981, and completed a Ph.D. in computer science in 1990, at the University at Buffalo. Her dissertation, Recognizing Subjective Sentences: A Computational Investigation of Narrative Text, was supervised by philosopher William J. Rapaport. == Career == After postdoctoral research at the University of Toronto, she became an assistant professor at New Mexico State University in 1992. In 2000, she moved to the University of Pittsburgh, where she became a professor of computer science and director of the Intelligent Systems Program. == Recognition == Wiebe was named a Fellow of the Association for Computational Linguistics in 2015. == Death == She died of leukemia on December 10, 2018.

    Read more →
  • Hit-testing

    Hit-testing

    In computer graphics programming, hit-testing (hit detection, picking, or pick correlation) is the process of determining whether a user-controlled cursor (such as a mouse cursor or touch-point on a touch-screen interface) intersects a given graphical object (such as a shape, line, or curve) drawn on the screen. Hit-testing may be performed on the movement or activation of a mouse or other pointing device. Hit-testing is used by GUI environments to respond to user actions, such as selecting a menu item or a target in a game based on its visual location. In web technologies such as HTML, SVG, and CSS, this is associated with the concept of pointer-events (e.g. user-initiated cursor movement or object selection). Collision detection is a related concept for detecting intersections of two or more different graphical objects, rather than intersection of a cursor with one or more graphical objects. == Algorithm == There are many different algorithms that may be used to perform hit-testing, with different performance or accuracy outcomes. One common hit-test algorithm works on axis-aligned bounding boxes. A key idea is that, for two boxes not to collide, the box being tested must be entirely above, entirely below, entirely to the right of, or entirely to the left of the current box. If none of these is the case, the boxes are colliding. Example logic is presented in the pseudo-code below: In Python:
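The axis-aligned bounding box test described above can be sketched as follows. This is a minimal illustration rather than the excerpt's original code, which is not reproduced here: the `Box` type and the function names are hypothetical, and the sketch assumes screen coordinates in which y grows downward and boxes that include their edges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box; assumes left <= right and top <= bottom (y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float


def point_in_box(x, y, box):
    """Hit-test a cursor position against a single box (edges count as hits)."""
    return box.left <= x <= box.right and box.top <= y <= box.bottom


def boxes_overlap(a, b):
    """Return True unless `a` lies entirely to one side of `b`."""
    separated = (
        a.bottom < b.top      # a is entirely above b
        or a.top > b.bottom   # a is entirely below b
        or a.right < b.left   # a is entirely to the left of b
        or a.left > b.right   # a is entirely to the right of b
    )
    return not separated


if __name__ == "__main__":
    button = Box(left=10, top=10, right=110, bottom=40)
    print(point_in_box(50, 20, button))                  # cursor inside the button
    print(boxes_overlap(button, Box(100, 30, 200, 90)))  # boxes share a corner region
```

Testing for separation first keeps the check to four comparisons, and any one of them succeeding is enough to rule out a collision.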

    Read more →
  • Babel Fish (website)

    Babel Fish (website)

    Yahoo! Babel Fish was a free Web-based machine translation service by Yahoo!. In May 2012 it was replaced by Bing Translator (now Microsoft Translator), to which queries were redirected. Although Yahoo! has transitioned its Babel Fish translation services to Bing Translator, it did not sell its translation application to Microsoft outright. As the oldest free online language translator, the service translated text or Web pages in 36 pairs between 13 languages, including English, Simplified Chinese, Traditional Chinese, Dutch, French, German, Greek, Italian, Japanese, Korean, Portuguese, Russian, and Spanish. The internet service derived its name from the Babel fish, a fictional species in Douglas Adams's book and radio series The Hitchhiker's Guide to the Galaxy that could instantly translate languages. In turn, the name of the fictional creature refers to the biblical account of the confusion of languages that arose in the city of Babel. == History == On December 9, 1997, Digital Equipment Corporation (DEC) and SYSTRAN S.A. launched AltaVista Translation Service at babelfish.altavista.com, which was developed by a team of researchers at DEC. In February 2003, AltaVista was bought by Overture Services, Inc. In July 2003, Overture, in turn, was taken over by Yahoo!. The web address for Babel Fish remained at babelfish.altavista.com until May 9, 2008, when the address changed to babelfish.yahoo.com. In 2012, the Web address changed again, this time redirecting babelfish.yahoo.com to www.microsofttranslator.com when Microsoft's Bing Translator replaced Yahoo Babel Fish. As of June 2013, babelfish.yahoo.com no longer redirects to the Microsoft Bing Translator. Instead, it refers directly back to the main Yahoo.com page.

    Read more →
  • Hideto Tomabechi

    Hideto Tomabechi

    Hideto Tomabechi (苫米地 英人, Tomabechi Hideto; born 1959) is a Japanese cognitive scientist who is an adjunct fellow at Carnegie Mellon University and has had an executive role in several companies. == Early life and education == He grew up in Minato-ku, Tokyo. He graduated from Komaba Toho High School and then joined the University of Massachusetts Amherst. He received his first degree from Sophia University, then joined Mitsubishi Real Estate. Tomabechi was a Fulbright Scholar at Yale University and became a member of the Yale University Artificial Intelligence Research Center and the Yale Cognitive Science Program. His research topic was Cognition Models for Language Expressions and Computational Methods (Tomabechi Algorithm). He received his Ph.D. in the field of computational linguistics from Carnegie Mellon University. His 1993 Ph.D. thesis was entitled "Efficient Unification for Natural Language". == Career timeline == 1992-1998: Director, Justsystem Scientific Institute. 1998: CEO of Cognitive Research Laboratories Inc. 2007: Adjunct Fellow at the Cyber Security & Privacy Research Institute (CyLab) at Carnegie Mellon University. 2020: Visiting professor at Nano & Life Research Center, Waseda University. 2020: Chairman, Resilience Japan, LLC. 2022: Chairman of Japan Society for Foreign Policy. == Brain research == In 1993, Hideto Tomabechi became director of the Development Department. Later, Tomabechi became director of the JustSystems Basic Research Institute. Tomabechi researched the basic functions of the human brain and mind. The purpose of the brain and consciousness research was to develop the human-machine interface. The main areas of research were altered states of consciousness, hypnosis, homeostasis, brain functions, and functions of the human mind in cyberspace. Dr. Tomabechi founded the Bechi Unit, the world's first virtual currency, at JustSystems, based on the Tomabechi Algorithms. 
== Brainwashing == Tomabechi was the scientist who deprogrammed the leaders of the religious cult responsible for the terrorist attack in the Tokyo subway. The cult (Aum Shinrikyo) brainwashed its people and they carried out the attacks in an influenced state of consciousness.

    Read more →