AI Assistant Intellij

AI Assistant Intellij — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Hit-testing

    In computer graphics programming, hit-testing (hit detection, picking, or pick correlation) is the process of determining whether a user-controlled cursor (such as a mouse cursor or touch-point on a touch-screen interface) intersects a given graphical object (such as a shape, line, or curve) drawn on the screen. Hit-testing may be performed on the movement or activation of a mouse or other pointing device. Hit-testing is used by GUI environments to respond to user actions, such as selecting a menu item or a target in a game based on its visual location. In web technologies such as HTML, SVG, and CSS, this is associated with the concept of pointer-events (e.g. user-initiated cursor movement or object selection). Collision detection is a related concept for detecting intersections of two or more different graphical objects, rather than intersection of a cursor with one or more graphical objects.

    == Algorithm ==

    There are many different algorithms that may be used to perform hit-testing, with different performance or accuracy outcomes. One common hit-test algorithm works on axis-aligned bounding boxes. The key idea is that, if two boxes do not overlap, the box being tested must lie entirely above, entirely below, entirely to the left of, or entirely to the right of the current box. If none of these holds, the boxes are colliding.
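
    The bounding-box test described above can be sketched in Python. This is a minimal illustration rather than code from any particular GUI toolkit: the `Box` class, its field names, and the screen-coordinate convention (y grows downward) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (hypothetical type for this sketch).

    Screen coordinates are assumed: x grows rightward, y grows downward.
    """
    left: float
    top: float
    right: float
    bottom: float


def boxes_collide(a: Box, b: Box) -> bool:
    """Return True unless b lies entirely to one side of a."""
    entirely_left = b.right < a.left
    entirely_right = b.left > a.right
    entirely_above = b.bottom < a.top
    entirely_below = b.top > a.bottom
    return not (entirely_left or entirely_right or entirely_above or entirely_below)


def hit_test(box: Box, x: float, y: float) -> bool:
    """Point-in-box test: treat the cursor as a zero-size box."""
    return boxes_collide(box, Box(x, y, x, y))


button = Box(left=10, top=10, right=110, bottom=40)
print(hit_test(button, 50, 20))    # True: cursor inside the button
print(hit_test(button, 200, 20))   # False: cursor to the right of the button
print(boxes_collide(button, Box(100, 30, 150, 60)))  # True: corners overlap
```

    Treating the cursor as a degenerate box keeps a single code path for both hit-testing and box-to-box collision checks; a dedicated point test would work equally well.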

  • General Regionally Annotated Corpus of Ukrainian

    General Regionally Annotated Corpus of the Ukrainian Language (GRAC, Ukrainian: Генеральний регіонально анотований корпус української мови, romanized: Heneralnyi rehionalno anotovanyi korpus ukrainskoi movy, ГРАК; грак is Ukrainian for "rook") is a text corpus of the Ukrainian language comprising more than 2 billion tokens, intended for linguistic research in grammar, vocabulary, and the history of the Ukrainian literary language, as well as for use in compiling dictionaries and grammars. The corpus can be used for language study and for preparing teaching materials, textbooks, learner’s dictionaries, and exercises that use examples from real texts and take frequency and collocational patterns into account. The corpus is not a model of standard Ukrainian: it may contain words and combinations that do not match current norms of the literary language. The corpus covers the period from 1816 to 2025, and as of 29 November 2025 it contains more than 812,000 texts by about 35,000 authors.

    == Composition of the corpus ==

    In the 10th version of the corpus, available for searching from 20 October 2020, 35% consists of fiction. Some fiction genres are highlighted separately: children’s literature, folklore, dramatic works, and scripts. Non-fiction texts include:

      • journalistic writing, including newspaper collections from 1888–1893, 1905, 1913–1918, and 1919–1943, modern newspapers from different regions, and texts from online news and information sites;
      • memoirs, letters, and diaries, including a sizeable corpus of Facebook texts representing blogs by people from all regions of Ukraine and the diaspora;
      • scholarly and educational texts: monographs, dissertations, academic articles, and textbooks, with large subcorpora of academic literature in history, ethnography, philosophy, and law singled out separately;
      • religious texts, including two Ukrainian translations of the Bible;
      • speeches and interviews.

    Some dictionaries that include phrasal examples and phraseology have also been incorporated, including the Ukrainian dictionary by Borys Hrinchenko and the Russian-Ukrainian idiomatic dictionary by I. Vyrhan and M. Pylynska. Using the corpus tools, these dictionaries can be searched not only for words, but also for lexico-grammatical patterns within examples and phraseological expressions. About 20% of the texts in the corpus are translations. The corpus includes translations from more than 80 languages, most of all from English and Russian.

    == Dating ==

    Texts in the corpus are dated by the year of writing, or by the latest year in which a work could have been written; translated texts are dated by the year the translation was produced. A publication year may also be indicated, corresponding to the edition from which the text is taken.

    == Regional annotation ==

    The corpus’s regional annotation is based on the modern administrative division of Ukraine. The corpus includes texts from all oblasts of Ukraine and from Crimea. A single text may belong to several regional subcorpora (if the author or translator was born, studied, or lived for a long time in different regions). In addition to regional subcorpora, there are subcorpora of works by authors of the Ukrainian diaspora (USA, Canada, Poland, Germany, the United Kingdom, France, etc.). These are mostly texts by emigrants of the 1940s, and to a lesser extent of 1917–1920s.

    == Morphological annotation ==

    GRAC is based on the morphological analysis system nlp_uk, developed by specialists from the r2u group. The program analyzes the text and, for each word form, determines the lemma (lexeme) and tags (grammatical features).

    == Research based on the corpus ==

    Research on the Ukrainian language has been carried out using the corpus, including studies of the historical dynamics of language norms, and of letter and letter-combination frequencies for font development.

  • Markov information source

    In mathematics, a Markov information source, or simply, a Markov source, is an information source whose underlying dynamics are given by a stationary finite Markov chain.

    == Formal definition ==

    An information source is a sequence of random variables ranging over a finite alphabet $\Gamma$, having a stationary distribution. A Markov information source is then a (stationary) Markov chain $M$, together with a function $f:S\to \Gamma$ that maps states $S$ in the Markov chain to letters in the alphabet $\Gamma$. A unifilar Markov source is a Markov source for which the values $f(s_{k})$ are distinct whenever the states $s_{k}$ are each reachable, in one step, from a common prior state. Unifilar sources are notable in that many of their properties are far more easily analyzed than in the general case.

    == Applications ==

    Markov sources are commonly used in communication theory, as a model of a transmitter. Markov sources also occur in natural language processing, where they are used to represent hidden meaning in a text. Given the output of a Markov source whose underlying Markov chain is unknown, the task of solving for the underlying chain is undertaken by the techniques of hidden Markov models, such as the Viterbi algorithm.
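
    The definition above can be illustrated with a small Python sketch. Everything concrete here is an assumption made for the example: the three states, their transition probabilities, the two-letter alphabet, and the helper names `emit` and `is_unifilar`. For simplicity the walk starts from a fixed state rather than drawing the start state from the stationary distribution.

```python
import random

# Hypothetical three-state chain; the states, probabilities, and letters
# are invented for illustration only.
TRANSITIONS = {
    "s0": {"s0": 0.5, "s1": 0.5},
    "s1": {"s2": 1.0},
    "s2": {"s0": 0.3, "s1": 0.7},
}
# The output map f: S -> Gamma, here with alphabet Gamma = {"a", "b"}.
F = {"s0": "a", "s1": "b", "s2": "a"}


def emit(transitions, f, start, n, rng):
    """Walk the chain for n steps and return the letters f(state)."""
    state, letters = start, []
    for _ in range(n):
        letters.append(f[state])
        successors = list(transitions[state])
        weights = [transitions[state][s] for s in successors]
        state = rng.choices(successors, weights=weights, k=1)[0]
    return "".join(letters)


def is_unifilar(transitions, f):
    """True if, from every state, the one-step successors that have
    nonzero probability all map to distinct letters."""
    for successors in transitions.values():
        letters = [f[s] for s, p in successors.items() if p > 0]
        if len(letters) != len(set(letters)):
            return False
    return True


print(emit(TRANSITIONS, F, "s0", 12, random.Random(0)))
print(is_unifilar(TRANSITIONS, F))  # True for this particular f
```

    Because the example source is unifilar, knowing the current state and the next emitted letter determines the next state, which is the property that makes such sources easier to analyze.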

  • Julie Beth Lovins

    Julie Beth Lovins (October 19, 1945, in Washington, D.C. – January 26, 2018, in Mountain View, California) was a computational linguist who published the Lovins stemming algorithm, a type of stemming algorithm for word matching, in 1968. Adam G., et al. describe it as follows: "The Lovins Stemmer is a single pass, context sensitive stemmer, which removes endings based on the longest-match principle. The stemmer was the first to be published and was extremely well developed considering the date of its release, having been the main influence on a large amount of the future work in the area."

    == Background ==

    Born on October 19, 1945, in Washington, D.C., Lovins grew up in Amherst, Massachusetts. Her father, Gerald H. Lovins, was an engineer and her mother, Miriam Lovins, a social services administrator. Lovins' brother Amory Lovins is the co-founder and chief environmental scientist of Rocky Mountain Institute. For her undergraduate degree, Lovins attended Pembroke College, the women's college of Brown University, which merged into Brown University in 1971. At Pembroke College, Lovins studied mathematics and linguistics, graduating with honors. Her thesis was titled A Study of Idioms. She received the inaugural Bloch Fellowship in 1970 from the Linguistic Society of America to attend graduate school. Lovins obtained her Master of Arts in 1970 and Doctor of Philosophy in 1973 from the University of Chicago, studying linguistics. Her dissertation at the University of Chicago was titled Loan Phonology -- Subject Matter. A revision of her thesis on loanwords and the phonological structure of Japanese was published in 1975 by the Indiana University Linguistics Club.

    == Teaching career ==

    Following her PhD, Lovins spent a year working as a linguist-at-large at a University of Tokyo language research institute and as an English conversation teacher. She then joined the faculty at Tsuda College as a professor of English and linguistics, where she taught for seven years.
During her time as a faculty member at Tsuda College, Lovins also served as a guest researcher in the University of Tokyo's Research Institute of Logopedics and Phoniatrics, a research center for speech science.

    == Industry career ==

    After teaching Japanese phonology at universities in Japan, Lovins moved back to the U.S. to work in the computing industry. She worked on early speech synthesis at Bell Labs in Murray Hill, New Jersey. At Bell Labs, Lovins worked with Osamu Fujimura, a Japanese linguist who is credited as a pioneer in speech sciences. Lovins also worked as a software engineer at various companies in Silicon Valley and served as a consultant for computational linguistics throughout the 1990s. As a consultant, she called her business "The Language Doctor."

    == The Lovins Stemming Algorithm ==

    Lovins published an article about her work on developing a stemming algorithm through the Research Laboratory of Electronics at MIT in 1968. Lovins' stemming algorithm is frequently referred to as the Lovins stemmer. A stemming algorithm takes a word with suffixes and reduces it to its root, or base word. Stemming algorithms are used to improve accuracy in information retrieval and in domain analysis. These algorithms help find variants of the terms being queried. By reducing a given query to a less complex form, they allow more similar documents to be retrieved for similar queries. Stemming algorithms are prevalent in search engines, although Google Search did not implement word stemming until 2003. This means that up until 2003, a Google search for the word warm would not have explicitly returned results for related words like warmth or warming.
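
    To make the warm, warmth, warming example concrete, the sketch below strips the longest matching ending from each word so that all three reduce to the same stem. It is a toy illustration only: the `ENDINGS` list and the `MIN_STEM_LENGTH` threshold are assumptions chosen for this example and are not the ending table from Lovins' 1968 article.

```python
# Hypothetical ending list for illustration; a real stemmer uses a much
# larger table together with context conditions on each ending.
ENDINGS = ["ing", "th", "ed", "s"]
MIN_STEM_LENGTH = 3  # assumed threshold so that short words are left alone


def strip_longest_ending(word: str) -> str:
    """Remove the longest listed ending that leaves a long-enough stem."""
    for ending in sorted(ENDINGS, key=len, reverse=True):
        stem = word[: len(word) - len(ending)]
        if word.endswith(ending) and len(stem) >= MIN_STEM_LENGTH:
            return stem
    return word


def matches(query: str, document_word: str) -> bool:
    """Two words match when they reduce to the same stem."""
    return strip_longest_ending(query.lower()) == strip_longest_ending(document_word.lower())


for word in ["warm", "warmth", "warming", "thing"]:
    print(word, "->", strip_longest_ending(word))
print(matches("warm", "warming"))  # True: both reduce to "warm"
```

    Trying the longest endings first is what the longest-match principle refers to; the minimum stem length keeps a word such as thing from being cut down to th.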
As the first published stemming algorithm, Lovins' work set a precedent and influenced future work in stemming algorithms, such as the Porter Stemmer, published by Martin Porter in 1980, which has been widely recognized as the most common algorithm for stemming English. Additionally, the Dawson Stemmer developed by John Dawson is an extension of the Lovins stemmer. The Lovins stemmer follows a rule-based affix elimination approach. It first removes the longest identifiable suffix from the target word, producing a base stem, then consults a lookup table to convert the (potentially malformed) stem to a valid word. This process can be split into two phases. In the first phase, a word is compared with a predetermined list of endings, and when a word is found to contain one of these endings, the ending is removed, leaving only the stem of the word. The second phase standardizes spelling exceptions that come from the first phase, ensuring that words with only marginally varying stems are appropriately paired together. For example, with the word dried, phase one results in dri, which should match with the word dry. The second phase takes care of these exceptions. Compared to other stemmers, Lovins' algorithm is fast and equipped to handle irregular plural words like person and people. Disadvantages, however, include many suffixes not being available in the table of endings. Furthermore, it is sometimes highly unreliable and frequently fails to form valid words from the stems or to match the stems of like-meaning words. This is most often caused by the use of specialist terminology and domain-specific vocabulary by the author.

    == Personal life ==

    Lovins moved to Mountain View, California, in 1979, and later to Old Mountain View in 1981 with her partner and later husband Greg Fowler, a software engineer and advocate for environmental issues and the blind.
In their free time, she and her husband enjoyed taking walks and volunteering for their local community. Lovins actively volunteered for organizations like the Old Mountain View Neighborhood Association, Mountain View Friends of the Library, League of Women Voters, Mountain View Cool Cities Team, and the Mountain View Sustainability Task Force. In 2016, Lovins' husband died unexpectedly, following a heart attack. Eighteen days after her husband died, Lovins was diagnosed with brain cancer. She died on January 26, 2018, at a hospice, surrounded by friends, family and caregivers.

  • List of chatbots

    A chatbot is a software application or web interface that is designed to mimic human conversation through text or voice interactions. Modern chatbots are typically online and use generative artificial intelligence systems that are capable of maintaining a conversation with a user in natural language and simulating the way a human would behave as a conversational partner. Such chatbots often use large language models (LLMs) and natural language processing, but simpler chatbots have existed for decades.

    == LLM chatbots ==

    == General chatbots ==

    == Historical chatbots ==

  • Boris Katz

    Boris Gershevich Katz (Russian: Борис Гершевич Кац; born October 5, 1947) is an American principal research scientist (computer scientist) at the MIT Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology in Cambridge and head of the Laboratory's InfoLab Group. His research interests include natural language processing and understanding, machine learning and intelligent information access. His brother Victor Kac is a mathematician at MIT. He was able to leave the USSR with the help of U.S. Senator Ted Kennedy, before the end of the Cold War. Over the last several decades, Boris Katz has been developing the START natural language system, which allows the user to access various types of information using English.

    == Biography ==

    Boris Katz was born on October 5, 1947, in Chișinău in the family of Hersh Katz (died 1976) and Hayki (Klara) Landman (born 1921, Lipcani, Briceni District - died 2006, Cambridge, Middlesex County), who moved from Lipcani, a town in northern Bessarabia, to Chișinău before the war. He graduated from Moscow State University and defended his thesis as a candidate of physical and mathematical sciences in 1975 under the supervision of Evgenii M. Landis. In November 1978, he left for the United States thanks to the personal intervention of Senator Edward M. Kennedy. He currently lives in Boston and heads the InfoLab research team at the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology. Boris Katz is the creator of the START information processing system (on the Internet since 1993) and the author of several works in the field of processing, generation and perception of natural languages, machine learning, and accelerated access to multimedia information.

    == Family ==

    Brothers: Victor Gershevich Kac, American mathematician, professor at the Massachusetts Institute of Technology; Mikhail Gershevich Katz, Israeli mathematician, graduate of Harvard and Columbia (Ph.D., 1984) universities, professor at Bar-Ilan University, author of the monograph "Systolic Geometry and Topology" (Mathematical Surveys and Monographs, vol. 137. American Mathematical Society: Providence, 2007). Daughter: Luba Katz, a bioinformatics scientist (her husband is Alan Jasanoff, a neuroimaging scientist, a professor at MIT, the son of Harvard University professors Jay Jasanoff and Sheila Jasanoff).

    == Past works ==

      • A Knowledge Entry System for Subject Matter Experts: the goal of the SHAKEN project is to enable subject matter experts, without any assistance from AI technologists, to assemble models of processes and mechanisms so that questions about them can be answered by declarative inference and simulation.
      • Exploiting lexical regularities in designing natural language systems
      • Word sense disambiguation for information retrieval
      • HIKE (HPKB integrated knowledge environment): a query interface and integrated knowledge environment for HPKB
      • Quantitative evaluation of passage retrieval algorithms for question answering
      • Sticky notes for the semantic web
      • Question answering from the web using knowledge annotation and knowledge mining techniques
      • The role of context in question answering systems

  • StarDict

    StarDict, developed by Hu Zheng (胡正), is a free GUI released under the GPL-3.0-or-later license for accessing StarDict dictionary files (a dictionary shell). It is the successor of StarDic, developed by Ma Su'an (馬蘇安), continuing its version numbers. According to StarDict's earlier homepage on SourceForge, the project was removed from SourceForge due to copyright infringement reports. It moved to Google Code and then back to SourceForge, while development now appears to continue on GitHub.

    == Supported platforms ==

    StarDict runs under Linux, Windows, FreeBSD, Maemo and Solaris. Dictionaries of the user's choice are installed separately. Dictionary files can be created by converting dict files. Several programs compatible with the StarDict dictionary format are available for different platforms. For the iPhone, iPod Touch and iPad, applications available in the App Store include GuruDic, TouchDict, weDict, Dictionary Universal, Alpus and others, as well as the free iStarDict, which is available from the Cydia Store.

    == Dictionaries available ==

    A number of FreeDict dictionaries can be converted to the StarDict format. These include, in particular, some older versions of Webster's dictionary and many dictionaries for various languages.

    == Features ==

    While StarDict is in scan mode, results are displayed in a tooltip, allowing easy dictionary lookup. When combined with FreeDict, StarDict can quickly provide rough translations of foreign-language websites. On September 25, 2006, an online version of StarDict began operation. This online version includes access to all the major dictionaries of StarDict, as well as Wikipedia in Chinese. Previous versions of StarDict were very similar to the PowerWord dictionary program, which is developed by a Chinese company, KingSoft. Since version 2.4.2, however, StarDict has diverged from the design of PowerWord by increasing its search capabilities and adding lexicons in a variety of languages. This was assisted by the collaboration of many developers with the author.

    == sdcv ==

    Evgeniy A. Dushistov produced a command-line version of StarDict called sdcv. It can use all the dictionary files that belong to StarDict. It is written in C++ and licensed under the terms of the GNU General Public License. sdcv runs under Linux, FreeBSD, and Solaris. As in StarDict, dictionaries of the user's choice have to be installed separately.

    At the end of 2006, developer Hu Zheng, citing personal financial problems, began charging users for downloading dictionary files from his website, which for a time aroused strong doubts and dissatisfaction in the Linux community. Under the pressure of public opinion, the charging plan was eventually canceled.

  • AI Video Generators Reviews: What Actually Works in 2026

    Comparing the best AI video generator? An AI video generator is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI video generator slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

  • Light scanning photomacrography

    Light Scanning Photomacrography (LSP), also known as Scanning Light Photomacrography (SLP) or Deep-Field Photomacrography, is a photographic film technique that allows for high magnification light imaging with exceptional depth of field (DOF). This method overcomes the limitations of conventional macro photography, which typically only keeps a portion of the subject in acceptable focus at high magnifications.

    == Historical background ==

    The principles of LSP were first documented in the early 1960s by Dan McLachlan Jr., who highlighted its capability for extreme focal depth in microscopy and patented the process in 1968. The technique was revived and further developed in the 1980s by photographers such as Darwin Dale and Nile Root, a faculty member at the Rochester Institute of Technology. In the early 1990s, William Sharp and Charles Kazilek, both researchers at Arizona State University, also published articles describing their technique and system setup for capturing SLP images.

    == Predecessor to stack image photography ==

    Light Scanning Photomacrography offered a powerful analog tool for high-detail imaging in the age of film photography. It provided a comprehensive depth of field, making it invaluable in scientific and biomedical photography. As technology and techniques have evolved, LSP has been replaced by digital image focus stacking. This technique uses a collection of images captured in series at different focal depths, which are then processed using computer software to create a single image with a greater focus depth than any single image.

    == LSP technique and results ==

    LSP involves the use of a thin plane of light that scans across the subject, which is mounted on a stage moving perpendicular to the film plane. The technique utilizes traditional optics and is governed by the physical laws of depth of field. By moving the subject through a narrow band of illumination, the entire subject can be recorded in sharp focus from the nearest details to the farthest ones. This analog process produces sharp and detailed images by slowly recording the image on film as the specimen passes through a sheet of light that is thinner than the effective DOF. Because every part of the image is captured at the same relative distance from the camera lens, the resulting images are axonometric projections rather than the perspective projections that the human eye sees and that a film camera typically captures. For the same reason, relative measurements can be taken from an LSP photograph and used for comparison.

    == Equipment and setup ==

    A typical LSP setup includes:

      • A stage that can move the subject perpendicular to the film plane.
      • Light sources, in some cases modified projectors, used to project a thin plane of light.
      • A camera mounted on a stable stand such as a tabletop copy stand.

    In 1991, Sharp and Kazilek described their SLP system, which used three Kodak Ektagraphic slide projectors with zoom lenses to create a thin plane of light. The projectors each had a slide mount with two razor blades placed edge-to-edge to create a thin slit for the light to pass through. The image was captured using a Nikon FE-2 SLR camera mounted above the specimen. Kodachrome 25 slide film was used to record the image, minimizing film grain size and maximizing image sharpness.

    == Commercial systems ==

    A commercial SLP instrument was produced by the Irvine Optical Corp. Their DYNAPHOT system was based on a photomacroscope and could capture images on 4x5 film. The instrument came with two or three illumination sources and a motorized specimen stage. The system advertised a 2X – 40X magnification range and the ability to capture images in black and white and color. Other systems were developed by Nile Root and Theodore Clarke, who reported higher magnification (up to 100X).

    == LSP process ==

      • Alignment and Focusing: The light sources are aligned and focused to project a thin, consistent plane of light across the subject.
      • Stage Movement: The subject stage moves at a controlled speed, scanning through the plane of light.
      • Image Capture: The camera shutter is set to a long exposure or can be opened and closed manually. As the subject moves through the illuminated plane, it is recorded on the film. This process is very much like painting an image onto the film using photons instead of paint.

    == Applications ==

    LSP was particularly useful in biomedical photography, where it was used to document magnified subjects with increased depth of field over traditional macro and micro photography. It has been employed to capture detailed images of biological specimens, such as small insects and their parts. SLP has been used to document shell collections for scientific documentation and research. Other applications include forensic science, mineralogy, and the imaging of fractured surfaces and parts.

    == Advantages and challenges of LSP imaging ==

    === Advantages ===

      • Exceptional depth of field: Subjects are rendered in sharp focus throughout.
      • High magnification: Detailed images at significant magnification without sacrificing DOF.
      • Analog precision: Provides a non-digital solution with accurate image representation.
      • Versatility: Can be used for a range of subject sizes, from macro to non-macro scales.

    === Challenges ===

      • Technical complexity: Requires precise setup and alignment.
      • Exposure time: Typically requires long exposure times due to the scanning process.
      • Contrast control: The highly directional lighting can create harsh shadows and high contrast, which may need to be managed.
      • Digital competition: Focus stacking has largely replaced LSP in the digital era due to convenience and flexibility.

    == DIY contributions ==

    Enthusiasts and researchers have contributed to the development and accessibility of LSP by creating and sharing DIY guides. These contributions have enabled others to build their own LSP systems using readily available materials and components. Nile Root's publications provide detailed instructions and recommendations for constructing an LSP setup. These DIY systems have allowed a wider audience to explore and utilize the benefits of LSP imaging in various fields.
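
    The focus stacking approach that replaced LSP, described above as combining frames taken at different focal depths, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the method of any specific stacking program: frames are small grayscale grids that are already aligned, sharpness is estimated with a simple 4-neighbour Laplacian, and the function names are invented for the example.

```python
def sharpness(image, row, col):
    """Absolute 4-neighbour Laplacian, with coordinates clamped at the edges."""
    height, width = len(image), len(image[0])

    def pixel(r, c):
        return image[min(max(r, 0), height - 1)][min(max(c, 0), width - 1)]

    return abs(
        4 * pixel(row, col)
        - pixel(row - 1, col) - pixel(row + 1, col)
        - pixel(row, col - 1) - pixel(row, col + 1)
    )


def focus_stack(images):
    """For every pixel, keep the value from the frame that is sharpest there."""
    height, width = len(images[0]), len(images[0][0])
    result = []
    for row in range(height):
        out_row = []
        for col in range(width):
            best = max(images, key=lambda img: sharpness(img, row, col))
            out_row.append(best[row][col])
        result.append(out_row)
    return result


# Two toy frames: each has detail (high contrast) in one half and a flat,
# "out of focus" grey in the other half.
near = [[0, 9, 5, 5],
        [9, 0, 5, 5]]
far = [[5, 5, 0, 9],
       [5, 5, 9, 0]]
print(focus_stack([near, far]))  # [[0, 9, 0, 9], [9, 0, 9, 0]]
```

    Practical stacking software typically also aligns the frames and blends between them instead of switching pixel by pixel; those steps are omitted here.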

  • General Internet Corpus of Russian

    General Internet Corpus of Russian (GICR) is a corpus of Russian internet texts that has been accessible on request through an online query interface since 2013. The corpus includes rich text materials from the blogosphere, social networks, major news sources and literary magazines.

    == Goals of the project ==

    The project has educational and scientific status, and independent researchers and research groups use the materials obtained by GICR to address many tasks in computational linguistics. While other Russian corpus projects focus on fiction and edited texts, the General Internet Corpus gives linguists a timely opportunity to study the language as it is, with all its slang and regional peculiarities. The corpus supports research in several areas:

      • Wide-ranging linguistic research: dialectological research, study of word distribution, study of the language of social networks, study of the influence of gender, age and other factors on the language, frequency of words, fixed expressions and different constructions, stylistic features of texts from different segments of the Internet, etc.
      • Social media analysis
      • Corpus-based machine learning for evaluating automatic tagging

    At various times, student papers and independent research have been carried out on the project material by students, graduates and employees of MSU, MIPT, Russian State Humanitarian University, Novosibirsk State University, Higher School of Economics, Russian Academy of Sciences, SFU, CSU, SGMP, IAAS of MSU.

    Scientific project leaders:

      • Belikov V. - RSUH, Moscow, Russia
      • Selegey V. - RSUH, ABBYY, Moscow, Russia
      • Sharoff S. - RSUH, Moscow, Russia; University of Leeds, UK

    The organizations involved in support of GICR:

      • Russian State University of Humanities
      • ABBYY Company
      • Moscow Institute of Physics and Technology
      • Skolkovo Institute of Science and Technology

    == Size and content of the corpus ==

    As of summer 2016, the corpus size is 19.8 billion tokens, of which 49% are from VKontakte, 40% from LiveJournal, 4% from Mail.ru Blogs and News, and 2% from Russian Magazine Hall. The sources collected in the news segment are RIA Novosti, Regnum, Lenta.ru, and Rosbalt. Texts are provided with meta-markup (date of creation of the text; sex, place and year of birth of the author; Internet genre; etc.); all texts are provided with automatic morphological tagging and lemmatization. Most of the collected texts were created in 2013–2014, although some segments, such as Russian Magazine Hall, include texts dating back to 1994. GICR is one of the few mega-corpus projects today, meaning that its available size reaches several billion words.

    == Access ==

    The GICR interface is currently in beta; access to corpus search is free but is provided to researchers on request.

  • How to Choose an AI Sales Assistant

    In search of the best AI sales assistant? An AI sales assistant is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI sales assistant slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

  • The Best Free AI Analytics Tool for Beginners

    Trying to pick the best AI analytics tool? An AI analytics tool is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI analytics tool slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

  • Image-based modeling and rendering

    In computer graphics and computer vision, image-based modeling and rendering (IBMR) methods rely on a set of two-dimensional images of a scene to generate a three-dimensional model and then render some novel views of this scene. The traditional approach of computer graphics has been to create a geometric model in 3D and try to reproject it onto a two-dimensional image. Computer vision, conversely, is mostly focused on detecting, grouping, and extracting features (edges, faces, etc.) present in a given picture and then trying to interpret them as three-dimensional clues. Image-based modeling and rendering allows the use of multiple two-dimensional images to directly generate novel two-dimensional images, skipping the manual modeling stage.

    == Light modeling ==

    Instead of considering only the physical model of a solid, IBMR methods usually focus more on light modeling. The fundamental concept behind IBMR is the plenoptic illumination function, which is a parametrisation of the light field. The plenoptic function describes the light rays contained in a given volume. It can be represented with seven dimensions: a ray is defined by its position $(x,y,z)$, its orientation $(\theta ,\phi )$, its wavelength $(\lambda )$ and its time $(t)$: $P(x,y,z,\theta ,\phi ,\lambda ,t)$. IBMR methods try to approximate the plenoptic function to render a novel set of two-dimensional images from another. Given the high dimensionality of this function, practical methods place constraints on the parameters in order to reduce this number (typically to 2 to 4).
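
    As an illustration of how such constraints lower the dimension count, the reductions below are commonly cited special cases and are not taken from the text above. The symbols $P_5$, $L$ and $P_2$, and the two-plane coordinates $(u,v)$ and $(p,q)$, are notation introduced here for the example.

```latex
\begin{align*}
P(x,y,z,\theta,\phi,\lambda,t) &\;\longrightarrow\; P_5(x,y,z,\theta,\phi)
  && \text{static scene, fixed wavelength band (drop } t \text{ and } \lambda\text{)} \\
P_5(x,y,z,\theta,\phi) &\;\longrightarrow\; L(u,v,p,q)
  && \text{free space: radiance is constant along each ray} \\
P_5(x,y,z,\theta,\phi) &\;\longrightarrow\; P_2(\theta,\phi)
  && \text{fixed viewpoint, as in a panorama}
\end{align*}
```

    In the four-dimensional case a ray is indexed by the points $(u,v)$ and $(p,q)$ where it crosses two reference planes; in the two-dimensional case only the viewing direction varies.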
== IBMR methods and algorithms == View morphing generates a transition between images Panoramic imaging renders panoramas using image mosaics of individual still images Lumigraph relies on a dense sampling of a scene Space carving generates a 3D model based on a photo-consistency check
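    As a rough illustration of how constraining the plenoptic function reduces its dimensionality, the sketch below fixes the viewpoint, wavelength, and time of a seven-parameter function, leaving a two-parameter function of viewing direction (the kind of slice a panorama samples). The function `toy_plenoptic`, its formula, and the helper `fix_viewpoint` are made up for illustration; a real method would estimate these values from photographs.

```python
import inspect
import math


def toy_plenoptic(x, y, z, theta, phi, wavelength, t):
    """Hypothetical radiance for the ray through (x, y, z) with direction
    (theta, phi), at a given wavelength and time. The formula is arbitrary."""
    brightness = 1.0 + 0.1 * x + 0.05 * y + 0.02 * z
    directional = math.cos(theta) ** 2 + 0.5 * math.sin(phi) ** 2
    return brightness * directional * (1.0 + 0.01 * t) * (wavelength / 550e-9)


def fix_viewpoint(plenoptic, x, y, z, wavelength, t):
    """Constrain five of the seven parameters, leaving a 2D function of direction."""
    def panorama(theta, phi):
        return plenoptic(x, y, z, theta, phi, wavelength, t)
    return panorama


# A static, single-wavelength view from the origin: 7 parameters become 2.
panorama = fix_viewpoint(toy_plenoptic, 0.0, 0.0, 0.0, wavelength=550e-9, t=0.0)
full_dims = len(inspect.signature(toy_plenoptic).parameters)
slice_dims = len(inspect.signature(panorama).parameters)
print(full_dims, "->", slice_dims)
```

    Other constraints (for example, dropping only wavelength and time) would be expressed the same way, by fixing a different subset of arguments.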

    Read more →
  • Unsupervised learning

    Unsupervised learning

    Unsupervised learning is a framework in machine learning where, in contrast to supervised learning, algorithms learn patterns exclusively from unlabeled data. Other frameworks in the spectrum of supervision include weak- or semi-supervision, where a small portion of the data is tagged, and self-supervision. Some researchers consider self-supervised learning a form of unsupervised learning.

    Conceptually, unsupervised learning divides into the aspects of data, training, algorithm, and downstream applications. Typically, the dataset is harvested cheaply "in the wild", such as a massive text corpus obtained by web crawling with only minor filtering (such as Common Crawl). This compares favorably to supervised learning, where the dataset (such as ImageNet1000) is typically constructed manually, which is much more expensive.

    There are algorithms designed specifically for unsupervised learning, such as clustering algorithms like k-means, dimensionality reduction techniques like principal component analysis (PCA), Boltzmann machine learning, and autoencoders. After the rise of deep learning, most large-scale unsupervised learning has been done by training general-purpose neural network architectures by gradient descent, adapted to unsupervised learning by designing an appropriate training procedure.

    Sometimes a trained model can be used as-is, but more often it is modified for downstream applications. For example, the generative pretraining method trains a model to generate a textual dataset before fine-tuning it for other applications, such as text classification. As another example, autoencoders are trained to produce good features, which can then be used as a module for other models, such as in a latent diffusion model.

    == Tasks ==

    Tasks are often categorized as discriminative (recognition) or generative (imagination).
    Often but not always, discriminative tasks use supervised methods and generative tasks use unsupervised methods (see Venn diagram); however, the separation is very hazy. For example, object recognition favors supervised learning, but unsupervised learning can also cluster objects into groups. Furthermore, as progress marches onward, some tasks employ both methods, and some tasks swing from one to the other. For example, image recognition started off as heavily supervised, became hybrid by employing unsupervised pre-training, and then moved towards supervision again with the advent of dropout, ReLU, and adaptive learning rates.

    A typical generative task is as follows. At each step, a datapoint is sampled from the dataset, part of the data is removed, and the model must infer the removed part. This is particularly clear for denoising autoencoders and BERT.

    == Neural network architectures ==

    === Training ===

    During the learning phase, an unsupervised network tries to mimic the data it is given and uses the error in its mimicked output to correct itself (i.e. correct its weights and biases). Sometimes the error is expressed as a low probability that the erroneous output occurs, or it might be expressed as an unstable high-energy state in the network.

    In contrast to supervised methods' dominant use of backpropagation, unsupervised learning also employs other methods, including the Hopfield learning rule, Boltzmann learning rule, Contrastive Divergence, Wake Sleep, Variational Inference, Maximum Likelihood, Maximum A Posteriori, Gibbs Sampling, and backpropagating reconstruction errors or hidden state reparameterizations. See the table below for more details.

    === Energy ===

    An energy function is a macroscopic measure of a network's activation state. In Boltzmann machines, it plays the role of the cost function.
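    The typical generative task described under Tasks can be sketched as a data-preparation step: sample a datapoint, remove part of it, and keep the removed part as the target the model must infer. This is a minimal sketch rather than the procedure of any particular model; the names `make_masked_example` and `MASK`, the toy dataset, and the default masking fraction are assumptions for illustration.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token


def make_masked_example(dataset, mask_fraction=0.25, rng=random):
    """Sample one datapoint, hide some of its tokens, and return
    (corrupted tokens, {position: removed token})."""
    tokens = list(rng.choice(dataset))
    n_hidden = max(1, round(mask_fraction * len(tokens)))
    hidden_positions = rng.sample(range(len(tokens)), n_hidden)
    targets = {i: tokens[i] for i in hidden_positions}
    corrupted = [MASK if i in targets else tok for i, tok in enumerate(tokens)]
    return corrupted, targets


dataset = [
    "the cat sat on the mat".split(),
    "unlabeled data is cheap to collect".split(),
]
corrupted, targets = make_masked_example(dataset, rng=random.Random(0))
print(corrupted, targets)
```

    No labels are needed: the training signal is the removed part of the data itself, and a model would be scored on how well it fills the masked positions.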
    The analogy with physics is inspired by Ludwig Boltzmann's analysis of a gas's macroscopic energy from the microscopic probabilities of particle motion, $p \propto e^{-E/kT}$, where $k$ is the Boltzmann constant and $T$ is temperature. In the RBM network the relation is $p = e^{-E}/Z$, where $p$ and $E$ vary over every possible activation pattern and $Z = \sum_{\text{all patterns}} e^{-E(\text{pattern})}$. To be more precise, $p(a) = e^{-E(a)}/Z$, where $a$ is an activation pattern of all neurons (visible and hidden). Hence, some early neural networks bear the name Boltzmann Machine. Paul Smolensky calls $-E$ the Harmony. A network seeks low energy, which is high Harmony.

    === Networks ===

    This table shows connection diagrams of various unsupervised networks, the details of which will be given in the section Comparison of Networks. Circles are neurons and edges between them are connection weights. As network design changes, features are added to enable new capabilities or removed to make learning faster. For instance, neurons change between deterministic (Hopfield) and stochastic (Boltzmann) to allow robust output, weights are removed within a layer (RBM) to hasten learning, or connections are allowed to become asymmetric (Helmholtz).

    Of the networks bearing people's names, only Hopfield worked directly with neural networks. Boltzmann and Helmholtz came before artificial neural networks, but their work in physics and physiology inspired the analytical methods that were used.

    === History ===

    === Specific Networks ===

    Here, we highlight some characteristics of select networks. The details of each are given in the comparison table below.

    Hopfield Network
    Ferromagnetism inspired Hopfield networks.
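    To make the relation p(a) = e^{-E(a)}/Z from the Energy section concrete, the sketch below enumerates every activation pattern of a tiny network and normalizes e^{-E} by the partition function Z. The quadratic energy used here is the usual Boltzmann machine form and is an assumption, since the text does not spell the energy out; the weights and biases are arbitrary illustration values.

```python
import itertools
import math

# Arbitrary symmetric weights (upper triangle only) and biases for 3 binary neurons.
weights = {(0, 1): 1.0, (0, 2): -0.5, (1, 2): 0.25}
biases = [0.1, -0.2, 0.0]


def energy(pattern):
    """Assumed energy: E(a) = -sum_{i<j} w_ij a_i a_j - sum_i b_i a_i."""
    pair_term = sum(w * pattern[i] * pattern[j] for (i, j), w in weights.items())
    bias_term = sum(b * a for b, a in zip(biases, pattern))
    return -(pair_term + bias_term)


patterns = list(itertools.product([0, 1], repeat=len(biases)))
Z = sum(math.exp(-energy(a)) for a in patterns)          # partition function
prob = {a: math.exp(-energy(a)) / Z for a in patterns}   # p(a) = e^{-E(a)} / Z

most_likely = max(prob, key=prob.get)
lowest_energy = min(patterns, key=energy)
print(most_likely, round(prob[most_likely], 3))
```

    The pattern with the lowest energy receives the highest probability, which is the sense in which a network seeks low energy. Enumerating all patterns is only feasible for very small networks, because the number of patterns doubles with every neuron.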
    In a Hopfield network, a neuron corresponds to an iron domain with binary magnetic moments Up and Down, and neural connections correspond to the domains' influence on each other. Symmetric connections enable a global energy formulation. During inference the network updates each state using the standard activation step function. Symmetric weights and the right energy function guarantee convergence to a stable activation pattern. Asymmetric weights are difficult to analyze. Hopfield nets are used as Content Addressable Memories (CAM).

    Boltzmann Machine
    These are stochastic Hopfield nets. Their state value is sampled from a probability distribution as follows: suppose a binary neuron fires with the Bernoulli probability $p(1) = 1/3$ and rests with $p(0) = 2/3$. One samples from it by taking a uniformly distributed random number $y$ and plugging it into the inverted cumulative distribution function, which in this case is the step function thresholded at 2/3: the inverse function is 0 if $y \le 2/3$ and 1 if $y > 2/3$.

    Sigmoid Belief Net
    Introduced by Radford Neal in 1992, this network applies ideas from probabilistic graphical models to neural networks. A key difference is that nodes in graphical models have pre-assigned meanings, whereas Belief Net neurons' features are determined after training. The network is a sparsely connected directed acyclic graph composed of binary stochastic neurons. The learning rule comes from Maximum Likelihood on $p(X)$: $\Delta w_{ij} \propto s_j (s_i - p_i)$, where $p_i = 1 / (1 + e^{-(\text{weighted inputs into neuron } i)})$. The $s_j$ are activations from an unbiased sample of the posterior distribution, and this is problematic due to the Explaining Away problem raised by Judea Pearl. Variational Bayesian methods use a surrogate posterior and blatantly disregard this complexity.

    Deep Belief Network
    Introduced by Hinton, this network is a hybrid of RBM and Sigmoid Belief Network. The top two layers form an RBM, and the layers from the second downwards form a sigmoid belief network.
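    The Hopfield network described above can be sketched in a few lines: states are +1 (Up) or -1 (Down), weights are symmetric, and each neuron is updated with a step function. The Hebbian storage rule and the function names are assumptions for illustration, since the text names a Hopfield learning rule without giving it. The example stores one pattern and recalls it from a corrupted copy, which is the content-addressable-memory use mentioned above.

```python
def train_hebbian(patterns):
    """Symmetric weights with zero diagonal: w_ij = sum over patterns of p_i * p_j."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def energy(w, s):
    """Global energy E = -1/2 * sum_ij w_ij s_i s_j (biases omitted)."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def recall(w, state, max_sweeps=10):
    """Asynchronously apply the step function until no neuron changes."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            new = 1 if field >= 0 else -1
            if new != s[i]:
                s[i], changed = new, True
        if not changed:
            break
    return s


stored = [1, -1, 1, 1, -1, -1]
w = train_hebbian([stored])
noisy = [1, -1, -1, 1, -1, -1]  # one flipped bit
restored = recall(w, noisy)
print(restored == stored, energy(w, noisy), energy(w, restored))
```

    In this run the corrupted state has energy -5.0 and the recalled pattern has energy -15.0, so the update moves the network downhill in energy to a stable pattern, as the symmetric-weight argument predicts.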
    A deep belief network is trained by the stacked RBM method, after which the recognition weights below the top RBM are thrown away. As of 2009, 3-4 layers seem to be the optimal depth.

    Helmholtz machine
    The Helmholtz machine is an early inspiration for variational autoencoders. Its two networks are combined into one: forward weights perform recognition and backward weights implement imagination. It is perhaps the first network to do both. Helmholtz did not work in machine learning, but he inspired the view of a "statistical inference engine whose function is to infer probable causes of sensory input". The stochastic binary neuron outputs a probability that its state is 0 or 1. The data input is normally not considered a layer, but in the Helmholtz machine's generation mode, the data layer receives input from the middle layer and has separate weights for this purpose, so it is considered a layer. Hence this network has 3 layers.

    Variational autoencoder
    These are inspired by Helmholtz machines and combine a probability network with neural networks. An autoencoder is a 3-layer CAM network, where the middle layer is supposed to be some internal representation of input patterns. The encoder neural network is a probability distribution $q_\phi(z \mid x)$ and the decoder network is $p_\theta(x \mid z)$. The weights are named $\phi$ and $\theta$ rather than W and V as in Helmholtz, a cosmetic difference. These 2 networks h

    Read more →
  • The Best Free AI Subtitle Generator for Beginners

    The Best Free AI Subtitle Generator for Beginners

    In search of the best AI subtitle generator? An AI subtitle generator is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI subtitle generator slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →