AI Art Video

AI Art Video — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Dr. Sbaitso

    Dr. Sbaitso

    Dr. Sbaitso (SPAYT-soh) is an artificial intelligence speech synthesis program released late in 1991 by Creative Labs in Singapore for MS-DOS-based personal computers. The name is an acronym for "SoundBlaster Acting Intelligent Text-to-Speech Operator."

    == History ==

    Dr. Sbaitso was distributed with various sound cards manufactured by Creative Technology in the early 1990s. The text-to-speech engine used is a version of Monologue, which was developed by First Byte Software. Monologue is a later release of First Byte's "SmoothTalker" software from 1984.

    The program "conversed" with the user as if it were a psychologist, though most of its responses were along the lines of "WHY DO YOU FEEL THAT WAY?" rather than any sort of complicated interaction. When confronted with a phrase it could not understand, it would often reply with something such as "THAT'S NOT MY PROBLEM." Dr. Sbaitso repeated out loud any text typed after the word "SAY." Repeated swearing or abusive behavior on the part of the user caused Dr. Sbaitso to "break down" in a "PARITY ERROR" before resetting itself. The same happened if the user typed "SAY PARITY."

    The program introduced itself with the following lines:

    HELLO [UserName], MY NAME IS DOCTOR SBAITSO. I AM HERE TO HELP YOU. SAY WHATEVER IS IN YOUR MIND FREELY, OUR CONVERSATION WILL BE KEPT IN STRICT CONFIDENCE. MEMORY CONTENTS WILL BE WIPED OFF AFTER YOU LEAVE, SO, TELL ME ABOUT YOUR PROBLEMS.

    The program was designed to showcase the digitized voices the cards were able to produce, though the quality was far from lifelike. Additionally, there was a version of this program for Microsoft Windows through the use of a program called Prody Parrot; this version of the software featured a more detailed graphical user interface. The text-to-speech was also used as the voice of 1st Prize from the Baldi's Basics series, albeit slowed down.

    == Commands ==

    If the user submits "HELP", a list of commands will appear. If the user then submits "M", more commands will appear. There are three pages of commands in total, with guidance on how to use each of the features.

    Read more →
  • Samer Hassan

    Samer Hassan

    Samer Hassan is a computer scientist, social scientist, activist and researcher, focused on the study of the collaborative economy, online communities and decentralized technologies. He is an associate professor at Universidad Complutense de Madrid (Spain) and Faculty Associate at the Berkman Klein Center for Internet & Society at Harvard University. He is the recipient of an ERC Grant of 1.5M€ with the P2P Models project, to research blockchain-based decentralized autonomous organizations for the collaborative economy. == Education and career == Hassan is a Spanish/Lebanese scholar with an interdisciplinary background, which combines computer sciences with social sciences and activism. He received a degree in Computer Science and MSc in Artificial Intelligence from the Universidad Complutense de Madrid (UCM) in Spain. He also studied three years of Political Science at the distance learning university UNED. He then pursued a PhD in Social Simulation at the department of Software Engineering and Artificial Intelligence of UCM, supervised by the computer scientist Juan Pavón and the sociologist Millán Arroyo-Menéndez. He has been researching in several institutions, funded by several scholarships and awards, most notably Harvard's Real Colegio Complutense, and the Spanish postdoctoral grants Juan de la Cierva and José Castillejo. Thus, he was a visiting researcher at the Centre for Research in Social Simulation, in the Department of Sociology at the University of Surrey in the UK, working under the supervision of Nigel Gilbert (2007-2008), and a lecturer at the American University of Science and Technology in Lebanon (2010–11). He was selected as Fellow at the Berkman Klein Center for Internet & Society at Harvard University (2015-2017) and is presently a Faculty Associate at the same structure. Starting in 2024, he joined, as affiliate faculty, the Institute for Digital Cooperative Economy (The New School), part of the Platform Cooperativism Consortium. 
== Activism and social engagement == As an activist, Hassan has been engaged in both offline (La Tabacalera de Lavapiés, Medialab-Prado) and online (Ourproject.org, Barrapunto, Wikipedia) initiatives. He was accredited as a grassroots facilitator by the Altekio Cooperative. He co-founded the Comunes Nonprofit in 2009 and the Move Commons webtool project in 2010. He has co-organized practitioner-oriented workshops on platform co-ops and free/open source decentralized tools for communities, and has presented his work at non-academic conferences held by Mozilla, the Internet Archive, and others. As a privacy advocate, he co-created a course on cyber-ethics which he has been teaching since 2013 (as of 2021). He was co-founder of the Sci-Fdi Spanish science-fiction magazine. Hassan is non-binary and uses he/they pronouns. == Work == Hassan's interdisciplinary research spans multiple fields, including online communities, online governance, online collaboration, decentralized technologies, blockchain-based decentralized autonomous organizations, free/libre/open source software, Commons-based peer production, agent-based social simulation, social movements and cyberethics. He has published more than 60 works in these fields. Hassan's PhD thesis focused on the methodological challenges for building data-driven social simulation models. The main model built simulated the transition from modern values to postmodern values in Spain. His methodological work also explored the combination of different artificial intelligence technologies, i.e. software agents with fuzzy logic, data mining, natural language processing, and microsimulation.
In his postdoctoral period, he focused on experimenting with multiple software systems to facilitate the collaborative economy, e.g. semantic-web labelling for commons-based initiatives, distribution of value in peer production communities, agent-supported online assemblies, decentralized real-time collaborative software, decentralized blockchain-based reputation, or blockchain-enabled commons governance. Hassan was Principal Investigator of the UCM partner in the EU-funded P2Pvalue project on building decentralized web-tools for collaborative communities. As such, he led the team that created SwellRT, a federated backend-as-a-service designed to ease the development of apps featuring real-time collaboration. The intellectual property of this project was transferred to the Apache Software Foundation in 2017. As part of this research line, Hassan's team also developed two SwellRT-based apps: "Teem", for the management of social collectives, and Jetpad, a federated real-time editor. He presented the innovations in this software at Harvard's Berkman Klein Center and Harvard's Center for Research on Computation and Society. Other research lines offered outcomes beyond publications. "Wikichron", co-led by Javier Arroyo, is a web tool to visualize MediaWiki community metrics, currently in production and available for third parties. "Decentralized Science", led by Hassan's PhD student Ámbar Tenorio-Fornés, is a framework to facilitate decentralized infrastructure and open peer review in the scientific publication process, which has been selected by the European Commission to receive funding as a spin-off social enterprise. His research on blockchain and crowdfunding models earned him a commission from Triple Canopy. His team pushed forward a mapping of the ecosystem of blockchain for social good, led by the Joint Research Centre and published by the European Commission.
As part of his ERC project P2P Models, Hassan and his team, including Silvia Semenzin, are investigating whether blockchain technology and Decentralized Autonomous Organizations could contribute to improving the governance of commons-oriented communities, both online and offline. Their work has been showcased for tackling the impact of blockchain on governance, proposing alternatives to the current sharing economy, exploring emerging forms of techno-social systems like NFTs or prediction markets, and giving relevance to gender issues in the field. Hassan was invited to present the project achievements at Harvard Kennedy School, MIT Media Lab, Harvard's Data Privacy Lab, Harvard's Center for Research on Computation and Society, and Harvard's SEAS EconCS. British MP and Opposition Leader Ed Miliband showcased his research and its potential impact on policy. The project made public its way of organizing and its core values. In particular, it has shown a commitment to diversity as a core value in hiring and in choosing case studies.

== Selected works ==

Arroyo, Javier; Davó, David; Martínez-Vicente, Elena; Faqir-Rhazoui, Youssef; Hassan, Samer (8 November 2022). "DAO-Analyzer: Exploring Activity and Participation in Blockchain Organizations" (PDF). Companion Publication of the 2022 Conference on Computer Supported Cooperative Work and Social Computing. CSCW'22 Companion. New York, NY, USA: Association for Computing Machinery. pp. 193–196. doi:10.1145/3500868.3559707. ISBN 978-1-4503-9190-0.

Rozas, David; Tenorio-Fornés, Antonio; Díaz-Molina, Silvia; Hassan, Samer (2021). "When Ostrom Meets Blockchain: Exploring the Potentials of Blockchain for Commons Governance". SAGE Open. 11 (1): 215824402110025. doi:10.1177/21582440211002526. ISSN 2158-2440.

Faqir-Rhazoui, Youssef; Ariza-Garzón, Miller-Janny; Arroyo, Javier; Hassan, Samer (8 May 2021). "Effect of the Gas Price Surges on User Activity in the DAOs of the Ethereum Blockchain" (PDF). Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems. CHI EA '21. New York, NY, USA: Association for Computing Machinery. pp. 1–7. doi:10.1145/3411763.3451755. ISBN 978-1-4503-8095-9.

Hassan, Samer; Filippi, Primavera De (20 April 2021). "Decentralized Autonomous Organization". Internet Policy Review. 10 (2). doi:10.14763/2021.2.1556. hdl:10419/235960. ISSN 2197-6775.

Joint Research Centre (European Commission); Hassan, Samer; Hakami, Anna; Brekke, Jaya Klara; De Filippi, Primavera; Lopéz Morales, Genoveva; Pólvora, Alexandre; Orgaz Alonso, Christian; Bodó, Balázs (2020). Scanning the European ecosystem of distributed ledger technologies for social and public good: what, why, where, how, and ways to move forward. LU: Publications Office of the European Union. doi:10.2760/300796. ISBN 978-92-76-21578-3.

Filippi, Primavera De; Hassan, Samer (14 November 2016). "Blockchain technology as a regulatory technology: From code is law to law is code". First Monday. arXiv:1801.02507. doi:10.5210/fm.v21i12.7113. ISSN 1396-0466.

    Read more →
  • Human-readable medium and data

    Human-readable medium and data

    In computing, a human-readable medium or human-readable format is any encoding of data or information that can be naturally read by humans, resulting in human-readable data. It is often encoded as ASCII or Unicode text, rather than as binary data. In most contexts, the alternative to a human-readable representation is a machine-readable format or medium of data primarily designed for reading by electronic, mechanical or optical devices, or computers. For example, Universal Product Code (UPC) barcodes are very difficult to read for humans, but very effective and reliable with the proper equipment, whereas the strings of numerals that commonly accompany the label are the human-readable form of the barcode information. Since any type of data encoding can be parsed by a suitably programmed computer, the decision to use binary encoding rather than text encoding is usually made to conserve storage space. Encoding data in a binary format typically requires fewer bytes of storage and increases efficiency of access (input and output) by eliminating format parsing or conversion. With the advent of standardized, highly structured markup languages, such as Extensible Markup Language (XML), the decreasing costs of data storage, and faster and cheaper data communication networks, compromises between human-readability and machine-readability are now more common-place than they were in the past. This has led to humane markup languages and modern configuration file formats that are far easier for humans to read. In addition, these structured representations can be compressed very effectively for transmission or storage. Human-readable protocols greatly reduce the cost of debugging. Various organizations have standardized the definition of human-readable and machine-readable data and how they are applied in their respective fields of application, e.g., the Universal Postal Union. 
Often the term human-readable is also used to describe shorter names or strings that are easier to comprehend or remember than long, complex syntax notations, such as some Uniform Resource Locator strings. Occasionally, "human-readable" is used to describe ways of encoding an arbitrary integer into a long series of English words. Compared to decimal or other compact binary-to-text encoding systems, English words are easier for humans to read, remember, and type in.
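As a rough illustration of the storage trade-off described above, the sketch below encodes the same integers once as human-readable JSON text and once as packed binary. The sample values, the use of JSON, and the 32-bit little-endian layout are assumptions made for this example only.

```python
import json
import struct

# Hypothetical sample data, chosen only for illustration.
values = [1024, 65535, 123456789, 7, 2000000000]

# Human-readable form: JSON text encoded as UTF-8 bytes.
text_bytes = json.dumps(values).encode("utf-8")

# Machine-oriented form: five unsigned 32-bit little-endian integers.
binary_bytes = struct.pack("<5I", *values)

# Both forms decode back to the same values.
decoded_text = json.loads(text_bytes.decode("utf-8"))
decoded_binary = list(struct.unpack("<5I", binary_bytes))

print(text_bytes)                            # readable without any tooling
print(len(text_bytes), "bytes as text")
print(len(binary_bytes), "bytes as binary")  # fixed 4 bytes per value
```

The text form can be inspected in any editor, while the binary form is smaller and needs no parsing of digits, which matches the trade-off the excerpt describes.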

    Read more →
  • Top 10 AI Pair Programmers Compared (2026)

    Top 10 AI Pair Programmers Compared (2026)

    Looking for the best AI pair programmer? An AI pair programmer is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI pair programmer slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • VideoPoet

    VideoPoet

    VideoPoet is a large language model developed by Google Research in 2023 for video generation. It can be asked to animate still images. The model accepts text, images, and videos as inputs, with a planned feature for generating content in any format from any input. VideoPoet was publicly announced on December 19, 2023. It uses an autoregressive language model.

    Read more →
  • Topic model

    Topic model

    In natural language processing, a topic model is a type of probabilistic, neural, or algebraic model for discovering the abstract topics that occur in a collection of documents. Topic modeling is a frequently used text mining tool for discovering hidden semantic features and structures in a text. The topics produced by topic models are generated through a variety of mathematical frameworks, including probabilistic generative models, matrix factorization methods based on word co-occurrence, and clustering algorithms applied to semantic embeddings. Topic models are commonly used to organize and discover latent features in large collections of unstructured text and other forms of big data. Beyond text mining, topic models have also been used to uncover latent structures in fields such as genetic information, bioinformatics, computer vision, and social networks. == History == An early topic model was described by Papadimitriou, Raghavan, Tamaki and Vempala in 1998. Another one, called probabilistic latent semantic analysis (PLSA), was created by Thomas Hofmann in 1999. Latent Dirichlet allocation (LDA), perhaps the most common topic model currently in use, is a generalization of PLSA. Developed by David Blei, Andrew Ng, and Michael I. Jordan in 2002, LDA introduces sparse Dirichlet prior distributions over document-topic and topic-word distributions, encoding the intuition that documents cover a small number of topics and that topics often use a small number of words. Other topic models are generally extensions of LDA, such as Pachinko allocation, which improves on LDA by modeling correlations between topics in addition to the word correlations which constitute topics. Hierarchical latent tree analysis (HLTA) is an alternative to LDA that models word co-occurrence using a tree of latent variables; the states of the latent variables, which correspond to soft clusters of documents, are interpreted as topics.
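To make the role of the sparse Dirichlet priors in LDA more concrete, here is a minimal sketch of the generative story: draw topic-word and document-topic distributions from symmetric Dirichlet priors, then sample words. The vocabulary, the number of topics, and the values of `alpha` and `beta` are hypothetical, and this only generates data; it does not perform inference.

```python
import random

def sample_dirichlet(concentration, size, rng):
    """Draw one sample from a symmetric Dirichlet distribution."""
    # Normalizing independent Gamma(concentration, 1) draws gives a Dirichlet sample.
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(vocab, topic_word, alpha, length, rng):
    """Generate one document: pick a topic per word, then a word from that topic."""
    n_topics = len(topic_word)
    doc_topic = sample_dirichlet(alpha, n_topics, rng)
    words = []
    for _ in range(length):
        topic = rng.choices(range(n_topics), weights=doc_topic)[0]
        words.append(rng.choices(vocab, weights=topic_word[topic])[0])
    return doc_topic, words

rng = random.Random(0)
vocab = ["gene", "cell", "dna", "goal", "team", "match"]  # hypothetical vocabulary
n_topics = 3
alpha, beta = 0.1, 0.1  # small values favour sparse distributions

# One word distribution per topic, then one generated document.
topic_word = [sample_dirichlet(beta, len(vocab), rng) for _ in range(n_topics)]
doc_topic, words = generate_document(vocab, topic_word, alpha, 20, rng)

print([round(p, 3) for p in doc_topic])
print(" ".join(words))
```

With concentration values well below 1, most of the probability mass tends to land on a few topics per document and a few words per topic, which is the intuition the excerpt attributes to the priors.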
== Topic models for context information == Approaches for temporal information include Block and Newman's determination of the temporal dynamics of topics in the Pennsylvania Gazette during 1728–1800. Griffiths & Steyvers used topic modeling on abstracts from the journal PNAS to identify topics that rose or fell in popularity from 1991 to 2001, whereas Lamba & Madhusudhan used topic modeling on full-text research articles retrieved from the journal DJLIT from 1981 to 2018. In the field of library and information science, Lamba & Madhusudhan applied topic modeling to different Indian resources such as journal articles and electronic theses and dissertations (ETDs). Nelson has been analyzing change in topics over time in the Richmond Times-Dispatch to understand social and political changes and continuities in Richmond during the American Civil War. Yang, Torget and Mihalcea applied topic modeling methods to newspapers from 1829 to 2008. Mimno used topic modelling with 24 journals on classical philology and archaeology spanning 150 years to look at how topics in the journals change over time and how the journals become more different or similar over time. Yin et al. introduced a topic model for geographically distributed documents, where document positions are explained by latent regions which are detected during inference. Chang and Blei included network information between linked documents in the relational topic model, to model the links between websites. The author-topic model by Rosen-Zvi et al. models the topics associated with authors of documents to improve the topic detection for documents with authorship information. HLTA was applied to a collection of recent research papers published at major AI and Machine Learning venues. The resulting model is called The AI Tree.
The resulting topics are used to index the papers at aipano.cse.ust.hk to help researchers track research trends and identify papers to read, and help conference organizers and journal editors identify reviewers for submissions. To improve the qualitative aspects and coherency of generated topics, some researchers have explored the efficacy of "coherence scores", that is, how well computer-extracted clusters (i.e. topics) align with a human benchmark. Coherence scores are metrics for optimising the number of topics to extract from a document corpus. == Algorithms == In practice, researchers attempt to fit appropriate model parameters to the data corpus using one of several heuristics for maximum likelihood fit. A survey by D. Blei describes this suite of algorithms. Several groups of researchers starting with Papadimitriou et al. have attempted to design algorithms with provable guarantees. Assuming that the data were actually generated by the model in question, they try to design algorithms that provably find the model that was used to create the data. Techniques used here include singular value decomposition (SVD) and the method of moments. In 2012 an algorithm based upon non-negative matrix factorization (NMF) was introduced that also generalizes to topic models with correlations among topics. Since 2017, neural networks have been leveraged in topic modeling to improve the speed of inference, leading to further advancements such as vONTSS, which allows humans to incorporate domain knowledge via weakly supervised learning. In 2018, a new approach to topic models was proposed based on the stochastic block model. Topic modeling has leveraged LLMs through contextual embedding and fine-tuning. == Applications of topic models == === To quantitative biomedicine === Topic models are also being used in other contexts. For example, uses of topic models in biology and bioinformatics research have emerged.
Recently, topic models have been used to extract information from datasets of cancer genomic samples. In this case, topics are biological latent variables to be inferred. === To analysis of music and creativity === Topic models can be used for analysis of continuous signals like music. For instance, they were used to quantify how musical styles change in time, and identify the influence of specific artists on later music creation.
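The matrix factorization view of topic modeling mentioned under Algorithms can be sketched with plain non-negative matrix factorization. Note that this is the classic multiplicative-update heuristic, not the 2012 algorithm with provable guarantees referred to in the excerpt; the tiny document-term matrix and all names below are assumptions for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, steps=200, seed=0, eps=1e-9):
    """Factor V (docs x words) into W (docs x topics) and H (topics x words)."""
    rng = random.Random(seed)
    n_docs, n_words = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_words)] for _ in range(rank)]
    errors = [squared_error(v, w, h)]
    for _ in range(steps):
        # Update H elementwise: H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_words)]
             for k in range(rank)]
        # Update W elementwise: W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_docs)]
        errors.append(squared_error(v, w, h))
    return w, h, errors

# Hypothetical word counts: rows are documents, columns are vocabulary words.
v = [[3, 2, 4, 0, 0, 0],
     [2, 3, 3, 0, 1, 0],
     [0, 0, 0, 4, 3, 2],
     [0, 1, 0, 3, 2, 4]]

w, h, errors = nmf(v, rank=2)
print("error before:", round(errors[0], 3), "after:", round(errors[-1], 3))
for topic in h:
    print([round(x, 2) for x in topic])
```

Each row of `h` can be read as a topic (a non-negative weight per word) and each row of `w` as the topic mix of one document.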

    Read more →
  • Best AI Video Generators in 2026

    Best AI Video Generators in 2026

    Curious about the best AI video generator? An AI video generator is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI video generator slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Supervised learning

    Supervised learning

    In machine learning, supervised learning (SL) is a type of machine learning paradigm where an algorithm learns to map input data to a specific output based on example input-output pairs. This process involves training a statistical model using labeled data, meaning each piece of input data is provided with the correct output. The term "supervised" refers to the role of a teacher or supervisor who provides this training data, guiding the algorithm towards correct predictions. For instance, if you want a model to identify cats in images, supervised learning would involve feeding it many images of cats (inputs) that are explicitly labeled "cat" (outputs). The goal of supervised learning is for the trained model to accurately predict the output for new, unseen data. This requires the algorithm to effectively generalize from the training examples, a quality measured by its generalization error. Supervised learning is commonly used for tasks like classification (predicting a category, e.g., spam or not spam) and regression (predicting a continuous value, e.g., house prices). == Steps to follow == To solve a given problem of supervised learning, the following steps must be performed: Determine the type of training samples. Before doing anything else, the user should decide what kind of data is to be used as a training set. In the case of handwriting analysis, for example, this might be a single handwritten character, an entire handwritten word, an entire sentence of handwriting, or a full paragraph of handwriting. Gather a training set. The training set needs to be representative of the real-world use of the function. Thus, a set of input objects is gathered together with corresponding outputs, either from human experts or from measurements. Determine the input feature representation of the learned function. The accuracy of the learned function depends strongly on how the input object is represented. 
Typically, the input object is transformed into a feature vector, which contains a number of features that are descriptive of the object. The number of features should not be too large, because of the curse of dimensionality; but should contain enough information to accurately predict the output. Determine the structure of the learned function and corresponding learning algorithm. For example, one may choose to use support-vector machines or decision trees. Complete the design. Run the learning algorithm on the gathered training set. Some supervised learning algorithms require the user to determine certain control parameters. These parameters may be adjusted by optimizing performance on a subset (called a validation set) of the training set, or via cross-validation. Evaluate the accuracy of the learned function. After parameter adjustment and learning, the performance of the resulting function should be measured on a test set that is separate from the training set. == Algorithm choice == A wide range of supervised learning algorithms are available, each with its strengths and weaknesses. There is no single learning algorithm that works best on all supervised learning problems (see the No free lunch theorem). There are four major issues to consider in supervised learning: === Bias–variance tradeoff === A first issue is the tradeoff between bias and variance. Imagine that we have available several different, but equally good, training data sets. A learning algorithm is biased for a particular input x if, when trained on each of these data sets, it is systematically incorrect when predicting the correct output for x. A learning algorithm has high variance for a particular input x if it predicts different output values when trained on different training sets. The prediction error of a learned classifier is related to the sum of the bias and the variance of the learning algorithm.
Generally, there is a tradeoff between bias and variance. A learning algorithm with low bias must be "flexible" so that it can fit the data well. But if the learning algorithm is too flexible, it will fit each training data set differently, and hence have high variance. A key aspect of many supervised learning methods is that they are able to adjust this tradeoff between bias and variance (either automatically or by providing a bias/variance parameter that the user can adjust). === Function complexity and amount of training data === The second issue is the amount of training data available relative to the complexity of the "true" function (classifier or regression function). If the true function is simple, then an "inflexible" learning algorithm with high bias and low variance will be able to learn it from a small amount of data. But if the true function is highly complex (e.g., because it involves complex interactions among many different input features and behaves differently in different parts of the input space), then the function can only be learned from a large amount of training data paired with a "flexible" learning algorithm with low bias and high variance. === Dimensionality of the input space === A third issue is the dimensionality of the input space. If the input feature vectors have large dimensions, learning the function can be difficult even if the true function only depends on a small number of those features. This is because the many "extra" dimensions can confuse the learning algorithm and cause it to have high variance. Hence, input data of large dimensions typically requires tuning the classifier to have low variance and high bias. In practice, if the engineer can manually remove irrelevant features from the input data, it will likely improve the accuracy of the learned function. In addition, there are many algorithms for feature selection that seek to identify the relevant features and discard the irrelevant ones.
This is an instance of the more general strategy of dimensionality reduction, which seeks to map the input data into a lower-dimensional space prior to running the supervised learning algorithm. === Noise in the output values === A fourth issue is the degree of noise in the desired output values (the supervisory target variables). If the desired output values are often incorrect (because of human error or sensor errors), then the learning algorithm should not attempt to find a function that exactly matches the training examples. Attempting to fit the data too carefully leads to overfitting. You can overfit even when there are no measurement errors (stochastic noise) if the function you are trying to learn is too complex for your learning model. In such a situation, the part of the target function that cannot be modeled "corrupts" your training data – this phenomenon has been called deterministic noise. When either type of noise is present, it is better to go with a higher bias, lower variance estimator. In practice, there are several approaches to alleviate noise in the output values such as early stopping to prevent overfitting as well as detecting and removing the noisy training examples prior to training the supervised learning algorithm. There are several algorithms that identify noisy training examples and removing the suspected noisy training examples prior to training has decreased generalization error with statistical significance. === Other factors to consider === Other factors to consider when choosing and applying a learning algorithm include the following: Heterogeneity of the data. If the feature vectors include features of many different kinds (discrete, discrete ordered, counts, continuous values), some algorithms are easier to apply than others. 
Many algorithms, including support-vector machines, linear regression, logistic regression, neural networks, and nearest neighbor methods, require that the input features be numerical and scaled to similar ranges (e.g., to the [-1,1] interval). Methods that employ a distance function, such as nearest neighbor methods and support-vector machines with Gaussian kernels, are particularly sensitive to this. An advantage of decision trees is that they easily handle heterogeneous data. Redundancy in the data. If the input features contain redundant information (e.g., highly correlated features), some learning algorithms (e.g., linear regression, logistic regression, and distance-based methods) will perform poorly because of numerical instabilities. These problems can often be solved by imposing some form of regularization. Presence of interactions and non-linearities. If each of the features makes an independent contribution to the output, then algorithms based on linear functions (e.g., linear regression, logistic regression, support-vector machines, naive Bayes) and distance functions (e.g., nearest neighbor methods, support-vector machines with Gaussian kernels) generally perform well. However, if there are complex interactions among features, then algorithms such as decision trees and neural networks work better, becaus
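The workflow in the excerpt (gather data, scale features, tune a control parameter on a validation set, then measure on a separate test set) can be sketched with a small nearest-neighbour classifier. The synthetic data, the split sizes, and the candidate values of `k` are all assumptions made for this example; it is a minimal sketch rather than a recommended implementation.

```python
import random

def make_data(n, rng):
    """Synthetic two-class data whose two features live on very different scales."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        features = [rng.gauss(2.0 * label, 0.7), rng.gauss(500.0 * label, 400.0)]
        data.append((features, label))
    return data

def fit_scaler(rows):
    """Record the per-feature minimum and maximum seen in the training set."""
    columns = list(zip(*(features for features, _ in rows)))
    return [(min(col), max(col)) for col in columns]

def scale(features, bounds):
    """Map each feature so the training range becomes [-1, 1]."""
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x, (lo, hi) in zip(features, bounds)]

def knn_predict(train, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query))
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def accuracy(train, rows, k):
    hits = sum(knn_predict(train, features, k) == label for features, label in rows)
    return hits / len(rows)

rng = random.Random(42)
data = make_data(300, rng)
train_raw, valid_raw, test_raw = data[:180], data[180:240], data[240:]

# The scaler is fitted on training data only, then reused for the other splits,
# so validation and test values may fall slightly outside [-1, 1].
bounds = fit_scaler(train_raw)
train = [(scale(f, bounds), y) for f, y in train_raw]
valid = [(scale(f, bounds), y) for f, y in valid_raw]
test = [(scale(f, bounds), y) for f, y in test_raw]

# Choose the control parameter k on the validation set, then report test accuracy.
candidate_ks = [1, 3, 5, 7, 9]
best_k = max(candidate_ks, key=lambda k: accuracy(train, valid, k))
test_accuracy = accuracy(train, test, best_k)

print("best k:", best_k)
print("test accuracy:", round(test_accuracy, 3))
```

Without the scaling step, the second feature would dominate the distance computation, which is the sensitivity of distance-based methods that the excerpt points out.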

    Read more →
  • The Visualization Handbook

    The Visualization Handbook

    The Visualization Handbook is a textbook by Charles D. Hansen and Christopher R. Johnson that serves as a survey of the field of scientific visualization by presenting the basic concepts and algorithms in addition to a current review of visualization research topics and tools. It is commonly used as a textbook for scientific visualization graduate courses. It is also commonly cited as a reference for scientific visualization and computer graphics in published papers, with almost 500 citations documented on Google Scholar.

    == Table of Contents ==

    PART I - Introduction
      • Overview of Visualization - William J. Schroeder and Kenneth M. Martin

    PART II - Scalar Field Visualization: Isosurfaces
      • Accelerated Isosurface Extraction Approaches - Yarden Livnat
      • Time-Dependent Isosurface Extraction - Han-Wei Shen
      • Optimal Isosurface Extraction - Paolo Cignoni, Claudio Montani, Robert Scopigno, and Enrico Puppo
      • Isosurface Extraction Using Extrema Graphs - Takayuki Itoh and Koji Koyamada
      • Isosurfaces and Level-Sets - Ross Whitaker

    PART III - Scalar Field Visualization: Volume Rendering
      • Overview of Volume Rendering - Arie E. Kaufman and Klaus Mueller
      • Volume Rendering Using Splatting - Roger Crawfis, Daqing Xue, and Caixia Zhang
      • Multidimensional Transfer Functions for Volume Rendering - Joe Kniss, Gordon Kindlmann, and Charles D. Hansen
      • Pre-Integrated Volume Rendering - Martin Kraus and Thomas Ertl
      • Hardware-Accelerated Volume Rendering - Hanspeter Pfister

    PART IV - Vector Field Visualization
      • Overview of Flow Visualization - Daniel Weiskopf and Gordon Erlebacher
      • Flow Textures: High-Resolution Flow Visualization - Gordon Erlebacher, Bruno Jobard, and Daniel Weiskopf
      • Detection and Visualization of Vortices - Ming Jiang, Raghu Machiraju, and David Thompson

    PART V - Tensor Field Visualization
      • Oriented Tensor Reconstruction - Leonid Zhukov and Alan H. Barr
      • Diffusion Tensor MRI Visualization - Song Zhang, David Laidlaw, and Gordon Kindlmann
      • Topological Methods for Flow Visualization - Gerik Scheuermann and Xavier Tricoche

    PART VI - Geometric Modeling for Visualization
      • 3D Mesh Compression - Jarek Rossignac
      • Variational Modeling Methods for Visualization - Hans Hagen and Ingrid Hotz
      • Model Simplification - Jonathan D. Cohen and Dinesh Manocha

    PART VII - Virtual Environments for Visualization
      • Direct Manipulation in Virtual Reality - Steve Bryson
      • The Visual Haptic Workbench - Milan Ikits and J. Dean Brederson
      • Virtual Geographic Information Systems - William Ribarsky
      • Visualization Using Virtual Reality - R. Bowen Loftin, Jim X. Chen, and Larry Rosenblum

    PART VIII - Large-Scale Data Visualization
      • Desktop Delivery: Access to Large Datasets - Philip D. Heermann and Constantine Pavlakos
      • Techniques for Visualizing Time-Varying Volume Data - Kwan-Liu Ma and Eric B. Lum
      • Large-Scale Data Visualization and Rendering: A Problem-Driven Approach - Patrick McCormick and James Ahrens
      • Issues and Architectures in Large-Scale Data Visualization - Constantine Pavlakos and Philip D. Heermann
      • Consuming Network Bandwidth with Visapult - Wes Bethel and John Shalf

    PART IX - Visualization Software and Frameworks
      • The Visualization Toolkit - William J. Schroeder and Kenneth M. Martin
      • Visualization in the SCIRun Problem-Solving Environment - David M. Weinstein, Steven Parker, Jenny Simpson, Kurt Zimmerman, and Greg M. Jones
      • Numerical Algorithms Group IRIS Explorer - Jeremy Walton
      • AVS and AVS/Express - Jean M. Favre and Mario Valle
      • Vis5D, Cave5D, and VisAD - Bill Hibbard
      • Visualization with AVS - W. T. Hewitt, Nigel W. John, Matthew D. Cooper, K. Yien Kwok, George W. Leaver, Joanna M. Leng, Paul G. Lever, Mary J. McDerby, James S. Perrin, Mark Riding, I. Ari Sadarjoen, Tobias M. Schiebeck, and Colin C. Venters
      • ParaView: An End-User Tool for Large-Data Visualization - James Ahrens, Berk Geveci, and Charles Law
      • The Insight Toolkit: An Open-Source Initiative in Data Segmentation and Registration - Terry S. Yoo
      • amira: A Highly Interactive System for Visual Data Analysis - Detlev Stalling, Malte Westerhoff, and Hans-Christian Hege

    PART X - Perceptual Issues in Visualization
      • Extending Visualization to Perceptualization: The Importance of Perception in Effective Communication of Information - David S. Ebert
      • Art and Science in Visualization - Victoria Interrante
      • Exploiting Human Visual Perception in Visualization - Alan Chalmers and Kirsten Cater

    PART XI - Selected Topics and Applications
      • Scalable Network Visualization - Stephen G. Eick
      • Visual Data-Mining Techniques - Daniel A. Keim, Mike Sips, and Mihael Ankerst
      • Visualization in Weather and Climate Research - Don Middleton, Tim Scheitlin, and Bob Wilhelmson
      • Painting and Visualization - Robert M. Kirby, Daniel F. Keefe, and David Laidlaw
      • Visualization and Natural Control Systems for Microscopy - Russell M. Taylor II, David Borland, Frederick P. Brooks, Jr., Mike Falvo, Kevin Jeffay, Gail Jones, David Marshburn, Stergios J. Papadakis, Lu-Chang Qin, Adam Seeger, F. Donelson Smith, Dianne Sonnenwald, Richard Superfine, Sean Washburn, Chris Weigle, Mary Whitton, Leandra Vicci, Martin Guthold, Tom Hudson, Philip Williams, and Warren Robinett
      • Visualization for Computational Accelerator Physics - Kwan-Liu Ma, Greg Schussman, and Brett Wilson

    Read more →
  • Ofer Dekel (researcher)

    Ofer Dekel (researcher)

    Ofer Dekel (Hebrew: עופר דקל) is a computer science researcher in the Machine Learning Department of Microsoft Research. He obtained his PhD in computer science from the Hebrew University of Jerusalem and is an affiliate faculty member in the Computer Science & Engineering department at the University of Washington. == Areas of research == Dekel's research topics include machine learning, online prediction, statistical learning theory, and stochastic optimization. He is currently engaged in the application of machine learning techniques in the development of the Bing search engine.

    Read more →
  • AI Text-to-video Tools: Free vs Paid (2026)

    AI Text-to-video Tools: Free vs Paid (2026)

    Curious about the best AI text-to-video tool? An AI text-to-video tool is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI text-to-video tool slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Restricted Boltzmann machine

    Restricted Boltzmann machine

    A restricted Boltzmann machine (RBM) (also called a restricted Sherrington–Kirkpatrick model with external field or restricted stochastic Ising–Lenz–Little model) is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs. RBMs were initially proposed under the name Harmonium by Paul Smolensky in 1986, and rose to prominence after Geoffrey Hinton and collaborators used fast learning algorithms for them in the mid-2000s. RBMs have found applications in dimensionality reduction, classification, collaborative filtering, feature learning, topic modelling, immunology, and even many‑body quantum mechanics. They can be trained in either supervised or unsupervised ways, depending on the task. As their name implies, RBMs are a variant of Boltzmann machines, with the restriction that their neurons must form a bipartite graph: a pair of nodes from each of the two groups of units (commonly referred to as the "visible" and "hidden" units respectively) may have a symmetric connection between them; and there are no connections between nodes within a group. By contrast, "unrestricted" Boltzmann machines may have connections between hidden units. This restriction allows for more efficient training algorithms than are available for the general class of Boltzmann machines, in particular the gradient-based contrastive divergence algorithm. Restricted Boltzmann machines can also be used in deep learning networks. In particular, deep belief networks can be formed by "stacking" RBMs and optionally fine-tuning the resulting deep network with gradient descent and backpropagation. == Structure == The standard type of RBM has binary-valued (Boolean) hidden and visible units, and consists of a matrix of weights $W$ of size $m \times n$.
Each weight element $w_{i,j}$ of the matrix is associated with the connection between the visible (input) unit $v_i$ and the hidden unit $h_j$. In addition, there are bias weights (offsets) $a_i$ for $v_i$ and $b_j$ for $h_j$. Given the weights and biases, the energy of a configuration (pair of Boolean vectors) $(v,h)$ is defined as $E(v,h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_i \sum_j v_i w_{i,j} h_j$ or, in matrix notation, $E(v,h) = -a^{\mathrm{T}} v - b^{\mathrm{T}} h - v^{\mathrm{T}} W h$. This energy function is analogous to that of a Hopfield network. As with general Boltzmann machines, the joint probability distribution for the visible and hidden vectors is defined in terms of the energy function as $P(v,h) = \frac{1}{Z} e^{-E(v,h)}$, where $Z$ is a partition function defined as the sum of $e^{-E(v,h)}$ over all possible configurations, which can be interpreted as a normalizing constant to ensure that the probabilities sum to 1. The marginal probability of a visible vector is the sum of $P(v,h)$ over all possible hidden layer configurations, $P(v) = \frac{1}{Z} \sum_{\{h\}} e^{-E(v,h)}$, and vice versa. Since the underlying graph structure of the RBM is bipartite (meaning there are no intra-layer connections), the hidden unit activations are mutually independent given the visible unit activations. Conversely, the visible unit activations are mutually independent given the hidden unit activations.
That is, for $m$ visible units and $n$ hidden units, the conditional probability of a configuration of the visible units $v$, given a configuration of the hidden units $h$, is $P(v \mid h) = \prod_{i=1}^{m} P(v_i \mid h)$. Conversely, the conditional probability of $h$ given $v$ is $P(h \mid v) = \prod_{j=1}^{n} P(h_j \mid v)$. The individual activation probabilities are given by $P(h_j = 1 \mid v) = \sigma\left(b_j + \sum_{i=1}^{m} w_{i,j} v_i\right)$ and $P(v_i = 1 \mid h) = \sigma\left(a_i + \sum_{j=1}^{n} w_{i,j} h_j\right)$, where $\sigma$ denotes the logistic sigmoid. The visible units of a restricted Boltzmann machine can be multinomial, although the hidden units are Bernoulli. In this case, the logistic function for visible units is replaced by the softmax function $P(v_i^k = 1 \mid h) = \frac{\exp\left(a_i^k + \sum_j W_{ij}^k h_j\right)}{\sum_{k'=1}^{K} \exp\left(a_i^{k'} + \sum_j W_{ij}^{k'} h_j\right)}$, where $K$ is the number of discrete values that each visible unit can take. They are applied in topic modeling and recommender systems. === Relation to other models === Restricted Boltzmann machines are a special case of Boltzmann machines and Markov random fields. The graphical model of RBMs corresponds to that of factor analysis.
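
The energy function and the conditional activation probabilities from the Structure section can be illustrated with a minimal sketch in plain Python. The function names and the tiny 3-by-2 weight matrix below are hypothetical and chosen only for illustration; the formulas themselves follow the definitions given above for binary units.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written sigma in the text."""
    return 1.0 / (1.0 + math.exp(-x))


def energy(v, h, W, a, b):
    """E(v, h) = -sum_i a_i v_i - sum_j b_j h_j - sum_i sum_j v_i w_ij h_j."""
    visible_term = sum(a_i * v_i for a_i, v_i in zip(a, v))
    hidden_term = sum(b_j * h_j for b_j, h_j in zip(b, h))
    interaction = sum(
        v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h))
    )
    return -visible_term - hidden_term - interaction


def p_hidden_given_visible(v, W, b):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [
        sigmoid(b[j] + sum(W[i][j] * v[i] for i in range(len(v))))
        for j in range(len(b))
    ]


def p_visible_given_hidden(h, W, a):
    """P(v_i = 1 | h) for every visible unit i."""
    return [
        sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
        for i in range(len(a))
    ]


# Hypothetical toy model: m = 3 visible units, n = 2 hidden units.
W = [[1.0, -1.0], [0.5, 0.0], [0.0, 2.0]]
a = [0.0, 0.0, 0.0]
b = [0.0, 0.0]

print(energy([1, 0, 1], [1, 1], W, a, b))
print(p_hidden_given_visible([1, 0, 1], W, b))
print(p_visible_given_hidden([1, 1], W, a))
```

Because each hidden unit's probability depends only on the visible vector, all hidden units can be evaluated (and sampled) independently in one pass, which is what makes the bipartite restriction convenient in practice.
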
== Training algorithm == Restricted Boltzmann machines are trained to maximize the product of probabilities assigned to some training set $V$ (a matrix, each row of which is treated as a visible vector $v$), $\arg\max_W \prod_{v \in V} P(v)$, or equivalently, to maximize the expected log probability of a training sample $v$ selected randomly from $V$: $\arg\max_W \mathbb{E}\left[\log P(v)\right]$. The algorithm most often used to train RBMs, that is, to optimize the weight matrix $W$, is the contrastive divergence (CD) algorithm due to Hinton, originally developed to train PoE (product of experts) models. The algorithm performs Gibbs sampling and is used inside a gradient descent procedure (similar to the way backpropagation is used inside such a procedure when training feedforward neural nets) to compute the weight update. The basic, single-step contrastive divergence (CD-1) procedure for a single sample can be summarized as follows: (1) Take a training sample $v$, compute the probabilities of the hidden units and sample a hidden activation vector $h$ from this probability distribution. (2) Compute the outer product of $v$ and $h$ and call this the positive gradient. (3) From $h$, sample a reconstruction $v'$ of the visible units, then resample the hidden activations $h'$ from this (Gibbs sampling step). (4) Compute the outer product of $v'$ and $h'$ and call this the negative gradient. (5) Let the update to the weight matrix $W$ be the positive gradient minus the negative gradient, times some learning rate: $\Delta W = \epsilon (v h^{\mathsf{T}} - v' h'^{\mathsf{T}})$. (6) Update the biases $a$ and $b$ analogously: $\Delta a = \epsilon (v - v')$, $\Delta b = \epsilon (h - h')$.
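
The CD-1 procedure summarized above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names, the toy sizes, the learning rate and the seeded random generator are all hypothetical, and the update uses the sampled binary vectors exactly as in the steps listed above (practical implementations often substitute probabilities for some of the samples).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_bernoulli(probs, rng):
    """Draw one binary value per probability."""
    return [1 if rng.random() < p else 0 for p in probs]


def hidden_probs(v, W, b):
    return [
        sigmoid(b[j] + sum(W[i][j] * v[i] for i in range(len(v))))
        for j in range(len(b))
    ]


def visible_probs(h, W, a):
    return [
        sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
        for i in range(len(a))
    ]


def cd1_update(v, W, a, b, lr, rng):
    """Apply one CD-1 update in place for a single training sample v."""
    # Sample h from P(h | v); v h^T is the positive gradient.
    h = sample_bernoulli(hidden_probs(v, W, b), rng)
    # Gibbs step: reconstruct v' from h, then resample h' from v'.
    v_recon = sample_bernoulli(visible_probs(h, W, a), rng)
    h_recon = sample_bernoulli(hidden_probs(v_recon, W, b), rng)
    # Delta W = lr * (v h^T - v' h'^T), applied element by element.
    for i in range(len(a)):
        for j in range(len(b)):
            W[i][j] += lr * (v[i] * h[j] - v_recon[i] * h_recon[j])
    # Delta a = lr * (v - v'), Delta b = lr * (h - h').
    for i in range(len(a)):
        a[i] += lr * (v[i] - v_recon[i])
    for j in range(len(b)):
        b[j] += lr * (h[j] - h_recon[j])
    return v_recon, h_recon


# Hypothetical toy run: 3 visible units, 2 hidden units, one repeated sample.
rng = random.Random(0)
W = [[0.0, 0.0] for _ in range(3)]
a = [0.0, 0.0, 0.0]
b = [0.0, 0.0]
for _ in range(100):
    cd1_update([1, 0, 1], W, a, b, lr=0.1, rng=rng)
print(W, a, b)
```

Passing the random generator explicitly keeps the sketch reproducible; a real training loop would iterate over many samples, usually in mini-batches.
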
A Practical Guide to Training RBMs written by Hinton can be found on his homepage. == Stacked Restricted Boltzmann Machine == The difference between the Stacked Restricted Boltzmann Machines and RBM is that in an RBM, lateral connections within a layer are prohibited to make analysis tractable. On the other hand, the Stacked Boltzmann consists of a combination of an unsupervised three-layer network with symmetric weights and a supervised fine-tuned top layer for recognizing three classes. The Stacked Boltzmann is used for natural language understanding, document retrieval, image generation, and classification. These functions are trained with unsupervised pre-training and/or supervised fine-tuning. Unlike the undirected symmetric top layer, with a two-way unsymmetric layer for connection for RBM. The restricted Boltzmann's connection is three layers with asymmetric weights, and two networks are combined into one. Stacked Boltzmann does share similarities with RBM: the neuron for Stacked Boltzmann is a stochastic binary Hopfield neuron, which is the same as the Restricted Boltzmann Machine. The energy from both Restricted Boltzmann and RBM is given by the Gibbs probability measure: $E = -\frac{1}{2} \sum_{i,j} w_{ij} s_i s_j + \sum_i \theta_i s_i$. The training process of Restricted Boltzmann is similar to RBM. Restricted Boltzmann trains one layer at a time and approximates the equilibrium state with a 3-segment pass, not performing back propagation. Restricted Boltzmann uses both supervised and unsupervised training on different RBMs for pre-training for classification and recognition. The training uses contrastive divergence with

    Read more →
  • Computational photography

    Computational photography

    Computational photography refers to digital image capture and processing techniques that use digital computation instead of optical processes. Computational photography can improve the capabilities of a camera, or introduce features that were not possible at all with film-based photography, or reduce the cost or size of camera elements. Examples of computational photography include in-camera computation of digital panoramas, high-dynamic-range images, and light field cameras. Light field cameras use novel optical elements to capture three-dimensional scene information, which can then be used to produce 3D images, enhanced depth-of-field, and selective de-focusing (or "post focus"). Enhanced depth-of-field reduces the need for mechanical focusing systems. All of these features use computational imaging techniques. The definition of computational photography has evolved to cover a number of subject areas in computer graphics, computer vision, and applied optics. These areas are given below, organized according to a taxonomy proposed by Shree K. Nayar. Within each area is a list of techniques, and for each technique, one or two representative papers or books are cited. Deliberately omitted from the taxonomy are image processing (see also digital image processing) techniques applied to traditionally captured images to produce better images. Examples of such techniques are image scaling, dynamic range compression (i.e. tone mapping), color management, image completion (a.k.a. inpainting or hole filling), image compression, digital watermarking, and artistic image effects. Also omitted are techniques that produce range data, volume data, 3D models, 4D light fields, 4D, 6D, or 8D BRDFs, or other high-dimensional image-based representations. Epsilon photography is a sub-field of computational photography. 
== Effect on photography == Photos taken using computational photography can allow amateurs to produce photographs rivalling the quality of professional photographers, but as of 2019 do not outperform the use of professional-level equipment. == Computational illumination == This is controlling photographic illumination in a structured fashion, then processing the captured images, to create new images. The applications include image-based relighting, image enhancement, image deblurring, geometry/material recovery and so forth. High-dynamic-range imaging uses differently exposed pictures of the same scene to extend dynamic range. Other examples include processing and merging differently illuminated images of the same subject matter ("lightspace"). == Computational optics == This is the capture of optically coded images, followed by computational decoding to produce new images. Coded aperture imaging was mainly applied in astronomy and X-ray imaging to boost the image quality. Instead of a single pinhole, a pinhole pattern is applied in imaging, and deconvolution is performed to recover the image. In coded exposure imaging, the on/off state of the shutter is coded to modify the kernel of motion blur. In this way, motion deblurring becomes a well-conditioned problem. Similarly, in a lens-based coded aperture, the aperture can be modified by inserting a broadband mask. Thus, out-of-focus deblurring becomes a well-conditioned problem. The coded aperture can also improve the quality in light field acquisition using Hadamard transform optics. Coded aperture patterns can also be designed using color filters, in order to apply different codes at different wavelengths. This increases the amount of light that reaches the camera sensor, compared to binary masks.
== Computational imaging == Computational imaging is a set of imaging techniques that combine data acquisition and data processing to create the image of an object through indirect means to yield enhanced resolution, additional information such as optical phase or 3D reconstruction. The information is often recorded without using a conventional optical microscope configuration or with limited datasets. Computational imaging allows going beyond physical limitations of optical systems, such as numerical aperture, and can even eliminate the need for optical elements. For parts of the optical spectrum where imaging elements such as objectives are difficult to manufacture or image sensors cannot be miniaturized, computational imaging provides useful alternatives, in fields such as X-ray and THz radiation. === Common techniques === Among common computational imaging techniques are lensless imaging, computational speckle imaging, ptychography and Fourier ptychography. Computational imaging techniques often draw on compressive sensing or phase retrieval techniques, where the angular spectrum of the object is reconstructed. Other techniques are related to the field of computational imaging, such as digital holography, computer vision and inverse problems such as tomography. == Computational processing == This is the processing of non-optically-coded images to produce new images. == Computational sensors == These are detectors that combine sensing and processing, typically in hardware, like the oversampled binary image sensor. == Early work in computer vision == Although computational photography is a currently popular buzzword in computer graphics, many of its techniques first appeared in the computer vision literature, either under other names or within papers aimed at 3D shape analysis. == Art history == Computational photography, as an art form, has been practiced by capturing differently exposed pictures of the same subject matter and combining them.
This was the inspiration for the development of the wearable computer in the 1970s and early 1980s. Computational photography was inspired by the work of Charles Wyckoff, and thus computational photography datasets (e.g. differently exposed pictures of the same subject matter that are taken in order to make a single composite image) are sometimes referred to as Wyckoff Sets, in his honor. Early work in this area (joint estimation of image projection and exposure value) was undertaken by Mann and Candoccia. Charles Wyckoff devoted much of his life to creating special kinds of 3-layer photographic films that captured different exposures of the same subject matter. A picture of a nuclear explosion, taken on Wyckoff's film, appeared on the cover of Life Magazine and showed the dynamic range from the dark outer areas to the inner core.

    Read more →
  • Forrest N. Iandola

    Forrest N. Iandola

    Forrest N. Iandola is an American computer scientist specializing in efficient AI. == Career == Iandola earned a PhD in Electrical Engineering and Computer Science from UC Berkeley in 2016, advised by Kurt Keutzer. As part of his dissertation, he co-authored SqueezeNet, a deep neural network for image classification optimized for smartphones and other mobile devices. Iandola and Keutzer went on to co-found DeepScale. The firm squeezes deep neural networks onto low-cost automotive-grade processors for use in driver assistance systems. Tesla acquired DeepScale in 2019. In 2020, he co-authored SqueezeBERT, an efficient neural network for natural language processing. In 2022, he joined Meta as an AI research scientist. His research at Meta includes developing efficient AI models, such as EfficientSAM and MobileLLM.

    Read more →
  • Top 10 AI Writing Assistants Compared (2026)

    Top 10 AI Writing Assistants Compared (2026)

    Trying to pick the best AI writing assistant? An AI writing assistant is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI writing assistant slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →