AI For Business Guide

AI For Business Guide — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Crucible (software)

    Crucible (software)

    Crucible is a collaborative code review application by Australian software company Atlassian. Like other Atlassian products, Crucible is a Web-based application primarily aimed at enterprise, and certain features that enable peer review of a codebase may be considered enterprise social software. Crucible is particularly tailored to remote workers, and facilitates asynchronous review and commenting on code. Crucible also integrates with popular source control tools, such as Git and Subversion. Crucible is not open source, but customers are allowed to view and modify the code for their own use.

    Read more →
  • Wolfgang Ketter

    Wolfgang Ketter

    Wolfgang Ketter (born Traben-Trarbach, Germany, 1972) is Chaired Professor of Information Systems for a Sustainable Society at the University of Cologne. and a prominent scientist in the application of artificial intelligence, machine learning and intelligent agents in the design of smart markets, including demand response mechanisms and in particular automated auctions. He is a co-founder of the open energy system platform Power TAC, an automated retail electricity trading platform that simulates the performance of retail markets in an increasingly prosumer- and renewable-energy-influenced electricity landscape. == Career == === Advisory roles === Ketter is an advisor on the energy transition to the German government, in particular, the energy-intensive German state of North Rhine-Westphalia. He is also a fellow of the World Economic Forum and member of the WEF Global Council on Future Mobility and the Global New Mobility Coalition, contributing on the use of AI and machine learning to address issues arising from growth in electrification of energy such as the use of batteries as virtual power plants, the management of electric vehicle charging to prevent grid congestion, or the potential for peer-to-peer electricity trading. Ketter has also been an advisor for over a decade to the Port of Rotterdam on the design of energy cooperatives and energy trading platforms as well as one of the largest auction companies in the world, Royal FloraHolland, where his initial research led to a redesign of auction mechanisms and decision support systems. The cumulative research project team received the Association for Information Systems Impact Award in 2020 === Research === Ketter’s research is multidisciplinary, addressing the overlap of AI and ML in the economics of retail energy and mobility markets. The industry and policy applications of his research interconnect in large-scale projects such as the EU Smart city development project Ruggedised, for which the Erasmus University-based team's publication on the optimization of the City of Rotterdam's electric transit bus network was recognized with the Institute for Operations Research and the Management Sciences Daniel H. Wagner runner-up award. His research focuses on the use of competitive benchmarking and intelligent agents in virtual world simulations of retail energy markets as part of a smart grid. A small-scale version of the Power TAC project led to a publication on demand side management, 'A simulation of household behavior under variable prices' that has several hundred citations in publications representing a variety of scientific disciplines. Two of his publications in the Management Information Systems Quarterly journal and one in Energy Economics form the foundation for the current Power TAC platform. In 2016 and 2019 he was Chair of the Workshop on Information Technologies and Systems. Ketter is Coordinator of the Key Research Initiative Sustainable Smart Energy & Mobility at the University of Cologne, where he is a chaired Professor of Information Systems for a Sustainable Society. At the Rotterdam School of Management, Erasmus University, he is Professor of Next Generation Information Systems as well as Director of the Erasmus Centre for Future Energy Business and Academic Director of Smart Cities and Smart Energy at the Erasmus Centre of Data Analytics. He has been a visiting professor at the Haas School of Business and Berkeley Institute of Data Science, University of California at Berkeley in 2016 to 2017.

    Read more →
  • The Best Free AI Image Generator for Beginners

    The Best Free AI Image Generator for Beginners

    In search of the best AI image generator? An AI image generator is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI image generator slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Internettolken

    Internettolken

    Internettolken (or InternetPreter) is a web-based machine translating tool. As the first Swedish online translating service, it was started in 2002 and included the English and Swedish languages. Today, there are 14 languages with more than 120 possible combinations. The service is free up to 150 words per day, and as a 2,000-word free testing account. It is available both on its website, and as a gadget on iGoogle. The interface is either English or Swedish. Being a dictionary-based tool, with its own translation software, it can sometimes offer a more accurate translation than Google Translate and others, although the grammar will be incorrect. == Languages currently available ==

    Read more →
  • Color moments

    Color moments

    Color moments are measures that characterise color distribution in an image in the same way that central moments uniquely describe a probability distribution. Color moments are mainly used for color indexing purposes as features in image retrieval applications in order to compare how similar two images are based on color. Usually one image is compared to a database of digital images with pre-computed features in order to find and retrieve a similar Image. Each comparison between images results in a similarity score, and the lower this score is the more identical the two images are supposed to be. == Overview == Color moments are scaling and rotation invariant. It is usually the case that only the first three color moments are used as features in image retrieval applications as most of the color distribution information is contained in the low-order moments. Since color moments encode both shape and color information they are a good feature to use under changing lighting conditions, but they cannot handle occlusion very successfully. Color moments can be computed for any color model. Three color moments are computed per channel (e.g. 9 moments if the color model is RGB and 12 moments if the color model is CMYK). Computing color moments is done in the same way as computing moments of a probability distribution. === Mean === The first color moment can be interpreted as the average color in the image, and it can be calculated by using the following formula E i = ∑ j = 1 N 1 N p i j {\displaystyle E_{i}=\textstyle \sum _{j=1}^{N}{\frac {1}{N}}p_{ij}} where N is the number of pixels in the image and p i j {\displaystyle p_{ij}} is the value of the j-th pixel of the image at the i-th color channel. === Standard Deviation === The second color moment is the standard deviation, which is obtained by taking the square root of the variance of the color distribution. σ i = ( 1 N ∑ j = 1 N ( p i j − E i ) 2 ) {\displaystyle \sigma _{i}={\sqrt {({\frac {1}{N}}\textstyle \sum _{j=1}^{N}(p_{ij}-E_{i})^{2})}}} where E i {\displaystyle E_{i}} is the mean value, or first color moment, for the i-th color channel of the image. === Skewness === The third color moment is the skewness. It measures how asymmetric the color distribution is, and thus it gives information about the shape of the color distribution. Skewness can be computed with the following formula: s i = ( 1 N ∑ j = 1 N ( p i j − E i ) 3 ) 3 σ i {\displaystyle s_{i}={\frac {\sqrt[{3}]{\left({\frac {1}{N}}\textstyle \sum _{j=1}^{N}(p_{ij}-E_{i})^{3}\right)}}{\sigma _{i}}}} === Kurtosis === Kurtosis is the fourth color moment, and, similarly to skewness, it provides information about the shape of the color distribution. More specifically, kurtosis is a measure of how extreme the tails are in comparison to the normal distribution. === Higher-order color moments === Higher-order color moments are usually not part of the color moments feature set in image retrieval tasks as they require more data in order to obtain a good estimate of their value, and also the lower-order moments generally provide enough information. == Applications == Color moments have significant applications in image retrieval. They can be used in order to compare how similar two images are. This is a relatively new approach to color indexing. The greatest advantage of using color moments comes from the fact that there is no need to store the complete color distribution. This greatly speeds up image retrieval since there are less features to compare. In addition, the first three color moments have the same units, which allows for comparison between them. === Color indexing === Color indexing is the main application of color moments. Images can be indexed, and the index will contain the computed color moments. Then, if someone has a particular image and wants to find similar images in the database, the color moments of the image of interest will also be computed. After that the following function will be used in order to compute a similarity score between the image of interest and all the images in the database: d m o m ( H , I ) = ∑ i = 1 r w i 1 | E i 1 − E i 2 | + w i 2 | σ i 1 − σ i 2 | + w i 3 | s i 1 − s i 2 | {\displaystyle d_{mom}(H,I)=\textstyle \sum _{i=1}^{r}w_{i1}|E_{i}^{1}-E_{i}^{2}|+w_{i2}|\sigma _{i}^{1}-\sigma _{i}^{2}|+w_{i3}|s_{i}^{1}-s_{i}^{2}|} where: H and I are the color distributions of the two images that are being compared i is the channel index and r is the total number of channels E i 1 {\displaystyle E_{i}^{1}} and E i 2 {\displaystyle E_{i}^{2}} are the first order moments computed for the image distributions. σ i 1 {\displaystyle \sigma _{i}^{1}} and σ i 2 {\displaystyle \sigma _{i}^{2}} are the second order moments computed for the image distributions. s_i^1 and s_i^2 are the third order moments computed for the image distributions. w i 1 {\displaystyle w_{i1}} , w i 2 {\displaystyle w_{i2}} , and w i 3 {\displaystyle w_{i3}} are weights, specified by the user, for each of the three color moments used. Finally, the images in the database will be ranked according to the computed similarity score with the image of interest, and the database images with the lowest d m o m ( H , I ) {\displaystyle d_{mom}(H,I)} value should be retrieved. "A retrieval based on d m o m ( H , I ) {\displaystyle d_{mom}(H,I)} may produce false positives because the index contains no information about the correlation between the color channels". == Example == A simple and concise example of the use of color moments for image retrieval tasks is illustrated in. Consider having several test images in a database and a "New Image". The goal is to retrieve images from the database that are similar to the "New Image". The first three color moments are used as features. There are several steps in this computation. Image preprocessing (Optional) - The image preprocessing step of the computation process is optional. For example, in this step all images could be modified to be the same size (in terms of pixels). However, since color moments are invariant to scaling, it is not necessary to make all images the same width and height. Computing the features - Use the color moments formulae in order to compute the first three moments for each of the color channels in the image. For example, if the HSV color space is used, this means that for each of the images, 9 features in total will be computed (the first three order moments for the Hue, Saturation, and Value channels). Calculating the similarity score - After computing the color moments the weights for each of the moments in the d m o m ( H , I ) {\displaystyle d_{mom}(H,I)} function should be determined by the user. The weights have to be adjusted each time in accordance with the application or condition and quality of the images. Following that the d m o m ( H , I ) {\displaystyle d_{mom}(H,I)} function is used to calculate a similarity score for the "New Image" and each of the images in the database. Ranking and image retrieval - From the previous step the d m o m ( H , I ) {\displaystyle d_{mom}(H,I)} values were obtained. Now a comparison of these values can be made in order to decide which of the images in the database are more similar to the "New Image", and thus rank the database images accordingly. The smaller the d m o m ( H , I ) {\displaystyle d_{mom}(H,I)} value is the more similar the two color distributions are supposed to be. Finally, some of the top ranked images (the ones with the smallest d m o m ( H , I ) {\displaystyle d_{mom}(H,I)} value) from the database are retrieved.

    Read more →
  • NFA minimization

    NFA minimization

    In automata theory (a branch of theoretical computer science), NFA minimization is the task of transforming a given nondeterministic finite automaton (NFA) into an equivalent NFA that has a minimum number of states, transitions, or both. While efficient algorithms exist for DFA minimization, NFA minimization is PSPACE-complete. No efficient (polynomial time) algorithms are known, and under the standard assumption that P ≠ PSPACE, none exist. The most efficient known algorithm is the Kameda–Weiner algorithm. == Non-uniqueness of minimal NFA == Unlike deterministic finite automata, minimal NFAs may not be unique. There may be multiple NFAs with the same number of states that accept the same regular language, but for which there is no equivalent NFA or DFA with fewer states.

    Read more →
  • Marcus Hutter

    Marcus Hutter

    Marcus Hutter (born 14 April 1967 in Munich) is a German computer scientist, professor and artificial intelligence researcher. As a senior researcher at DeepMind, he studies the mathematical foundations of artificial general intelligence. Hutter studied physics and computer science at the Technical University of Munich. In 2000, he joined Jürgen Schmidhuber's group at the Dalle Molle Institute for Artificial Intelligence Research in Manno, Switzerland. He developed a mathematical formalism of artificial general intelligence named AIXI. He has served as a professor at the College of Engineering, Computing and Cybernetics of the Australian National University in Canberra, Australia. == Research == Starting in 2000, Hutter developed and published a mathematical theory of artificial general intelligence, AIXI, based on idealised intelligent agents and reward-motivated reinforcement learning. His first book Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability was published in 2005 by Springer. Also in 2005, Hutter published with his doctoral student Shane Legg an intelligence test for artificial intelligence devices. In 2009, Hutter developed and published the theory of feature reinforcement learning. In 2014, Lattimore and Hutter published an asymptotically optimal extension of the AIXI agent. An accessible podcast with Lex Fridman about his theory of Universal AI appeared in 2021 and a more technical follow-up with Tim Nguyen in 2024 in the Cartesian Cafe. His new (2024) book also gives a more accessible introduction to Universal AI and progress in the 20 years since his first book, including a chapter on ASI safety, which featured as a keynote at the inaugural workshop on AI safety in Sydney. == Hutter Prize == In 2006, Hutter announced the Hutter Prize for Lossless Compression of Human Knowledge, with a total of €50,000 in prize money. In 2020, Hutter raised the prize money for the Hutter Prize to €500,000.

    Read more →
  • Top 10 AI Chatbots Compared (2026)

    Top 10 AI Chatbots Compared (2026)

    Shopping for the best AI chatbot? An AI chatbot is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI chatbot slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Hekaton (database)

    Hekaton (database)

    Hekaton (also known as SQL Server In-Memory OLTP) is an in-memory database for OLTP workloads built into Microsoft SQL Server. Hekaton was designed in collaboration with Microsoft Research and was released in SQL Server 2014. Traditional RDBMS systems were designed when memory resources were expensive, and were optimized for disk storage. Hekaton is instead optimized for a working set stored entirely in main memory, but is still accessible via T-SQL like normal tables. It is fundamentally different from the "DBCC PINTABLE" feature in earlier SQL Server versions. Hekaton was announced at the Professional Association for SQL Server (PASS) conference 2012.

    Read more →
  • Unsupervised learning

    Unsupervised learning

    Unsupervised learning is a framework in machine learning where, in contrast to supervised learning, algorithms learn patterns exclusively from unlabeled data. Other frameworks in the spectrum of supervisions include weak- or semi-supervision, where a small portion of the data is tagged, and self-supervision. Some researchers consider self-supervised learning a form of unsupervised learning. Conceptually, unsupervised learning divides into the aspects of data, training, algorithm, and downstream applications. Typically, the dataset is harvested cheaply "in the wild", such as massive text corpus obtained by web crawling, with only minor filtering (such as Common Crawl). This compares favorably to supervised learning, where the dataset (such as the ImageNet1000) is typically constructed manually, which is much more expensive. There are algorithms designed specifically for unsupervised learning, such as clustering algorithms like k-means, dimensionality reduction techniques like principal component analysis (PCA), Boltzmann machine learning, and autoencoders. After the rise of deep learning, most large-scale unsupervised learning has been done by training general-purpose neural network architectures by gradient descent, adapted to performing unsupervised learning by designing an appropriate training procedure. Sometimes a trained model can be used as-is, but more often they are modified for downstream applications. For example, the generative pretraining method trains a model to generate a textual dataset, before finetuning it for other applications, such as text classification. As another example, autoencoders are trained to produce good features, which can then be used as a module for other models, such as in a latent diffusion model. == Tasks == Tasks are often categorized as discriminative (recognition) or generative (imagination). Often but not always, discriminative tasks use supervised methods and generative tasks use unsupervised (see Venn diagram); however, the separation is very hazy. For example, object recognition favors supervised learning but unsupervised learning can also cluster objects into groups. Furthermore, as progress marches onward, some tasks employ both methods, and some tasks swing from one to another. For example, image recognition started off as heavily supervised, but became hybrid by employing unsupervised pre-training, and then moved towards supervision again with the advent of dropout, ReLU, and adaptive learning rates. A typical generative task is as follows. At each step, a datapoint is sampled from the dataset, and part of the data is removed, and the model must infer the removed part. This is particularly clear for the denoising autoencoders and BERT. == Neural network architectures == === Training === During the learning phase, an unsupervised network tries to mimic the data it is given and uses the error in its mimicked output to correct itself (i.e. correct its weights and biases). Sometimes the error is expressed as a low probability that the erroneous output occurs, or it might be expressed as an unstable high energy state in the network. In contrast to supervised methods' dominant use of backpropagation, unsupervised learning also employs other methods including: Hopfield learning rule, Boltzmann learning rule, Contrastive Divergence, Wake Sleep, Variational Inference, Maximum Likelihood, Maximum A Posteriori, Gibbs Sampling, and backpropagating reconstruction errors or hidden state reparameterizations. See the table below for more details. === Energy === An energy function is a macroscopic measure of a network's activation state. In Boltzmann machines, it plays the role of the Cost function. This analogy with physics is inspired by Ludwig Boltzmann's analysis of a gas' macroscopic energy from the microscopic probabilities of particle motion p ∝ e − E / k T {\displaystyle p\propto e^{-E/kT}} , where k is the Boltzmann constant and T is temperature. In the RBM network the relation is p = e − E / Z {\displaystyle p=e^{-E}/Z} , where p {\displaystyle p} and E {\displaystyle E} vary over every possible activation pattern and Z = ∑ All Patterns e − E ( pattern ) {\displaystyle \textstyle {Z=\sum _{\scriptscriptstyle {\text{All Patterns}}}e^{-E({\text{pattern}})}}} . To be more precise, p ( a ) = e − E ( a ) / Z {\displaystyle p(a)=e^{-E(a)}/Z} , where a {\displaystyle a} is an activation pattern of all neurons (visible and hidden). Hence, some early neural networks bear the name Boltzmann Machine. Paul Smolensky calls − E {\displaystyle -E\,} the Harmony. A network seeks low energy which is high Harmony. === Networks === This table shows connection diagrams of various unsupervised networks, the details of which will be given in the section Comparison of Networks. Circles are neurons and edges between them are connection weights. As network design changes, features are added on to enable new capabilities or removed to make learning faster. For instance, neurons change between deterministic (Hopfield) and stochastic (Boltzmann) to allow robust output, weights are removed within a layer (RBM) to hasten learning, or connections are allowed to become asymmetric (Helmholtz). Of the networks bearing people's names, only Hopfield worked directly with neural networks. Boltzmann and Helmholtz came before artificial neural networks, but their work in physics and physiology inspired the analytical methods that were used. === History === === Specific Networks === Here, we highlight some characteristics of select networks. The details of each are given in the comparison table below. Hopfield Network Ferromagnetism inspired Hopfield networks. A neuron corresponds to an iron domain with binary magnetic moments Up and Down, and neural connections correspond to the domain's influence on each other. Symmetric connections enable a global energy formulation. During inference the network updates each state using the standard activation step function. Symmetric weights and the right energy functions guarantees convergence to a stable activation pattern. Asymmetric weights are difficult to analyze. Hopfield nets are used as Content Addressable Memories (CAM). Boltzmann Machine These are stochastic Hopfield nets. Their state value is sampled from this pdf as follows: suppose a binary neuron fires with the Bernoulli probability p(1) = 1/3 and rests with p(0) = 2/3. One samples from it by taking a uniformly distributed random number y, and plugging it into the inverted cumulative distribution function, which in this case is the step function thresholded at 2/3. The inverse function = { 0 if x <= 2/3, 1 if x > 2/3 }. Sigmoid Belief Net Introduced by Radford Neal in 1992, this network applies ideas from probabilistic graphical models to neural networks. A key difference is that nodes in graphical models have pre-assigned meanings, whereas Belief Net neurons' features are determined after training. The network is a sparsely connected directed acyclic graph composed of binary stochastic neurons. The learning rule comes from Maximum Likelihood on p(X): Δwij ∝ {\displaystyle \propto } sj (si - pi), where pi = 1 / ( 1 + eweighted inputs into neuron i ). sj's are activations from an unbiased sample of the posterior distribution and this is problematic due to the Explaining Away problem raised by Judea Perl. Variational Bayesian methods uses a surrogate posterior and blatantly disregard this complexity. Deep Belief Network Introduced by Hinton, this network is a hybrid of RBM and Sigmoid Belief Network. The top 2 layers is an RBM and the second layer downwards form a sigmoid belief network. One trains it by the stacked RBM method and then throw away the recognition weights below the top RBM. As of 2009, 3-4 layers seems to be the optimal depth. Helmholtz machine These are early inspirations for the Variational Auto Encoders. Its 2 networks combined into one—forward weights operates recognition and backward weights implements imagination. It is perhaps the first network to do both. Helmholtz did not work in machine learning but he inspired the view of "statistical inference engine whose function is to infer probable causes of sensory input". the stochastic binary neuron outputs a probability that its state is 0 or 1. The data input is normally not considered a layer, but in the Helmholtz machine generation mode, the data layer receives input from the middle layer and has separate weights for this purpose, so it is considered a layer. Hence this network has 3 layers. Variational autoencoder These are inspired by Helmholtz machines and combines probability network with neural networks. An Autoencoder is a 3-layer CAM network, where the middle layer is supposed to be some internal representation of input patterns. The encoder neural network is a probability distribution qφ(z given x) and the decoder network is pθ(x given z). The weights are named phi & theta rather than W and V as in Helmholtz—a cosmetic difference. These 2 networks h

    Read more →
  • Best AI Clip Makers in 2026

    Best AI Clip Makers in 2026

    Trying to pick the best AI clip maker? An AI clip maker is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI clip maker slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • AI Text-to-image Tools: Free vs Paid (2026)

    AI Text-to-image Tools: Free vs Paid (2026)

    Shopping for the best AI text-to-image tool? An AI text-to-image tool is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI text-to-image tool slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • Agent Ruby

    Agent Ruby

    Agent Ruby (1998–2002) by Lynn Hershman Leeson is an interactive, multiuser work using artificial intelligence. == Description == On Agent Ruby's website, "Agent Ruby's Edream Portal," a female face moves her eyes and lips. Ruby, named from Hershman Leeson's own film, Teknolust, answers questions and often responds that she needs a better algorithm to answer questions not within her database. The work, created with AI, explores relationships between real and virtual worlds. Hershman Leeson had created an earlier version of Ruby, CyberRoberta, which was a custom-made doll with webcam eyes that interacted with the internet. The work in a gallery provides a screen and a sign inviting gallery-goers to "Chat with Ruby." == Artificial intelligence == In 2015 when Agent Ruby was exhibited at the gallery Modern Art Oxford, a review in Aesthetica Magazine described it as an artificial intelligence agent. A review in New Scientist noted that "Ruby is a fast learner, but perhaps not a natural conversationalist." A 2024 list of "25 Essential AI Artworks" published by ARTnews wrote that while "Agent Ruby's capabilities seem limited by today's standards," it was extensive for its day. == Publications and exhibitions == Agent Ruby was commissioned and displayed at the San Francisco Museum of Modern Art, Modern Art Oxford, and the ZKM Center for Art and Media in Karlsruhe, Germany. The San Francisco Museum of Modern Art (SFMOMA) presented Lynn Hershman Leeson: The Agent Ruby Files, March 30 through June 2, 2013 which presented the project server's archive of user conversations over the 12 years of exhibitions.

    Read more →
  • How to Choose an AI Analytics Tool

    How to Choose an AI Analytics Tool

    Looking for the best AI analytics tool? An AI analytics tool is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI analytics tool slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Salvatore J. Stolfo

    Salvatore J. Stolfo

    Salvatore J. Stolfo is an academic and professor of computer science at Columbia University, specializing in computer security. == Early life == Born in Brooklyn, New York, Stolfo received a Bachelor of Science degree in Computer Science and Mathematics from Brooklyn College in 1974. He received his Ph.D. from NYU Courant Institute in 1979 and has been on the faculty of Columbia ever since, where he's taught courses in Artificial Intelligence, Intrusion and Anomaly Detection Systems, Introduction to Programming, Fundamental Algorithms, Data Structures, and Knowledge-Based Expert Systems. == Academic research == While at Columbia, Stolfo has received close to $50M in funding for research that has broadly focused on Security, Intrusion Detection, Anomaly Detection, Machine Learning and includes early work in parallel computing and artificial intelligence. He has published or co-authored over 250 papers and has over 46,000 citations with an H-index of 102. In 1996 he proposed a project with DARPA that applies machine learning to behavioral patterns to detect fraud or intrusion in networks. DADO, developed by in part by Stolfo, introduced the parallel computing primitive: “Broadcast, Resolve, Report”, a hardwire implemented mechanism that today is called MapReduce. Among his earliest work, Stolfo along with colleague Greg Vesonder of Bell Labs, developed a large-scale expert data analysis system, called ACE (Automated Cable Expertise) for the nation's phone system. AT&T Bell Labs distributed ACE to a number of telephone wire centers to improve the management and scheduling of repairs in the local loop. Stolfo coined the term FOG computing (not to be confused with fog computing) where technology is used “to launch disinformation attacks against malicious insiders, preventing them from distinguishing the real sensitive customer data from fake worthless data.” In 2005 Stolfo received funding from the Army Research Office to conduct a workshop to bring together a group of researchers to help identify a research program to focus on insider threats. He was elevated to IEEE Fellow in 2018 "for his contributions to machine learning based cybersecurity." He was elected as an ACM Fellow in 2019 "for contributions to machine-learning-based cybersecurity and parallel hardware for database inference systems". == Career == Founded in 2011, Red Balloon Security (or RBS) is a cyber security company founded by Dr Sal Stolfo and Dr Ang Cui. A spinout from the IDS lab, RBS developed a symbiote technology called FRAK as a host defense for embedded systems under the sponsorship of DARPA's Cyber Fast Track program. Created based on their IDS lab research for the DARPA Active Authentication and the Anomaly Detection at Multiple Scales program, Dr Sal Stolfo and Dr. Angelos Keromytis founded Allure Security Technologies. Using active behavioral authentication and decoy technology Stolfo pioneered and patented in 1996. Founded in 2009, Allure Security Technology was created based on work done under DARPA sponsorship in Columbia's IDS lab based on DARPA prompts to research how to detect hackers once they are inside an organization's perimeter and how to continuously authenticate a user without a password. Stolfo's company Electronic Digital Documents produced a “DataBlade” technology, which Informix marketed during their strategy of acquisition and development in the mid 80's. Stolfo's patented merge/purge technology called EDD DataCleanser DataBlade was licensed by Informix. Since its acquisition by IBM in 2005, IBM Informix is one of the world's most widely used database servers, with users ranging from the world's largest corporations to startups. System Detection was one of the companies founded by Prof. Stolfo to commercialize the Anomaly Detection technology developed in the IDS lab. The company ultimately reorganized and was rebranded as Trusted Computer Solutions. That company was recently acquired by Raytheon. Recently a jury awarded Columbia University $185 million for patent infringement for one of Prof. Stolfo's inventions, the Application Communities technology. https://news.columbia.edu/news/columbia-university-awarded-185-million-patent-infringement-nortonlifelock-inc. The final order from the judge applied nearly treble damages: https://www.reuters.com/legal/litigation/gen-digital-owes-columbia-481-mln-us-patent-fight-judge-says-2023-10-02/

    Read more →