AI Analytics For Retail

AI Analytics For Retail — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • BioBIKE

    BioBIKE

    BioBike(nee. BioLingua ) is a cloud-based, through-the-web programmable (Paas) symbolic biocomputing and bioinformatics platform that aims to make computational biology, and especially intelligent biocomputing (that is, the application of Artificial Intelligence to computational biology) accessible to research scientists who are not expert programmers. == Unique capabilities == BioBIKE is an integrated symbolic biocomputing and bioinformatics platform, built from the start as an entirely (what is now called) cloud-based architecture where all computing is done in remote servers, and all user access is accomplished through web browsers. BioBIKE has a built-in frame system in which all objects, data, and knowledge are represented. This enables code written either in the native Lisp, in the visual programming language, or systems of rules expressed in the SNARK theorem prover to access the whole of biological knowledge in an integrated manner. For its time (released in 2002) it was unique in permitting users to create fully functional biocomputing programs that run on the back-end servers entirely through the web browser UI. (In modern terms it was one of the first PaaS (Platform as a Service) systems, predating even Salesforce in this capability.) Initially this programming was carried out in raw Lisp, but Jeff Elhai's team at VCU, with NSF funding, created an entirely graphical programming environment on top of BioBIKE based upon the Boxer-style programming environments. Being a multi-headed, multi-threaded, multi-user, multi-tenancy cloud-based system, BioBIKE users were able to directly work together through their web browsers, remotely sharing the same listener and memory space. This permitted a unique sort of collaboration, discussed in Shrager (2007). A specialized offshoot of BioBIKE called "BioDeducta" includes SRI's SNARK theorem prover, offering unique "deductive biocomputing" capabilities. == Implementation == BioBIKE is open-source software implemented using the Lisp programming language. Continuing development takes place by the BioBIKE team centered at Virginia Commonwealth University . == History == BioBIKE was originally called "BioLingua", and was developed by Jeff Shrager at The Carnegie Inst. of Washington Dept. of Plant Biology, and JP Massar with funding from NASA's Astrobiology Division. Shrager and Massar wanted to create a web-based, multi-user Lisp Machine, specialized for bioinformatics. Other early contributors to the project included Mike Travers, and Jeff Elhai of VCU. Elhai obtained continuing funding from the National Science Foundation for the project, which was renamed BioBIKE. Elhai and colleagues added BioBIKE's unique visual programming language. Shrager, meanwhile, collaborated with Richard Waldinger at SRI to build SRI's (SNARK) theorem prover into BioBIKE, creating a deductive biocomputing system, called BioDeducta. == Instances == There used to be a number of BioBIKE verticals in different biological domains, including viral pathogens, cyanobacteria and other bacteria, Arabidopsis thaliana, and several others described in the references.

    Read more →
  • Best AI Code Generators in 2026

    Best AI Code Generators in 2026

    Comparing the best AI code generator? An AI code generator is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI code generator slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • Marine Carpuat

    Marine Carpuat

    Marine Carpuat is a computer scientist who works on machine translation and natural language processing. She is known for her research connecting cross-lingual semantics with machine translation. She has been recognized with a NSF Career Award in 2018, a Google Research award in 2016, and Amazon Faculty Awards in 2016 and 2018. == Education == Marine Carpuat obtained her MPhil and PhD from Hong Kong University of Science and Technology in 2008 under the supervision of Dekai Wu. Her PhD thesis was on the topic of machine translation, and demonstrated the first results showing that explicit modeling of lexical semantics could improve the accuracy of a machine translation system. == Career == After completing her education, Carpuat worked at the National Research Council Canada as a researcher. In 2015, she joined University of Maryland as an assistant professor in Computer Science where she is a member of the CLIP lab. Carpuat works in the area of natural language processing with a focus on machine translation and cross-lingual semantics. She has published over 100 peer-reviewed research papers. Her work is published in the proceedings of computer science conferences, including the Annual Meeting of the Association for Computational Linguistics and Empirical Methods in Natural Language Processing. == Selected honors and distinctions == 2016 Google Research Award 2016, 2018 Amazon Research Awards 2018 NSF Career Award

    Read more →
  • AI Paragraph Rewriters: Free vs Paid (2026)

    AI Paragraph Rewriters: Free vs Paid (2026)

    Curious about the best AI paragraph rewriter? An AI paragraph rewriter is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI paragraph rewriter slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Vatican News App

    Vatican News App

    The Vatican News App is an official mobile application software issued by the Vatican's Dicastery for Communication. Formerly titled The Pope App, the app was launched on January 23, 2013, under the auspices of the Pontifical Council for Social Communications, a now-defunct dicastery that was merged into the Secretariat (now Dicastery) for Communication in March 2016. Initially, The Pope App was available only on iOS devices, but became available for Android phones at the end of February 2013. The app is available for download on iOS and Android in five languages: English, French, Italian, Portuguese and Spanish. It was originally promoted as an application with focus on the figure of the Pope which made it possible to follow the Pope's events while they are taking place. Alerts notified the followers by informing and offering access to "official papal-related content in a variety of formats". The app also enabled its users to see areas of the Vatican through webcams allocated throughout St. Peter's Square in Rome that broadcast images. In early 2018, The Pope App was relaunched as the Vatican News App, accompanied by a redesign that eliminated many of the previous version's features, reducing the app to a more conventional news service, with increased emphasis on news from the Vatican and the worldwide Catholic Church and less focus on the day-to-day activities of the Pope.

    Read more →
  • How to Choose an AI Art Generator

    How to Choose an AI Art Generator

    Looking for the best AI art generator? An AI art generator is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI art generator slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Global Language Monitor

    Global Language Monitor

    The Global Language Monitor (GLM) is a company based in Austin, Texas, that analyzes trends in the English language. == History == Founded in Silicon Valley in 2003 by Paul J.J. Payack, the GLM describes its role as "a media analytics company that documents, analyzes and tracks cultural trends in language the world over, with a particular emphasis upon International and Global English". In April 2008, GLM moved its headquarters from San Diego to Austin. In July 2020, GLM announced that the word covid was its Top Word of 2020 for English. The company has been repeatedly criticized by linguists for promoting misinformation about language. Writing on Language Log, the linguist Ben Zimmer accused it of "hoodwink[ing] unsuspecting journalists on a range of pseudoscientific claims".

    Read more →
  • Markov property

    Markov property

    In probability theory and statistics, the Markov property is the memoryless property of a stochastic process, which means that its future evolution is independent of its history. It is named after the Russian mathematician Andrey Markov. The term strong Markov property is similar to the Markov property, except that the meaning of "present" is defined in terms of a random variable known as a stopping time. The term Markov assumption is used to describe a model where the Markov property is assumed to hold, such as a hidden Markov model. A Markov random field extends this property to two or more dimensions or to random variables defined for an interconnected network of items. An example of a model for such a field is the Ising model. A discrete-time stochastic process satisfying the Markov property is known as a Markov chain. == Introduction == A stochastic process has the Markov property if the conditional probability distribution of future states of the process (conditional on both past and present values) depends only upon the present state; that is, given the present, the future does not depend on the past. A process with this property is said to be Markov or Markovian and known as a Markov process. Two famous classes of Markov process are the Markov chain and Brownian motion. Note that there is a subtle, often overlooked and very important point that is often missed in the plain English statement of the definition: the statespace of the process is constant through time. The conditional description involves a fixed "bandwidth". For example, without this restriction we could augment any process to one which includes the complete history from a given initial condition and it would be made to be Markovian. But the state space would be of increasing dimensionality over time and does not meet the definition. == History == == Definition == Let ( Ω , F , P ) {\displaystyle (\Omega ,{\mathcal {F}},P)} be a probability space with a filtration ( F s , s ∈ I ) {\displaystyle ({\mathcal {F}}_{s},\ s\in I)} , for some (totally ordered) index set I {\displaystyle I} ; and let ( S , Σ ) {\displaystyle (S,\Sigma )} be a measurable space. An ( S , Σ ) {\displaystyle (S,\Sigma )} -valued stochastic process X = { X t : Ω → S } t ∈ I {\displaystyle X=\{X_{t}:\Omega \to S\}_{t\in I}} adapted to the filtration is said to possess the Markov property if, for each A ∈ Σ {\displaystyle A\in \Sigma } and each s , t ∈ I {\displaystyle s,t\in I} with s < t {\displaystyle s Read more →

  • List of monochrome and RGB color formats

    List of monochrome and RGB color formats

    This list of monochrome and RGB palettes includes generic repertoires of colors (color palettes) to produce black-and-white and RGB color pictures by a computer's display hardware. RGB is the most common method to produce colors for displays; so these complete RGB color repertoires have every possible combination of R-G-B triplets within any given maximum number of levels per component. Each palette is represented by a series of color patches. When the number of colors is low, a 1-pixel-size version of the palette appears below it, for easily comparing relative palette sizes. Huge palettes are given directly in one-color-per-pixel color patches. For each unique palette, an image color test chart and sample image (truecolor original follows) rendered with that palette (without dithering) are given. The test chart shows the full 256 levels of the red, green, and blue (RGB) primary colors and cyan, magenta, and yellow complementary colors, along with a full 256-level grayscale. Gradients of RGB intermediate colors (orange, lime green, sea green, sky blue, violet, and fuchsia), and a full hue spectrum are also present. Color charts are not gamma corrected. These elements illustrate the color depth and distribution of the colors of any given palette, and the sample image indicates how the color selection of such palettes could represent real-life images. These images are not necessarily representative of how the image would be displayed on the original graphics hardware, as the hardware may have additional limitations regarding the maximum display resolution, pixel aspect ratio and color placement. Implementation of these formats is specific to each machine. Therefore, the number of colors that can be simultaneously displayed in a given text or graphic mode might be different. Also, the actual displayed colors are subject to the output format used - PAL or NTSC, composite or component video, etc. - and might be slightly different. For simulated images and specific hardware and alternate methods to produce colors other than RGB (ex: composite), see the List of 8-bit computer hardware palettes, the List of 16-bit computer hardware palettes and the List of video game console palettes. For various software arrangements and sorts of colors, including other possible full RGB arrangements within 8-bit color depth displays, see the List of software palettes. == Monochrome palettes == These palettes only have some shades of gray, from black to white (considered the darkest and lightest "grays", respectively). The general rule is that those palettes have 2n different shades of gray, where n is the number of bits needed to represent a single pixel. === Monochrome (1-bit grayscale) === Monochrome graphics displays typically have a black background with a white or light gray image, though green and amber monochrome monitors were also common. Such a palette requires only one bit per pixel. Where photo-realism was desired, these early computer systems had a heavy reliance on dithering to make up for the limits of the technology. In some systems, as Hercules and CGA graphic cards for the IBM PC, a bit value of 1 represents white pixels (light on) and a value of 0 the black ones (light off); others, like the Playdate and Atari ST and Apple Macintosh with monochrome monitors, a bit value of 0 means a white pixel (no ink) and a value of 1 means a black pixel (dot of ink), which it approximates to the printing logic. === 2-bit Grayscale === In a 2-bit color palette each pixel's value is represented by 2 bits resulting in a 4-value palette (22 = 4). 2-bit dithering: It has black, white and two intermediate levels of gray as follows: A monochrome 2-bit palette is used on: The Monochrome Display Adapter for the IBM PC NeXT Computer, NeXTcube and NeXTstation monochrome graphic displays. Original Game Boy system portable video game console. Macintosh PowerBook 150 monochrome LC displays. Amiga with A2024 monochrome monitor in high-resolution mode. The original Amazon Kindle The original WonderSwan The Tiger Electronics Game.com portable video game console The original Neo Geo Pocket. === 4-bit Grayscale === In a 4-bit color palette each pixel's value is represented by 4 bits resulting in a 16-value palette (24 = 16): 4-bit grayscale dithering does a fairly good job of reducing visible banding of the level changes: A monochrome 4-bit palette is used on: MOS Technology VDC (on the Commodore 128 with monochrome monitor) Amstrad CPC series with a GT64/GT65 Green Monitor (16 unique green shades) Amstrad CPC Plus series with the MM12 Monochrome monitor (16 shades of grey) Some Apple PowerBooks equipped with monochrome displays like the PowerBook 5300 The original VideoNow === 8-bit Grayscale === In an 8-bit color palette each pixel's value is represented by 8 bits resulting in a 256-value palette (28 = 256). This is usually the maximum number of grays in ordinary monochrome systems; each image pixel occupies a single memory byte. Most scanners can capture images in 8-bit grayscale, and image file formats like TIFF and JPEG natively support this monochrome palette size. Alpha channels employed for video overlay also use (conceptually) this palette. The gray level indicates the opacity of the blended image pixel over the background image pixel. == Dichrome palettes == === 16-bit RG palette === The RG or red–green color space is a color space that uses only two primary colors: red and green. It was used on early color processes for films. It was used as an additive format, similar to the RGB color model but without a blue channel, on processes such as Kinemacolor, Prizma, Technicolor I, Raycol, etc., producing shades of black, red, green and yellow. Alternatively, it was used as a subtractive format on Brewster Color I, Kodachrome I, Prizma II, Technicolor II, etc., producing shades of transparent, red, green and black. Until recently, its primary use was in low-cost light-emitting diode displays in which red and green tended to be far more common than the still nascent blue LED technology, but full-color LEDs with blue have become more common in recent years. ColorCode 3-D, a anaglyph stereoscopic color scheme, uses the RG color space to simulate a broad spectrum of color in one eye, while the blue portion of the spectrum transmits a black-and-white (black-and-blue) image to the other eye to give depth perception. === 16-bit RB palette === === 16-bit GB palette === == Regular RGB palettes == Here are grouped those full RGB hardware palettes that have the same number of binary levels (i.e., the same number of bits) for every red, green and blue components using the full RGB color model. Thus, the total number of colors are always the number of possible levels by component, n, raised to a power of 3: n×n×n = n3. === 3-bit RGB === 3-bit RGB dithering: Systems with a 3-bit RGB palette use 1 bit for each of the red, green and blue color components. That is, each component is either "on" or "off" with no intermediate states. This results in an 8-color palette ((21)3 = 23 = 8) that has black, white, the three RGB primary colors red, green and blue and their correspondent complementary colors cyan, magenta and yellow as follows: The color indices vary between implementations; therefore, index numbers are not given. The 3-bit RGB palette is used by: Text terminals following the ECMA-48 standard (sometimes known as the "ANSI standard", although ANSI X3.128 does not define colors) World System Teletext Level 1/1.5 Videotex Oric computers BBC Micro PC-8801 (up to the MkII) PC-9801 (with original 8086 CPU, before the VM/VX models) Sharp X1 (models before the X1 Turbo Z) Sharp MZ 700 FM-7, FM New 7, FM 77 (before the FM77AV) Sinclair QL Space Invaders Part II (arcade hardware) Macintosh SE (with a color printer or external monitor) Atari 2600 (SECAM version) Color Maximite (PIC32 based microcomputer) Arcadia 2001 PV-1000 Monkey Magic (arcade hardware) VIC-20 (high-res mode) Mouse Trap (arcade hardware) Sanyo MBC-550 series Windows 1.0 (includes dithering) === 6-bit RGB === Systems with a 6-bit RGB palette use 2 bits for each of the red, green, and blue color components. This results in a (22)3 = 43 = 64-color palette as follows: 6-bit RGB systems include the following: Enhanced Graphics Adapter (EGA) for IBM PC/AT (16 colors at once) Sega Master System video game console (32 colors at once) GIME for TRS-80 Color Computer 3 (16 colors at once) Pebble Time smartwatch which has a 6-bit (64 color) e-paper display Parallax Propeller using the reference VGA circuit === 9-bit RGB === Systems with a 9-bit RGB palette use 3 bits for each of the red, green, and blue color components. This results in a (23)3 = 83 = 512-color palette as follows: 9-bit RGB systems include the following: Atari ST (Normally 4 to 16 at once without tricks) MSX2 computers (up to 16 at once) Sega Genesis video game console, (64 colors at once) Sega Nomad TurboGrafx-16 (NEC PC-Engine) ZX Spectrum Next The NEC PC-88

    Read more →
  • AI Analytics Tools Reviews: What Actually Works in 2026

    AI Analytics Tools Reviews: What Actually Works in 2026

    Comparing the best AI analytics tool? An AI analytics tool is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI analytics tool slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • The Best Free AI Website Builder for Beginners

    The Best Free AI Website Builder for Beginners

    In search of the best AI website builder? An AI website builder is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI website builder slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • How to Choose an AI Content Generator

    How to Choose an AI Content Generator

    Curious about the best AI content generator? An AI content generator is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI content generator slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Noisy text analytics

    Noisy text analytics

    Noisy text analytics is a process of information extraction whose goal is to automatically extract structured or semistructured information from noisy unstructured text data. While Text analytics is a growing and mature field that has great value because of the huge amounts of data being produced, processing of noisy text is gaining in importance because a lot of common applications produce noisy text data. Noisy unstructured text data is found in informal settings such as online chat, text messages, e-mails, message boards, newsgroups, blogs, wikis and web pages. Also, text produced by processing spontaneous speech using automatic speech recognition and printed or handwritten text using optical character recognition contains processing noise. Text produced under such circumstances is typically highly noisy containing spelling errors, abbreviations, non-standard words, false starts, repetitions, missing punctuations, missing letter case information, pause filling words such as “um” and “uh” and other texting and speech disfluencies. Such text can be seen in large amounts in contact centers, chat rooms, optical character recognition (OCR) of text documents, short message service (SMS) text, etc. Documents with historical language can also be considered noisy with respect to today's knowledge about the language. Such text contains important historical, religious, ancient medical knowledge that is useful. The nature of the noisy text produced in all these contexts warrants moving beyond traditional text analysis techniques. == Techniques for noisy text analysis == Missing punctuation and the use of non-standard words can often hinder standard natural language processing tools such as part-of-speech tagging and parsing. Techniques to both learn from the noisy data and then to be able to process the noisy data are only now being developed. == Possible source of noisy text == World Wide Web: Poorly written text is found in web pages, online chat, blogs, wikis, discussion forums, newsgroups. Most of these data are unstructured and the style of writing is very different from, say, well-written news articles. Analysis for the web data is important because they are sources for market buzz analysis, market review, trend estimation, etc. Also, because of the large amount of data, it is necessary to find efficient methods of information extraction, classification, automatic summarization and analysis of these data. Contact centers: This is a general term for help desks, information lines and customer service centers operating in domains ranging from computer sales and support to mobile phones to apparels. On an average a person in the developed world interacts at least once a week with a contact center agent. A typical contact center agent handles over a hundred calls per day. They operate in various modes such as voice, online chat and E-mail. The contact center industry produces gigabytes of data in the form of E-mails, chat logs, voice conversation transcriptions, customer feedback, etc. A bulk of the contact center data is voice conversations. Transcription of these using state of the art automatic speech recognition results in text with 30-40% word error rate. Further, even written modes of communication like online chat between customers and agents and even the interactions over email tend to be noisy. Analysis of contact center data is essential for customer relationship management, customer satisfaction analysis, call modeling, customer profiling, agent profiling, etc., and it requires sophisticated techniques to handle poorly written text. Printed Documents: Many libraries, government organizations and national defence organizations have vast repositories of hard copy documents. To retrieve and process the content from such documents, they need to be processed using Optical Character Recognition. In addition to printed text, these documents may also contain handwritten annotations. OCRed text can be highly noisy depending on the font size, quality of the print etc. It can range from 2-3% word error rates to as high as 50-60% word error rates. Handwritten annotations can be particularly hard to decipher, and error rates can be quite high in their presence. Short Messaging Service (SMS): Language usage over computer mediated discourses, like chats, emails and SMS texts, significantly differs from the standard form of the language. An urge towards shorter message length facilitating faster typing and the need for semantic clarity, shape the structure of this non-standard form known as the texting language.

    Read more →
  • Dynamic topic model

    Dynamic topic model

    Within statistics, Dynamic topic models' are generative models that can be used to analyze the evolution of (unobserved) topics of a collection of documents over time. This family of models was proposed by David Blei and John Lafferty and is an extension to Latent Dirichlet Allocation (LDA) that can handle sequential documents. In LDA, both the order the words appear in a document and the order the documents appear in the corpus are oblivious to the model. Whereas words are still assumed to be exchangeable, in a dynamic topic model the order of the documents plays a fundamental role. More precisely, the documents are grouped by time slice (e.g.: years) and it is assumed that the documents of each group come from a set of topics that evolved from the set of the previous slice. == Topics == Similarly to LDA and pLSA, in a dynamic topic model, each document is viewed as a mixture of unobserved topics. Furthermore, each topic defines a multinomial distribution over a set of terms. Thus, for each word of each document, a topic is drawn from the mixture and a term is subsequently drawn from the multinomial distribution corresponding to that topic. The topics, however, evolve over time. For instance, the two most likely terms of a topic at time t could be "network" and "Zipf" (in descending order) while the most likely ones at time t+1 could be "Zipf" and "percolation" (in descending order). == Model == Define α t {\displaystyle \alpha _{t}} as the per-document topic distribution at time t. β t , k {\displaystyle \beta _{t,k}} as the word distribution of topic k at time t. η t , d {\displaystyle \eta _{t,d}} as the topic distribution for document d in time t, z t , d , n {\displaystyle z_{t,d,n}} as the topic for the nth word in document d in time t, and w t , d , n {\displaystyle w_{t,d,n}} as the specific word. In this model, the multinomial distributions α t + 1 {\displaystyle \alpha _{t+1}} and β t + 1 , k {\displaystyle \beta _{t+1,k}} are generated from α t {\displaystyle \alpha _{t}} and β t , k {\displaystyle \beta _{t,k}} , respectively. Even though multinomial distributions are usually written in terms of the mean parameters, representing them in terms of the natural parameters is better in the context of dynamic topic models. The former representation has some disadvantages due to the fact that the parameters are constrained to be non-negative and sum to one. When defining the evolution of these distributions, one would need to assure that such constraints were satisfied. Since both distributions are in the exponential family, one solution to this problem is to represent them in terms of the natural parameters, that can assume any real value and can be individually changed. Using the natural parameterization, the dynamics of the topic model are given by β t , k | β t − 1 , k ∼ N ( β t − 1 , k , σ 2 I ) {\displaystyle \beta _{t,k}|\beta _{t-1,k}\sim N(\beta _{t-1,k},\sigma ^{2}I)} and α t | α t − 1 ∼ N ( α t − 1 , δ 2 I ) {\displaystyle \alpha _{t}|\alpha _{t-1}\sim N(\alpha _{t-1},\delta ^{2}I)} . The generative process at time slice 't' is therefore: Draw topics β t , k | β t − 1 , k ∼ N ( β t − 1 , k , σ 2 I ) ∀ k {\displaystyle \beta _{t,k}|\beta _{t-1,k}\sim N(\beta _{t-1,k},\sigma ^{2}I)\forall k} Draw mixture model α t | α t − 1 ∼ N ( α t − 1 , δ 2 I ) {\displaystyle \alpha _{t}|\alpha _{t-1}\sim N(\alpha _{t-1},\delta ^{2}I)} For each document: Draw η t , d ∼ N ( α t , a 2 I ) {\displaystyle \eta _{t,d}\sim N(\alpha _{t},a^{2}I)} For each word: Draw topic Z t , d , n ∼ Mult ( π ( η t , d ) ) {\displaystyle Z_{t,d,n}\sim {\textrm {Mult}}(\pi (\eta _{t,d}))} Draw word W t , d , n ∼ Mult ( π ( β t , Z t , d , n ) ) {\displaystyle W_{t,d,n}\sim {\textrm {Mult}}(\pi (\beta _{t,Z_{t,d,n}}))} where π ( x ) {\displaystyle \pi (x)} is a mapping from the natural parameterization x to the mean parameterization, namely π ( x i ) = exp ⁡ ( x i ) ∑ i exp ⁡ ( x i ) {\displaystyle \pi (x_{i})={\frac {\exp(x_{i})}{\sum _{i}\exp(x_{i})}}} . == Inference == In the dynamic topic model, only W t , d , n {\displaystyle W_{t,d,n}} is observable. Learning the other parameters constitutes an inference problem. Blei and Lafferty argue that applying Gibbs sampling to do inference in this model is more difficult than in static models, due to the nonconjugacy of the Gaussian and multinomial distributions. They propose the use of variational methods, in particular, the Variational Kalman Filtering and the Variational Wavelet Regression. == Applications == In the original paper, a dynamic topic model is applied to the corpus of Science articles published between 1881 and 1999 aiming to show that this method can be used to analyze the trends of word usage inside topics. The authors also show that the model trained with past documents is able to fit documents of an incoming year better than LDA. A continuous dynamic topic model was developed by Wang et al. and applied to predict the timestamp of documents. Going beyond text documents, dynamic topic models were used to study musical influence, by learning musical topics and how they evolve in recent history.

    Read more →
  • Lingoes

    Lingoes

    Lingoes is a dictionary and machine translation app. Lingoes was created in China. Lingoes is often compared to its competitor Babylon because of similarities in their GUI, functionalities and most importantly being freeware. == Features and expandability == Dictionaries and encyclopedias can be installed on Lingoes in the form of new add-ons to extend its functionality. Add-ons for Wikipedia, Baidu Baike, Longman Dictionary of Contemporary English, Merriam-Webster's Collegiate Dictionary, WordNet, MacMillan English Dictionary, Collins English Dictionary and other cross-English dictionaries (e.g. Arabic, French or German) are available in Lingoes' official website. The program has the ability to pronounce words and install additional text-to-speech engines available for download also through Lingoes' website. Lingoes also offers a whole-text translation ability using online translation service providers like Google Translate, Yahoo! Babel Fish Translation, SYSTRAN, Cross-Language, Click2Translate, and others. Lingoes offers to translate a text via a mouse-over popup, or by double-clicking the selected text. Additional tools, termed as appendices in the program, include a currency converter, weights and measure units converter and international time zones converter. Additional ones, such as the periodic table of elements, a scientific calculator, Traditional Chinese and Simplified Chinese conversion utility or a Base64 encoding utility, can be added through the website.

    Read more →