AI For Business Rules

AI For Business Rules — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Caspio

    Caspio

    Caspio, Inc. is an American software company providing a low-code platform for building cloud-based business applications. Founded in 2000 by Frank Zamani, the company is headquartered in Sunnyvale, California, with operations in Poland, the Philippines, and Spain. Caspio’s platform allows organizations to create online database applications and workflow tools without extensive coding. == History == Caspio was founded by Frank Zamani in 2000. The company initially focused on simplifying custom cloud applications and reducing development time and cost as compared to traditional software development. Caspio released the first version of its platform, Caspio Bridge, in 2001. In 2014, Caspio released a HIPAA-Compliant Edition of its low-code application development platform. Caspio also released an EU General Data Protection Regulation (GDPR) Compliance Edition of its low-code application development platform in 2016. Caspio's second European Software Development Center opened in Kraków, Poland in 2017. In 2019, Forrester Research listed Caspio and three other platforms in its highest of four ranked tiers of twelve low-code platforms for business developers based on rankings of offerings and strategy at that time. Caspio also opened data centers in Montreal, Canada and India in 2020.

    Read more →
  • Yorick Wilks

    Yorick Wilks

    Yorick Alexander Wilks FBCS (27 October 1939 – 14 April 2023) was a British computer scientist. He was an emeritus professor of artificial intelligence at the University of Sheffield, visiting professor of artificial intelligence at Gresham College (a post created especially for him), senior research fellow at the Oxford Internet Institute, senior scientist at the Florida Institute for Human and Machine Cognition, and a member of the Epiphany Philosophers. In February 2023, Wilks joined WiredVibe as Director of AI and a Board Member, with the goal of commercializing his previous research and ideas. He remained in this role until his death, which occurred shortly before WiredVibe was acquired by AKY X, a company that continues to build on his legacy and contributions. == Biography == Wilks was born in Gerrards Cross, Buckinghamshire in England. He was educated at Torquay Boys' Grammar School, followed by Pembroke College, Cambridge, where he read Philosophy, joined the Epiphany Philosophers and obtained his Doctor of Philosophy degree (1968) under Professor R. B. Braithwaite for the thesis 'Argument and Proof'; he was an early pioneer in meaning-based approaches to the understanding of natural language content by computers. His main early contribution in the 1970s was called "Preference Semantics" (Wilks, 1973; Wilks and Fass, 1992), an algorithmic method for assigning the "most coherent" interpretation to a sentence in terms of having the maximum number of internal preferences of its parts (normally verbs or adjectives) satisfied. That early work was hand-coded with semantic entries (of the order of some hundreds) as was normal at the time, but since then has led to the empirical determinations of preferences (chiefly of English verbs) in the 1980s and 1990s. A key component of the notion of preference in semantics was that the interpretation of an utterance is not a well- or ill-formed notion, as was argued in Chomskyan approaches, such as those of Jerry Fodor and Jerrold Katz. It was rather that a semantic interpretation was the best available, even though some preferences might not be satisfied. So, in "The machine answered the question with a low whine" the agent of "answer" does not satisfy that verb's preference for a human answerer—which would cause it to be deemed ill-formed by Fodor and Katz—but is accepted as sub-optimal or metaphorical, and, now, conventional. The function of the algorithm is not to determine well-formedness at all but to make the optimal selection of word-senses to participate in the overall interpretation. Thus, in "The Pole answered..." the system will always select the human sense of the agent and not the inanimate one if it gives a more coherent interpretation overall. Preference Semantics is thus some of the earliest computational work—with programs run at Systems Development Corporation in Santa Monica in 1967 in LISP on an IBM360—in the now established field of word sense disambiguation. This approach was used in the first operational machine translation system based principally on meaning structures and built by Wilks at Stanford Artificial Intelligence Laboratory in the early 1970s (Wilks, 1973) at the same time and place as Roger Schank was applying his "Conceptual Dependency" approach to machine translation. The LISP code of Wilks' system was in The Computer Museum, Boston. Wilks was elected a fellow of the American and European Associations for Artificial Intelligence, of the British Computer Society, a member of the UK Computing Research Committee, and a permanent member of ICCL, the International Committee on Computational Linguistics. He was professor of artificial intelligence at the University of Sheffield and a senior research fellow at the Oxford Internet Institute. In 1991 he received a Defense Advanced Projects Agency grant on interlingual pragmatics-based machine translation and in 1994 he received a grant by the Engineering and Physical Sciences Research Council to investigate in the field of large-scale information extraction (LaSIE); in the following years he would obtain more grants to carry on exploring the field of information extraction (AVENTINUS, ECRAN, PASTA...). In the 1990s Wilks also became interested in modelling human-computer dialogue and the team led by David Levy and him as chief researcher won the Loebner Prize in 1997. He was the founding director of the EU funded Companions Project on creating long-term computer companions for people. At his Festschrift in 2007 at the British Computer Society in London a volume of his own papers was presented along with a volume of essays in his honour. He was awarded the Antonio Zampolli prize in honour of his lifetime work at the LREC 2008 conference on 28 May 2008, and the Lifetime Achievement Award at the ACL 2008 conference on 18 June 2008. In 2009, he was awarded the British Computer Society's Lovelace Medal, its annual award for research achievement, and was awarded the Fellowship of the Association for Computing Machinery. In 1998, Wilks became head of the Department of Computer Science of the University of Sheffield, where he had started working in the year 1993 as professor of artificial intelligence, a post he still held. In 1993 he became the founding director of the Institute of Language, Speech and Hearing (ILASH). Wilks also set up the Natural Language Processing Group of the University of Sheffield. In 1994 he (along with Rob Gaizauskas and Hamish Cunningham) designed GATE, an advanced NLP architecture that has been widely distributed. National Life Stories conducted an oral history interview (C1672/24) with Yorick Wilks in 2016 for its Science and Religion collection held by the British Library. Wilks died on 14 April 2023, at the age of 83. == Awards == Wilks received many awards: (2009) Elected Fellow of the Association for Computing Machinery (2009) Lovelace Medal by the British Computer Society (2008) Zampolli Prize (ELRA, awarded at LREC in Marrakech, Morocco) (2008) Lifetime Achievement Award (Association for Computational Linguistics, in Columbus) (2006) Visiting Professor, University of Oxford (2004) Elected to UK Computing Research Committee (2004) Elected Fellow, British Computer Society (2003) Visiting Fellow, Oxford Internet Institute (1998) Elected Fellow of European Association for Artificial Intelligence (1997) Elected Fellow, EPSRC College of Computing (1991) Visiting Fellow, Trinity Hall, Cambridge (1991) Elected Fellow of the American Association for Artificial Intelligence (1983) Royal Society Travel Fellowship (1983) Commonwealth of Australia Visiting Professor (1981) Visiting Sloan Fellow, University of California, Berkeley (1980) Invited Participant in the Nobel Symposium on Language, Stockholm (1979) NATO Senior Scientist Fellowship (1979) Visiting Sloan Fellow, Yale University (1975) SRC Senior Visiting Fellowship, University of Edinburgh == Membership == Wilks was an active member of the following associations: Association for Computational Linguistics Society for the Study of AI and Simulation of Behaviour Association for Computing Machinery Cognitive Science Society British Society for the Philosophy of Science American Association for Artificial Intelligence Aristotelian Society == Selected works == === Books === Wilks, Y. (2019) Artificial Intelligence: Modern Magic or Dangerous Future?.Icon Books. New illustrated edition, 2023, MIT Press. Wilks, Y. (2015) Machine Translation: its scope and limits. Springer Wilks, Y (ed.) (2010) Close Engagements with Artificial Companions: Key Social, Psychological and Design issues. John Benjamins; Amsterdam Wilks, Y., Brewster, C. (2009) Natural Language Processing as a Foundation of the Semantic Web. Now Press: London. Wilks, Y. (2007) Words and Intelligence I, Selected papers by Yorick Wilks. In K. Ahmad, C. Brewster & M. Stevenson (eds.), Springer: Dordrecht. Wilks, Y. (ed. and with introduction and commentaries). (2006) Language, cohesion and form: selected papers of Margaret Masterman. Cambridge: Cambridge University Press. Wilks, Y., Nirenburg, S., Somers, H. (eds.) (2003) Readings in Machine Translation. Cambridge, MA: MIT Press. Wilks, Y.(ed.). (1999) Machine Conversations. Kluwer: New York. Wilks, Y., Slator, B., Guthrie, L. (1996) Electric Words: dictionaries, computers and meanings. Cambridge, MA: MIT Press. Ballim, A., Wilks, Y. (1991) Artificial Believers. Norwood, NJ: Erlbaum. Wilks, Y.(ed.). (1990) Theoretical Issues in Natural Language Processing. Norwood, NJ: Erlbaum. Wilks, Y., Partridge, D. (eds. plus three YW chapters and an introduction). (1990) The Foundations of Artificial Intelligence: a sourcebook. Cambridge: Cambridge University Press. Wilks, Y., Sparck-Jones, K.(eds.). (1984) Automatic Natural Language Processing, paperback edition. New York: Wiley. Originally published by Ellis Horwood. Wilks, Y., Charniak, E. (eds and principal authors). (1976) Computational Semantics—an Introduction to Artificial Intelligence and

    Read more →
  • Nathalie Japkowicz

    Nathalie Japkowicz

    Nathalie Japkowicz is a Canadian computer scientist specializing in machine learning. She is a professor and department chair of computer science at the American University College of Arts and Sciences. == Life == Nathalie Japkowicz completed a B.Sc. at McGill University in 1988. She earned an M.Sc. from the University of Toronto in 1990. She completed a Ph.D. at Rutgers University in 1999. Her dissertation was titled Concept-learning in the absence of counter-examples: an autoassociation-based approach to classification. Stephen José Hanson and Casimir Alexander Kulikowski were her doctoral advisors. Japkowicz worked at the University of Ottawa in the school of electrical engineering and computer science. She was the lead of its laboratory for research on machine learning for defense security. From 2003 to 2005, Japkowicz was the secretary of the Canadian Artificial Intelligence Association (CAIAC). She was CAIAC vice president from 2009 to 2014 and president from 2013 to 2015, and part-president from 2015 to 2017. Japkowicz is a professor and department chair of computer science at the American University College of Arts and Sciences. She researches artificial intelligence, machine learning, data mining, and big data analysis. == Selected works == Gao, Yong; Japkowicz, Nathalie, eds. (2009). Advances in Artificial Intelligence: 22nd Canadian Conference on Artificial Intelligence, Canadian AI 2009 Kelowna, Canada, May 25–27, 2009 Proceedings. Lecture Notes in Computer Science. Vol. 5549. Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-642-01818-3. ISBN 978-3-642-01817-6. S2CID 27083226. Japkowicz, Nathalie; Shah, Mohak (2011). Evaluating Learning Algorithms: A Classification Perspective (1 ed.). Cambridge University Press. doi:10.1017/cbo9780511921803. ISBN 978-0-511-92180-3. Japkowicz, Nathalie; Matwin, Stan, eds. (2015). Discovery Science: 18th International Conference, DS 2015, Banff, AB, Canada, October 4–6, 2015. Proceedings. Lecture Notes in Computer Science. Vol. 9356. Cham: Springer International Publishing. doi:10.1007/978-3-319-24282-8. ISBN 978-3-319-24281-1. S2CID 1302223. Japkowicz, Nathalie; Stefanowski, Jerzy, eds. (2016). Big Data Analysis: New Algorithms for a New Society. Studies in Big Data. Vol. 16. Cham: Springer International Publishing. doi:10.1007/978-3-319-26989-4. ISBN 978-3-319-26987-0. Ceci, Michelangelo; Japkowicz, Nathalie; Liu, Jiming; Papadopoulos, George A.; Raś, Zbigniew W., eds. (2018). Foundations of Intelligent Systems: 24th International Symposium, ISMIS 2018, Limassol, Cyprus, October 29–31, 2018, Proceedings. Lecture Notes in Computer Science. Vol. 11177. Cham: Springer International Publishing. doi:10.1007/978-3-030-01851-1. ISBN 978-3-030-01850-4. S2CID 53038780.

    Read more →
  • NFA minimization

    NFA minimization

    In automata theory (a branch of theoretical computer science), NFA minimization is the task of transforming a given nondeterministic finite automaton (NFA) into an equivalent NFA that has a minimum number of states, transitions, or both. While efficient algorithms exist for DFA minimization, NFA minimization is PSPACE-complete. No efficient (polynomial time) algorithms are known, and under the standard assumption that P ≠ PSPACE, none exist. The most efficient known algorithm is the Kameda–Weiner algorithm. == Non-uniqueness of minimal NFA == Unlike deterministic finite automata, minimal NFAs may not be unique. There may be multiple NFAs with the same number of states that accept the same regular language, but for which there is no equivalent NFA or DFA with fewer states.

    Read more →
  • KoalaPad

    KoalaPad

    The KoalaPad is a graphics tablet, released in 1983 by US company Koala Technologies Corporation, for the Apple II, TRS-80 Color Computer (as the TRS-80 Touch Pad), Atari 8-bit computers, Commodore 64, and IBM PC compatibles. Originally designed by Dr. David Thornburg as a low-cost computer drawing tool for schools, the Koala Pad and the bundled drawing program, KoalaPainter, was popular with home users as well. KoalaPainter was called KoalaPaint in some versions for the Apple II, and PC Design for the IBM PC. A program called Graphics Exhibitor was included for creating slideshow presentations from KoalaPainter drawings. == Description == The pad was four inches square (i.e. roughly 10×10 cm) and mounted on a slightly inclined base with the back of the pad higher than the front. At the top, "behind" the pad, were two buttons. The pad hooked into the computer using the analog signals of the joystick ports (the so-called paddle inputs), which meant that it had a low resolution and tended to jostle the cursor if moved during use. As an alternative to the drawing stylus, the pad could as easily be operated by the user's fingers for tasks that demanded less precision, such as selecting between menu items (thus using the pad as a kind of "indirect touch screen"). The top-mounted buttons tended to be somewhat frustrating to use, as the user had to "reach around" the stylus to push the buttons in order to start or stop drawing. A similar tablet from Atari, the Atari CX77 Touch Tablet, addressed this with a built-in button on the stylus, which some enterprising users adapted for use with their KoalaPad. == KoalaPainter == The pad shipped with a simple bitmap graphics editor developed by Audio Light called KoalaPainter, PC Design or Micro Illustrator depending on the target machine (see release history). Although bundled with the pad, KoalaPainter could also be operated using an ordinary digital joystick. One unique feature of the program, for its time, was that it held two pictures in the computer's memory, allowing the user to flip from one to the other—a function commonly used in order to study the differences between an original and a modified picture, and to copy and paste between two different pictures. Some third-party bitmap editors could also be used with the KoalaPad, such as Broderbund's Dazzle Draw for the Apple II. === Release history === KoalaPainter for Commodore 64 (1983) and Atari 8-bit computers (1983) PC Design for the IBM PC (1983) Micro Illustrator for the Apple II (1983), Atari 8-bit computers (1983) and Commodore Plus/4 (1984) KoalaPainter II for Commodore 64 (1984) === Reception === Ahoy! called KoalaPainter "a very powerful and effective color drawing package", and concluded that it and the KoalaPad were "excellent in ease of use, a fine choice for a beginner as well as young children". BYTE's reviewer stated in December 1984 that he made far fewer errors when using an Apple Mouse with MousePaint than with a KoalaPad and its software. He found that MousePaint was easier to use and more efficient, predicting that the mouse would receive more software support than the pad. Cassie Stahl in InfoWorld's Essential Guide to Atari Computers praised the tablet and its documentation, rating it "Excellent" among all categories and stating that "Playing with the KoalaPad becomes addictive. It does everything it claims to, and it does it well". She also liked Micro Illustrator, rating it "Excellent" except for "Good" for Performance. While criticizing the limited erase function, Stahl reported an undocumented feature enabling exporting pictures to other software. === File format === The Commodore 64 version of KoalaPainter used a fairly simple file format corresponding directly to the way bitmapped graphics are handled on the computer: A two-byte load address, followed immediately by 8,000 bytes of raw bitmap data, 1,000 bytes of raw "Video Matrix" data, 1,000 bytes of raw "Color RAM" data, and a one-byte Background Color field. == KoalaWare == Koala Technologies offered more software beyond the bundled KoalaPainter and Graphics Exhibitor for use with the pad. Among these applications, marketed under the moniker KoalaWare (like KoalaPainter itself), was educational software for use with customized keypads and overlays, such as spelling tools, music programs, and mathematics instruction software, as well as software for "translating" graphical designs into Logo programs.

    Read more →
  • Global Language Monitor

    Global Language Monitor

    The Global Language Monitor (GLM) is a company based in Austin, Texas, that analyzes trends in the English language. == History == Founded in Silicon Valley in 2003 by Paul J.J. Payack, the GLM describes its role as "a media analytics company that documents, analyzes and tracks cultural trends in language the world over, with a particular emphasis upon International and Global English". In April 2008, GLM moved its headquarters from San Diego to Austin. In July 2020, GLM announced that the word covid was its Top Word of 2020 for English. The company has been repeatedly criticized by linguists for promoting misinformation about language. Writing on Language Log, the linguist Ben Zimmer accused it of "hoodwink[ing] unsuspecting journalists on a range of pseudoscientific claims".

    Read more →
  • How to Choose an AI Humanizer

    How to Choose an AI Humanizer

    In search of the best AI humanizer? An AI humanizer is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI humanizer slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • Cognitive computer

    Cognitive computer

    A cognitive computer is a computer that hardwires artificial intelligence and machine learning algorithms into an integrated circuit that closely reproduces the behavior of the human brain. It generally adopts a neuromorphic engineering approach. Synonyms include neuromorphic chip and cognitive chip. In 2023, IBM's proof-of-concept NorthPole chip (optimized for 2-, 4- and 8-bit precision) achieved remarkable performance in image recognition. In 2013, IBM developed Watson, a cognitive computer that uses neural networks and deep learning techniques. The following year, it developed the 2014 TrueNorth microchip architecture which is designed to be closer in structure to the human brain than the von Neumann architecture used in conventional computers. In 2017, Intel also announced its version of a cognitive chip in "Loihi, which it intended to be available to university and research labs in 2018. Intel (most notably with its Pohoiki Beach and Springs systems), Qualcomm, and others are improving neuromorphic processors steadily. == IBM TrueNorth chip == TrueNorth was a neuromorphic CMOS integrated circuit produced by IBM in 2014. It is a manycore processor network on a chip design, with 4096 cores, each one having 256 programmable simulated neurons for a total of just over a million neurons. In turn, each neuron has 256 programmable "synapses" that convey the signals between them. Hence, the total number of programmable synapses is just over 268 million (228). Its basic transistor count is 5.4 billion. In 2023 Zhejiang University and Alibaba developed Darwin a neuromorphic chip The darwin3 chip was designed around 2023 so it is fairly modern compared to IBM's TrueNorth or Intel's LoihI. === Details === Memory, computation, and communication are handled in each of the 4096 neurosynaptic cores, TrueNorth circumvents the von Neumann-architecture bottleneck and is very energy-efficient, with IBM claiming a power consumption of 70 milliwatts and a power density that is 1/10,000th of conventional microprocessors. The SyNAPSE chip operates at lower temperatures and power because it only draws power necessary for computation. Skyrmions have been proposed as models of the synapse on a chip. The neurons are emulated using a Linear-Leak Integrate-and-Fire (LLIF) model, a simplification of the leaky integrate-and-fire model. According to IBM, it does not have a clock, operates on unary numbers, and computes by counting to a maximum of 19 bits. The cores are event-driven by using both synchronous and asynchronous logic, and are interconnected through an asynchronous packet-switched mesh network on chip (NOC). IBM developed a new network to program and use TrueNorth. It included a simulator, a new programming language, an integrated programming environment, and libraries. This lack of backward compatibility with any previous technology (e.g., C++ compilers) poses serious vendor lock-in risks and other adverse consequences that may prevent it from commercialization in the future. === Research === In 2018, a cluster of TrueNorth network-linked to a master computer was used in stereo vision research that attempted to extract the depth of rapidly moving objects in a scene. == IBM NorthPole chip == In 2023, IBM released its NorthPole chip, which is a proof-of-concept for dramatically improving performance by intertwining compute with memory on-chip, thus eliminating the Von Neumann bottleneck. It blends approaches from IBM's 2014 TrueNorth system with modern hardware designs to achieve speeds about 4,000 times faster than TrueNorth. It can run ResNet-50 or Yolo-v4 image recognition tasks about 22 times faster, with 25 times less energy and 5 times less space, when compared to GPUs which use the same 12-nm node process that it was fabricated with. It includes 224 MB of RAM and 256 processor cores and can perform 2,048 operations per core per cycle at 8-bit precision, and 8,192 operations at 2-bit precision. It runs at between 25 and 425 MHz. This is an inferencing chip, but it cannot yet handle GPT-4 because of memory and accuracy limitations == Intel Loihi chip == === Pohoiki Springs === Pohoiki Springs is a system that incorporates Intel's self-learning neuromorphic chip, named Loihi, introduced in 2017, perhaps named after the Hawaiian seamount Lōʻihi. Intel claims Loihi is about 1000 times more energy efficient than general-purpose computing systems used to train neural networks. In theory, Loihi supports both machine learning training and inference on the same silicon independently of a cloud connection, and more efficiently than convolutional neural networks or deep learning neural networks. Intel points to a system for monitoring a person's heartbeat, taking readings after events such as exercise or eating, and using the chip to normalize the data and work out the ‘normal’ heartbeat. It can then spot abnormalities and deal with new events or conditions. The first iteration of the chip was made using Intel's 14 nm fabrication process and houses 128 clusters of 1,024 artificial neurons each for a total of 131,072 simulated neurons. This offers around 130 million synapses, far less than the human brain's 800 trillion synapses, and behind IBM's TrueNorth. Loihi is available for research purposes among more than 40 academic research groups as a USB form factor. In October 2019, researchers from Rutgers University published a research paper to demonstrate the energy efficiency of Intel's Loihi in solving simultaneous localization and mapping. In March 2020, Intel and Cornell University published a research paper to demonstrate the ability of Intel's Loihi to recognize different hazardous materials, which could eventually aid to "diagnose diseases, detect weapons and explosives, find narcotics, and spot signs of smoke and carbon monoxide". === Pohoiki Beach === Intel's Loihi 2, named Pohoiki Beach, was released in September 2021 with 64 cores. It boasts faster speeds, higher-bandwidth inter-chip communications for enhanced scalability, increased capacity per chip, a more compact size due to process scaling, and improved programmability. === Hala Point === Hala Point packages 1,152 Loihi 2 processors produced on Intel 3 process node in a six-rack-unit chassis. The system supports up to 1.15 billion neurons and 128 billion synapses distributed over 140,544 neuromorphic processing cores, consuming 2,600 watts of power. It includes over 2,300 embedded x86 processors for ancillary computations. Intel claimed in 2024 that Hala Point was the world’s largest neuromorphic system. It uses Loihi 2 chips. It is claimed to offer 10x more neuron capacity and up to 12x higher performance. The Darwin3 chip exceeds these specs. Hala Point provides up to 20 quadrillion operations per second, (20 petaops), with efficiency exceeding 15 trillion (8-bit) operations s−1 W−1 on conventional deep neural networks. Hala Point integrates processing, memory and communication channels in a massively parallelized fabric, providing 16 PB s−1 of memory bandwidth, 3.5 PB s−1 of inter-core communication bandwidth, and 5 TB s−1 of inter-chip bandwidth. The system can process its 1.15 billion neurons 20 times faster than a human brain. Its neuron capacity is roughly equivalent to that of an owl brain or the cortex of a capuchin monkey. Loihi-based systems can perform inference and optimization using 100 times less energy at speeds as much as 50 times faster than CPU/GPU architectures. Intel claims that Hala Point can create LLMs. Much further research is needed == SpiNNaker == SpiNNaker (Spiking Neural Network Architecture) is a massively parallel, manycore supercomputer architecture designed by the Advanced Processor Technologies Research Group at the Department of Computer Science, University of Manchester. == Criticism == Critics argue that a room-sized computer – as in the case of IBM's Watson – is not a viable alternative to a three-pound human brain. Some also cite the difficulty for a single system to bring so many elements together, such as the disparate sources of information as well as computing resources. In 2021, The New York Times released Steve Lohr's article "What Ever Happened to IBM’s Watson?". He wrote about some costly failures of IBM Watson. One of them, a cancer-related project called the Oncology Expert Advisor, was abandoned in 2016 as a costly failure. During the collaboration, Watson could not use patient data. Watson struggled to decipher doctors’ notes and patient histories. The development of LLMs has placed a new emphasis on cognitive computers, because the Transformer technology that underpins LLMs demands huge energy for GPUs and PCs. Cognitive computers use significantly less energy, but the details of STDPs and neuron models cannot yet match the accuracy of backprop, and so ANN to SNN weight translations such as QAT and PQT or progressive quantization are becoming popular, with their own limitations.

    Read more →
  • Attention (machine learning)

    Attention (machine learning)

    In machine learning, attention is a method that determines the importance of each component in a sequence relative to the other components in that sequence. In natural language processing, importance is represented by "soft" weights assigned to each word in a sentence. More generally, attention encodes vectors called token embeddings across a fixed-width sequence that can range from tens to millions of tokens in size. Unlike "hard" weights, which are computed during the backwards training pass, "soft" weights exist only in the forward pass and therefore change with every step of the input. Earlier designs implemented the attention mechanism in a serial recurrent neural network (RNN) language translation system, but a more recent design, namely the transformer, removed the slower sequential RNN and relied more heavily on the faster parallel attention scheme. Inspired by ideas about attention in humans, the attention mechanism was developed to address the weaknesses of using information from the hidden layers of recurrent neural networks. Recurrent neural networks favor information contained in words at the end of a sentence and thus deemed more recent, thereby tending to attenuate the significance and associated predictive weight assigned to information earlier in the sentence. Attention allows a token equal access to any part of a sentence directly, rather than only through the previous state. == History == Additional surveys of the attention mechanism in deep learning are provided by Niu et al. and Soydaner. The major breakthrough came with self-attention, where each element in the input sequence attends to all others, enabling the model to capture global dependencies. This idea was central to the Transformer architecture, which replaced recurrence with attention mechanisms. As a result, Transformers became the foundation for models like BERT, T5 and generative pre-trained transformers (GPT). == Overview == The modern era of machine attention was revitalized by grafting an attention mechanism (Fig 1. orange) to an Encoder-Decoder. Figure 2 shows the internal step-by-step operation of the attention block (A) in Fig 1. === Interpreting attention weights === In translating between languages, alignment is the process of matching words from the source sentence to words of the translated sentence. Networks that perform verbatim translation without regard to word order would show the highest scores along the (dominant) diagonal of the matrix. The off-diagonal dominance shows that the attention mechanism is more nuanced. Consider an example of translating I love you to French. On the first pass through the decoder, 94% of the attention weight is on the first English word I, so the network offers the word je. On the second pass of the decoder, 88% of the attention weight is on the third English word you, so it offers t'. On the last pass, 95% of the attention weight is on the second English word love, so it offers aime. In the I love you example, the second word love is aligned with the third word aime. Stacking soft row vectors together for je, t', and aime yields an alignment matrix: Sometimes, alignment can be multiple-to-multiple. For example, the English phrase look it up corresponds to cherchez-le. Thus, "soft" attention weights work better than "hard" attention weights (setting one attention weight to 1, and the others to 0), as we would like the model to make a context vector consisting of a weighted sum of the hidden vectors, rather than "the best one", as there may not be a best hidden vector. == Variants == Many variants of attention implement soft weights, such as fast weight programmers, or fast weight controllers (1992). A "slow" neural network outputs the "fast" weights of another neural network through outer products. The slow network learns by gradient descent. It was later renamed as "linearized self-attention". Bahdanau-style attention, also referred to as additive attention, Luong-style attention, which is known as multiplicative attention, Early attention mechanisms similar to modern self-attention were proposed using recurrent neural networks. However, the highly parallelizable self-attention was introduced in 2017 and successfully used in the Transformer model, positional attention and factorized positional attention. For convolutional neural networks, attention mechanisms can be distinguished by the dimension on which they operate, namely: spatial attention, channel attention, or combinations. These variants recombine the encoder-side inputs to redistribute those effects to each target output. Often, a correlation-style matrix of dot products provides the re-weighting coefficients. In the figures below, W is the matrix of context attention weights, similar to the formula in Overview section above. == Optimizations == === Flash attention === The size of the attention matrix is proportional to the square of the number of input tokens. Therefore, when the input is long, calculating the attention matrix requires a lot of GPU memory. Flash attention is an implementation that reduces the memory needs and increases efficiency without sacrificing accuracy. It achieves this by partitioning the attention computation into smaller blocks that fit into the GPU's faster on-chip memory, reducing the need to store large intermediate matrices and thus lowering memory usage while increasing computational efficiency. === FlexAttention === FlexAttention is an attention kernel developed by Meta that allows users to modify attention scores prior to softmax and dynamically chooses the optimal attention algorithm. == Applications == Attention is widely used in natural language processing, computer vision, and speech recognition. In NLP, it improves context understanding in tasks like question answering and summarization. In vision, visual attention helps models focus on relevant image regions, enhancing object detection and image captioning. === Attention maps as explanations for vision transformers === From the original paper on vision transformers (ViT), visualizing attention scores as a heat map (called saliency maps or attention maps) has become an important and routine way to inspect the decision making process of ViT models. One can compute the attention maps with respect to any attention head at any layer, while the deeper layers tend to show more semantically meaningful visualization. Attention rollout is a recursive algorithm to combine attention scores across all layers, by computing the dot product of successive attention maps. Because vision transformers are typically trained in a self-supervised manner, attention maps are generally not class-sensitive. When a classification head is attached to the ViT backbone, class-discriminative attention maps (CDAM) combines attention maps and gradients with respect to the class [CLS] token. Some class-sensitive interpretability methods originally developed for convolutional neural networks can be also applied to ViT, such as GradCAM, which back-propagates the gradients to the outputs of the final attention layer. Using attention as basis of explanation for the transformers in language and vision is not without debate. While some pioneering papers analyzed and framed attention scores as explanations, higher attention scores do not always correlate with greater impact on model performances. == Mathematical representation == === Standard scaled dot-product attention === For matrices: Q ∈ R m × d k , K ∈ R n × d k {\displaystyle Q\in \mathbb {R} ^{m\times d_{k}},K\in \mathbb {R} ^{n\times d_{k}}} and V ∈ R n × d v {\displaystyle V\in \mathbb {R} ^{n\times d_{v}}} , the scaled dot-product, or QKV attention, is defined as: Attention ( Q , K , V ) = softmax ( Q K T d k ) V ∈ R m × d v {\displaystyle {\text{Attention}}(Q,K,V)={\text{softmax}}\left({\frac {QK^{T}}{\sqrt {d_{k}}}}\right)V\in \mathbb {R} ^{m\times d_{v}}} where T {\displaystyle {}^{T}} denotes transpose and the softmax function is applied independently to every row of its argument. The matrix Q {\displaystyle Q} contains m {\displaystyle m} queries, while matrices K , V {\displaystyle K,V} jointly contain an unordered set of n {\displaystyle n} key-value pairs. Value vectors in matrix V {\displaystyle V} are weighted using the weights resulting from the softmax operation, so that the rows of the m {\displaystyle m} -by- d v {\displaystyle d_{v}} output matrix are confined to the convex hull of the points in R d v {\displaystyle \mathbb {R} ^{d_{v}}} given by the rows of V {\displaystyle V} . To understand the permutation invariance and permutation equivariance properties of QKV attention, let A ∈ R m × m {\displaystyle A\in \mathbb {R} ^{m\times m}} and B ∈ R n × n {\displaystyle B\in \mathbb {R} ^{n\times n}} be permutation matrices; and D ∈ R m × n {\displaystyle D\in \mathbb {R} ^{m\times n}} an arbitrary matrix. The softmax function is permutation equivariant in the sense that: softmax ( A D B ) = A softmax ( D ) B {\displays

    Read more →
  • Struc2vec

    Struc2vec

    struc2vec is a framework to generate node vector representations on a graph that preserve the structural identity. In contrast to node2vec, that optimizes node embeddings so that nearby nodes in the graph have similar embedding, struc2vec captures the roles of nodes in a graph, even if structurally similar nodes are far apart in the graph. It learns low-dimensional representations for nodes in a graph, generating random walks through a constructed multi-layer graph starting at each graph node. It is useful for machine learning applications where the downstream application is more related with the structural equivalence of the nodes (e.g., it can be used to detect nodes in networks with similar functions, such as interns in the social network of a corporation). struc2vec identifies nodes that play a similar role based solely on the structure of the graph, for example computing the structural identity of individuals in social networks. In particular, struc2vec employs a degree-based method to measure the pairwise structural role similarity, which is then adopted to build the multi-layer graph. Moreover, the distance between the latent representation of nodes is strongly correlated to their structural similarity. The framework contains three optimizations: reducing the length of degree sequences considered, reducing the number of pairwise similarity calculations, and reducing the number of layers in the generated graph. struc2vec follows the intuition that random walks through a graph can be treated as sentences in a corpus. Each node in a graph is treated as an individual word, and short random walk is treated as a sentence. In its final phase, the algorithm employs Gensim's word2vec algorithm to learn embeddings based on biased random walks. Sequences of nodes are fed into a skip-gram or continuous bag of words model and traditional machine-learning techniques for classification can be used. It is considered a useful framework to learn node embeddings based on structural equivalence.

    Read more →
  • AI Text-to-video Tools Reviews: What Actually Works in 2026

    AI Text-to-video Tools Reviews: What Actually Works in 2026

    Looking for the best AI text-to-video tool? An AI text-to-video tool is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI text-to-video tool slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • The Best Free AI Art Generator for Beginners

    The Best Free AI Art Generator for Beginners

    Trying to pick the best AI art generator? An AI art generator is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI art generator slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • Resel

    Resel

    In image analysis, a resel (from resolution element) represents the actual spatial resolution in an image or a volumetric dataset. The number of resels in the image may be lower or equal to the number of pixel/voxels in the image. In an actual image the resels can vary across the image and indeed the local resolution can be expressed as "resels per pixel" (or "resels per voxel"). In functional neuroimaging analysis, an estimate of the number of resels together with random field theory is used in statistical inference. Keith Worsley has proposed an estimate for the number of resels/roughness. The word "resel" is related to the words "pixel", "texel", and "voxel". Waldo R. Tobler is probably among the first to use the word.

    Read more →
  • Kristian Kersting

    Kristian Kersting

    Kristian Kersting (born November 28, 1973, in Cuxhaven, Germany) is a German computer scientist. He is Professor of Artificial intelligence and Machine Learning at the Department of Computer Science at the Technische Universität Darmstadt, Head of the Artificial Intelligence and Machine Learning Lab (AIML) and Co-Director of hessian.AI, the Hessian Center for Artificial Intelligence. He is known for his research on statistical relational artificial intelligence, probabilistic programming, and deep probabilistic learning. == Life == Kersting studied computer science at the University of Freiburg, where he received his Ph.D. in 2006. At the university he attended a course on artificial intelligence given by Bernhard Nebel and became interested in the topic. He was a visiting postdoctoral researcher at the KU Leuven and a postdoctoral associate at the Massachusetts Institute of Technology (MIT). His advisor at MIT was Leslie Pack Kaelbling. From 2008 to 2012, he led a research group at the Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS). He then became a Juniorprofessor at the University of Bonn and associate Professor at the computer science department of the Technical University of Dortmund. From 2017 to 2019, he was professor of machine Learning and since 2019 professor of artificial intelligence and machine learning at the department of computer science of the Technische Universität Darmstadt. He is also a researcher at ATHENE, the largest research institute for IT security in Europe and leads a research department at the German Research Centre for Artificial Intelligence (DFKI). Kristian Kersting is the co-spokesperson of Cluster of Excellence "Reasonable Artificial Intelligence", RAI (2026-32). == Awards == In 2006, he received the AI Dissertation Award of the European Association for Artificial Intelligence. In 2008, he received the Fraunhofer Attract research grant with a budget of 2.5 million euros over five years. He was appointed Fellow of the European Association for Artificial Intelligence (EurAI) and Fellow of the European Laboratory for Learning and Intelligent Systems (ELLIS) in 2019. In 2019 he received the "Deutscher KI-Preis" ("German AI Award"), endowed with 100,000 euros, for his outstanding scientific achievements in the field of artificial intelligence. He was elected an AAAI Fellow in 2024. == Publications == De Raedt L., Kersting K. (2008) Probabilistic Inductive Logic Programming. In: De Raedt L., Frasconi P., Kersting K., Muggleton S. (eds) Probabilistic Inductive Logic Programming. Lecture Notes in Computer Science, vol 4911. Springer, Berlin, Heidelberg. ISBN 978-3-540-78651-1 Luc De Raedt, Kristian Kersting, Sriraam Natarajan and David Poole, "Statistical Relational Artificial Intelligence: Logic, Probability, and Computation", Synthesis Lectures on Artificial Intelligence and Machine Learning" Morgan & Claypool, March 2016 ISBN 9781627058414.

    Read more →
  • Dynamic topic model

    Dynamic topic model

    Within statistics, Dynamic topic models' are generative models that can be used to analyze the evolution of (unobserved) topics of a collection of documents over time. This family of models was proposed by David Blei and John Lafferty and is an extension to Latent Dirichlet Allocation (LDA) that can handle sequential documents. In LDA, both the order the words appear in a document and the order the documents appear in the corpus are oblivious to the model. Whereas words are still assumed to be exchangeable, in a dynamic topic model the order of the documents plays a fundamental role. More precisely, the documents are grouped by time slice (e.g.: years) and it is assumed that the documents of each group come from a set of topics that evolved from the set of the previous slice. == Topics == Similarly to LDA and pLSA, in a dynamic topic model, each document is viewed as a mixture of unobserved topics. Furthermore, each topic defines a multinomial distribution over a set of terms. Thus, for each word of each document, a topic is drawn from the mixture and a term is subsequently drawn from the multinomial distribution corresponding to that topic. The topics, however, evolve over time. For instance, the two most likely terms of a topic at time t could be "network" and "Zipf" (in descending order) while the most likely ones at time t+1 could be "Zipf" and "percolation" (in descending order). == Model == Define α t {\displaystyle \alpha _{t}} as the per-document topic distribution at time t. β t , k {\displaystyle \beta _{t,k}} as the word distribution of topic k at time t. η t , d {\displaystyle \eta _{t,d}} as the topic distribution for document d in time t, z t , d , n {\displaystyle z_{t,d,n}} as the topic for the nth word in document d in time t, and w t , d , n {\displaystyle w_{t,d,n}} as the specific word. In this model, the multinomial distributions α t + 1 {\displaystyle \alpha _{t+1}} and β t + 1 , k {\displaystyle \beta _{t+1,k}} are generated from α t {\displaystyle \alpha _{t}} and β t , k {\displaystyle \beta _{t,k}} , respectively. Even though multinomial distributions are usually written in terms of the mean parameters, representing them in terms of the natural parameters is better in the context of dynamic topic models. The former representation has some disadvantages due to the fact that the parameters are constrained to be non-negative and sum to one. When defining the evolution of these distributions, one would need to assure that such constraints were satisfied. Since both distributions are in the exponential family, one solution to this problem is to represent them in terms of the natural parameters, that can assume any real value and can be individually changed. Using the natural parameterization, the dynamics of the topic model are given by β t , k | β t − 1 , k ∼ N ( β t − 1 , k , σ 2 I ) {\displaystyle \beta _{t,k}|\beta _{t-1,k}\sim N(\beta _{t-1,k},\sigma ^{2}I)} and α t | α t − 1 ∼ N ( α t − 1 , δ 2 I ) {\displaystyle \alpha _{t}|\alpha _{t-1}\sim N(\alpha _{t-1},\delta ^{2}I)} . The generative process at time slice 't' is therefore: Draw topics β t , k | β t − 1 , k ∼ N ( β t − 1 , k , σ 2 I ) ∀ k {\displaystyle \beta _{t,k}|\beta _{t-1,k}\sim N(\beta _{t-1,k},\sigma ^{2}I)\forall k} Draw mixture model α t | α t − 1 ∼ N ( α t − 1 , δ 2 I ) {\displaystyle \alpha _{t}|\alpha _{t-1}\sim N(\alpha _{t-1},\delta ^{2}I)} For each document: Draw η t , d ∼ N ( α t , a 2 I ) {\displaystyle \eta _{t,d}\sim N(\alpha _{t},a^{2}I)} For each word: Draw topic Z t , d , n ∼ Mult ( π ( η t , d ) ) {\displaystyle Z_{t,d,n}\sim {\textrm {Mult}}(\pi (\eta _{t,d}))} Draw word W t , d , n ∼ Mult ( π ( β t , Z t , d , n ) ) {\displaystyle W_{t,d,n}\sim {\textrm {Mult}}(\pi (\beta _{t,Z_{t,d,n}}))} where π ( x ) {\displaystyle \pi (x)} is a mapping from the natural parameterization x to the mean parameterization, namely π ( x i ) = exp ⁡ ( x i ) ∑ i exp ⁡ ( x i ) {\displaystyle \pi (x_{i})={\frac {\exp(x_{i})}{\sum _{i}\exp(x_{i})}}} . == Inference == In the dynamic topic model, only W t , d , n {\displaystyle W_{t,d,n}} is observable. Learning the other parameters constitutes an inference problem. Blei and Lafferty argue that applying Gibbs sampling to do inference in this model is more difficult than in static models, due to the nonconjugacy of the Gaussian and multinomial distributions. They propose the use of variational methods, in particular, the Variational Kalman Filtering and the Variational Wavelet Regression. == Applications == In the original paper, a dynamic topic model is applied to the corpus of Science articles published between 1881 and 1999 aiming to show that this method can be used to analyze the trends of word usage inside topics. The authors also show that the model trained with past documents is able to fit documents of an incoming year better than LDA. A continuous dynamic topic model was developed by Wang et al. and applied to predict the timestamp of documents. Going beyond text documents, dynamic topic models were used to study musical influence, by learning musical topics and how they evolve in recent history.

    Read more →