AI Art Creator Free

AI Art Creator Free — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Scrolling

    Scrolling

    In computer displays, filmmaking, television production, video games and other kinetic displays, scrolling is sliding text, images or video across a monitor or display, vertically or horizontally. "Scrolling," as such, does not change the layout of the text or pictures but moves (pans or tilts) the user's view across what is apparently a larger image that is not wholly seen. A common television and movie special effect is to scroll credits, while leaving the background stationary. Scrolling may take place completely without user intervention (as in film credits) or, on an interactive device, be triggered by touchscreen or a keypress and continue without further intervention until a further user action, or be entirely controlled by input devices. Scrolling may take place in discrete increments (perhaps one or a few lines of text at a time), or continuously (smooth scrolling). Frame rate is the speed at which an entire image is redisplayed. It is related to scrolling in that changes to text and image position can only happen as often as the image can be redisplayed. When frame rate is a limiting factor, one smooth scrolling technique is to blur images during movement that would otherwise appear to "jump". == Computing == === Implementation === Scrolling is often carried out on a computer by the CPU (software scrolling) or by a graphics processor. Some systems feature hardware scrolling, where an image may be offset as it is displayed, without any frame buffer manipulation (see also hardware windowing). This was especially common in 8 and 16bit video game consoles. === UI paradigms === In a WIMP-style graphical user interface (GUI), user-controlled scrolling is carried out by manipulating a scrollbar with a mouse, or using keyboard shortcuts, often the arrow keys. Scrolling is often supported by text user interfaces and command line interfaces. Older computer terminals changed the entire contents of the display one screenful ("page") at a time; this paging mode requires fewer resources than scrolling. Scrolling displays often also support page mode. Typically certain keys or key combinations page up or down; on PC-compatible keyboards the page up and page down keys or the space bar are used; earlier computers often used control key combinations. Some computer mice have a scroll wheel, which scrolls the display, often vertically, when rolled; others have scroll balls or tilt wheels which allow both vertical and horizontal scrolling. Some software supports other ways of scrolling. Adobe Reader has a mode identified by a small hand icon ("hand tool") on the document, which can then be dragged by clicking on it and moving the mouse as if sliding a large sheet of paper. When this feature is implemented on a touchscreen it is called kinetic scrolling. Touch-screens often use inertial scrolling, in which the scrolling motion of an object continues in a decaying fashion after release of the touch, simulating the appearance of an object with inertia. An early implementation of such behavior was in the "Star7" PDA of Sun Microsystems ca. 1991–1992. Scrolling can be controlled in other software-dependent ways by a PC mouse. Some scroll wheels can be pressed down, functioning like a button. Depending on the software, this allows both horizontal and vertical scrolling by dragging in the direction desired; when the mouse is moved to the original position, scrolling stops. A few scroll wheels can also be tilted, scrolling horizontally in one direction until released. On touchscreen devices, scrolling is a multi-touch gesture, done by swiping a finger on the screen vertically in the direction opposite to where the user wants to scroll to. If any content is too wide to fit on a display, horizontal scrolling is required to view all of it. In applications such as graphics and spreadsheets there is often more content than can fit either the width or the height of the screen at a comfortable scale, and scrolling in both directions is necessary. === Infinite scrolling === In contrast to material divided into discrete pages, the web design approach of infinite scrolling dynamically adds new material to the user display, leading to a continuous, apparently bottomless or endless scrolling experience. === Text === In languages written horizontally, such as most Western languages, text documents longer than will fit on the screen are often displayed wrapped and sized to fit the screen width, and scrolled vertically to bring desired content into view. It is possible to display lines too long to fit the display without wrapping, scrolling horizontally to view each entire line. However, this requires inconvenient constant line-by-line scrolling, while vertical scrolling is only needed after reading a full screenful. Software such as word processors and web browsers normally uses word-wrapping to display as many words in a single line as will fit the width of the screen or window or, for text organised in columns, each column. === Demos === Scrolling texts, also referred to as scrolltexts or scrollers, played an important part in the birth of the computer demo culture. The software crackers often used their deep knowledge of computer platforms to transform the information that accompanied their releases into crack intros. The sole role of these intros was to scroll the text on the screen in an impressive way. == Film and television == Scrolling is commonly used to display the credits at the end of films and television programs. Scrolling is often used in the form of a news ticker towards the bottom of the picture for content such as television news, scrolling sideways across the screen, delivering short-form content. In the dynamic layout of kinetic typography, scrolling typography can scroll across the flat screen, or can appear to recede or advance. An iconic example is the Star Wars opening crawl inspired by the Flash Gordon serials. == Video games == In computer and video games, scrolling of a playing field allows the player to control an object in a large contiguous area. Early examples of this method include Taito's 1974 vertical-scrolling racing video game Speed Race, Sega's 1976 forward-scrolling racing games Moto-Cross (Fonz) and Road Race, and Super Bug. Previously the flip-screen method was used to indicate moving backgrounds. The Namco Galaxian arcade system board introduced with Galaxian in 1979 pioneered a sprite system that animated pre-loaded sprites over a scrolling background, which became the basis for Nintendo's Radar Scope and Donkey Kong arcade hardware and home consoles such as the Nintendo Entertainment System. Parallax scrolling, which was first featured in Moon Patrol, involves several semi-transparent layers (called playfields), which scroll on top of each other at varying rates in order to give an early pseudo-3D illusion of depth. Belt scrolling is a method used in side-scrolling beat 'em up games with a downward camera angle where players can move up and down in addition to left and right. == Studies == A 1993 article by George Fitzmaurice studied spatially aware palmtop computers. These devices had a 3D sensor, and moving the device caused the contents to move as if the contents were fixed in place. This interaction could be referred to as “moving to scroll.” Also, if the user moved the device away from their body, they would zoom in; conversely, the device would zoom out if the user pulled the device closer to them. Smartphone cameras and “optical flow” image analysis utilize this technique nowadays. A 1996 research paper by Jun Rekimoto analyzed tilting operations as scrolling techniques on small screen interfaces. Users could not only tilt to scroll, but also tilt to select menu items. These techniques proved especially useful for field workers, since they only needed to hold and control the device with one hand. A study from 2013 by Selina Sharmin, Oleg Špakov, and Kari-Jouko Räihä explored the action of reading text on a screen while the text auto-scrolls based on the user's eye tracking patterns. The control group simply read text on a screen and manually scrolled. The study found that participants preferred to read primarily at the top of the screen, so the screen scrolled down whenever participants’ eyes began to look toward the bottom of the screen. This auto-scrolling caused no statistically significant difference in reading speed or performance. An undated study occurring during or after 2010 by Dede Frederick, James Mohler, Mihaela Vorvoreanu, and Ronald Glotzbach noted that parallax scrolling "may cause certain people to experience nausea."

    Read more →
  • Daisy Intelligence

    Daisy Intelligence

    Daisy Intelligence is a Canadian artificial intelligence (AI) company that provides data analysis services to help retailers, mainly grocers and supermarkets, to determine optimal pricing and promotional mix. The company also helps insurance companies detect fraudulent claims. The company uses a subset of AI known as reinforcement learning. In October 2019, the company moved from the suburban Vaughan, Ontario, to downtown Toronto, joining other AI and technology startups concentrated in the King Street East area. In 2019, the company was ranked No. 39 on The Globe and Mail's annual list of Canada's "top growing companies by three-year revenue growth."

    Read more →
  • Linguistic value

    Linguistic value

    In artificial intelligence, fuzzy logic operations research, and related fields, a linguistic value is a natural language term which is derived using quantitative or qualitative reasoning such as with probability and statistics or fuzzy sets and systems. Variables that take linguistic values are called linguistic variables. == Examples of linguistic variables and values == For example, "age" may be a linguistic variable if its values are not numerical, e.g. very young, quite young, not young, old, not very old etc. These values could be derived from the numeric values for age. As another example, if a shuttle heat shield is deemed of having a linguistic value of a "very low" percentage of damage in re-entry, based upon knowledge from experts in the field, that probability would be given a value of say, 5%. From there on out, if it were to be used in an equation, the variable of percentage of damage will be at 5% if it deemed very low percentage.

    Read more →
  • Lukas Biewald

    Lukas Biewald

    Lukas Biewald (born 1981) is an American entrepreneur and a prominent figure in artificial intelligence. He is recognized for his contributions to machine learning and as the CEO and co-founder of Weights & Biases, a company that builds developer tools for AI, that sold to CoreWeave in 2025 for $1.7B. He previously founded and was CEO of Figure Eight, a human-in-the-loop machine learning platform. He has co-authored 26 AI research papers from 2004 through 2018. == Early life and education == Biewald was born in Boston, Massachusetts in 1981. He attended Cambridge Rindge and Latin School and later earned both a Bachelor's and Master's degree in Computer science from Stanford University. == Early Career and Founding Figure Eight == After graduation, Biewald joined Yahoo! as an engineer, working on machine translations to improve search results, and eventually led the Search Relevance Team for Yahoo! Japan. He later joined Powerset, a natural language search technology company, as their Senior Scientist, which was acquired by Microsoft in 2008 for an estimated $100M. In 2007, Biewald co-founded Figure Eight (formerly CrowdFlower), a data labeling and crowdsourcing company that created datasets for training machine learning models. Figure Eight was acquired by Appen in 2019 for $300 million. == Weights and Biases == In 2017, Biewald co-founded Weights & Biases with Chris Van Pelt and Shawn Lewis. The company provides tools for tracking machine learning experiments, model management, and collaborative AI and LLM app development. The platform has been adopted by organizations such as OpenAI, Salesforce, and Microsoft. In March 2025 Coreweave acquired Weights and Biases at $1.7 billion, with the transaction closing on May 5, 2025. == Gradient Dissent == Biewald hosts the bi-weekly podcast Gradient Dissent. Guest have included: Anthony Goldbloom – Co-founder & CEO of Kaggle. “How to Win Kaggle Competitions” (podcast, Sep. 9, 2020). Shared tips on data-science competitions from the founder of the largest ML community. Richard Socher – Founder & CEO of You.com; former Chief Scientist at Salesforce. “The Challenges of Making ML Work in the Real World” (podcast, September 28, 2020). A leading NLP researcher, he spoke on multimodal search engines powered by large language models. Jensen Huang – Founder & CEO of NVIDIA. “NVIDIA’s CEO on the Next Generation of AI and MLOps” (podcast, March 3, 2022). Huang’s GPUs power modern ML research and production. Emad Mostaque – Co-founder & CEO of Stability AI. “Stable Diffusion, Stability AI, and What’s Next” (podcast, Nov. 15, 2022). Leads the company behind Stable Diffusion, which helped spark the generative-AI imaging boom. Drago Anguelov – Head of Research at Waymo. “Robustness, Safety, and Scalability at Waymo” (podcast, July 14, 2022). Covered Waymo’s self-driving AI advances and deployment challenges. Jeremy Howard – Co-founder of fast.ai. “The Simple but Profound Insight Behind Diffusion” (podcast, Jan. 5, 2023). Known for democratizing deep-learning education; discussed diffusion models and accessible AI tooling. Aidan Gomez – Co-founder & CEO of Cohere. “Scaling LLMs and Accelerating Adoption” (podcast, April 20, 2023). Co-author of “Attention Is All You Need,” he shared how Cohere delivers large-scale NLP models as a service. Chelsea Finn – Stanford Assistant Professor (AI & Robotics). “Shaping the World of Robotics with Chelsea Finn” (podcast, February 15, 2024). A pioneer in meta-learning and robotics, she detailed robots learning complex tasks like cooking. Andrew Feldman – Co-founder & CEO of Cerebras Systems. "Launching the Fastest AI Inference Solution" (podcast, August 27, 2024). Described wafer-scale AI chips achieving new training performance records. Thomas Dohmke – CEO of GitHub. “GitHub CEO on Copilot and the Future of Software Development” (podcast, June 10, 2025). Discussed building Copilot and the future of AI-assisted coding. Martin Shkreli – Founder of Godel Terminal. “From Pharma to AGI Hype, and Developing AI in Finance: Martin Shkreli’s Journey” (podcast, May 20, 2025). Shkreli reflects on his pharma controversies, prison experience, and his new AI-driven trading platform. Jarek Kutylowski – Founder & CEO of DeepL. “How DeepL Built a Translation Powerhouse with AI” (podcast, July 8, 2025). Shared how DeepL’s neural-MT rivals Google Translate through model and infrastructure innovation. == Awards and recognition == In 2010, Lukas Biewald won the Netexplorateur Award for creating the GiveWork iPhone app, which allows users to perform small tasks that assist refugees and people in developing countries. In 2010, Inc Magazine included Biewald and Van Pelt on its list of the Top 30 Entrepreneurs Under 30. == Publications == Ensuring quality in crowdsourced search relevance evaluation: The effects of training question distribution by John Le, Andy Edmonds, Vaughn Hester, Lukas Biewald. SIGIR 2010 Workshop on Crowdsourcing for Search Evaluation, July 2010. Superficial Data Analysis: Exploring Millions of Social Stereotypes by Lukas Biewald, Brendan O’Connor. O’Reilly July 2009 Biewald has co-authored 26 AI research papers from 2004 through 2018.

    Read more →
  • Tertiary review

    Tertiary review

    In software engineering, a tertiary review is a systematic review of systematic reviews. It is also referred to as a tertiary study in the software engineering literature. However, Umbrella review is the term more commonly used in medicine. Kitchenham et al. suggest that methodologically there is no difference between a systematic review and a tertiary review. However, as the software engineering community has started performing tertiary reviews new concerns unique to tertiary reviews have surfaced. These include the challenge of quality assessment of systematic reviews, search validation and the additional risk of double counting. == Examples of Tertiary reviews in software engineering literature == Test quality Machine Learning Test-driven development

    Read more →
  • Embodied cognition

    Embodied cognition

    Embodied cognition represents a diverse group of theories which investigate how cognition is shaped by the bodily state and capacities of the organism. These embodied factors include the motor system, the perceptual system, bodily interactions with the environment (situatedness), and the assumptions about the world that shape the functional structure of the brain and body of the organism. Embodied cognition suggests that these elements are essential to a wide spectrum of cognitive functions, such as perception biases, memory recall, comprehension and high-level mental constructs (such as meaning attribution and categories) and performance on various cognitive tasks (reasoning or judgment). The embodied mind thesis challenges other theories, such as cognitivism, computationalism, and Cartesian dualism. It is closely related to the extended mind thesis, situated cognition, and enactivism. The modern version depends on understandings drawn from up-to-date research in psychology, linguistics, cognitive science, dynamical systems, artificial intelligence, robotics, animal cognition, plant cognition, and neurobiology. == Theory == Proponents of the embodied cognition thesis emphasize the active and significant role the body plays in the shaping of cognition and in the understanding of an agent's mind and cognitive capacities. In philosophy, embodied cognition holds that an agent's cognition, rather than being the product of mere (innate) abstract representations of the world, is strongly influenced by aspects of an agent's body beyond the brain itself. An embodied model of cognition opposes the disembodied Cartesian model, according to which all mental phenomena are non-physical and, therefore, not influenced by the body. With this opposition the embodiment thesis intends to reintroduce an agent's bodily experiences into any account of cognition. It is a rather broad thesis and encompasses both weak and strong variants of embodiment. In an attempt to reconcile cognitive science with human experience, the enactive approach to cognition defines "embodiment" as follows: By using the term embodied we mean to highlight two points: first that cognition depends upon the kinds of experience that come from having a body with various sensorimotor capacities, and second, that these individual sensorimotor capacities are themselves embedded in a more encompassing biological, psychological and cultural context. This double sense attributed to the embodiment thesis emphasizes the many aspects of cognition that researchers in different fields—such as philosophy, cognitive science, artificial intelligence, psychology, and neuroscience—are involved with. This general characterization of embodiment faces some difficulties: a consequence of this emphasis on the body, experience, culture, context, and the cognitive mechanisms of an agent in the world is that often distinct views and approaches to embodied cognition overlap. The theses of extended cognition and situated cognition, for example, are usually intertwined and not always carefully separated. And since each of the aspects of the embodiment thesis is endorsed to different degrees, embodied cognition should be better seen "as a research program rather than a well-defined unified theory". Some authors explain the embodiment thesis by arguing that cognition depends on an agent's body and its interactions with a determined environment. From this perspective, cognition in real biological systems is not an end in itself; it is constrained by the system's goals and capacities. Such constraints do not mean cognition is set by adaptive behavior (or autopoiesis) alone, but instead that cognition requires "some kind of information processing... the transformation or communication of incoming information". The acquiring of such information involves the agent's "exploration and modification of the environment". It would be a mistake, however, to suppose that cognition consists simply of building maximally accurate representations of input information...the gaining of knowledge is a stepping stone to achieving the more immediate goal of guiding behavior in response to the system's changing surroundings. Another approach to understanding embodied cognition comes from a narrower characterization of the embodiment thesis. The following narrower view of embodiment avoids any compromises to external sources other than the body and allows differentiating between embodied cognition, extended cognition, and situated cognition. Thus, the embodiment thesis can be specified as follows: Many features of cognition are embodied in that they are deeply dependent upon characteristics of the physical body of an agent, such that the agent's beyond-the-brain body plays a significant causal role, or a physically constitutive role, in that agent's cognitive processing. This thesis points out the core idea that an agent's body plays a significant role in shaping different features of cognition, such as perception, attention, memory, reasoning—among others. Likewise, these features of cognition depend on the kind of body an agent has. The thesis omits direct mention of some aspects of the "more encompassing biological, psychological and cultural context" included in the enactive definition, making it possible to separate embodied cognition, extended cognition, and situated cognition. In contrast to the embodiment thesis, the extended mind thesis limits cognitive processing neither to the brain nor even to the body, it extends it outward into the agent's world. Situated cognition emphasizes that this extension is not just a matter of including resources outside the head but stressing the role of probing and changing interactions with the agent's world. Cognition is situated in that it is inherently dependent upon the cultural and social contexts within which it takes place. This conceptual reframing of cognition as an activity influenced by the body has had significant implications. For instance, the view of cognition inherited by most contemporary cognitive neuroscience is internalist in nature. An agent's behavior along with its capacity to maintain (accurate) representations of the surrounding environment were considered as the product of "powerful brains that can maintain the world models and devise plans". From this perspective, cognizing was conceived as something that an isolated brain did. In contrast, accepting the role the body plays during cognitive processes allows us to account for a more encompassing view of cognition. This shift in perspective within neuroscience suggests that successful behavior in real-world scenarios demands the integration of several sensorimotor and cognitive (as well as affective) capacities of an agent. Thus, cognition emerges in the relationship between an agent and the affordances provided by the environment rather than in the brain alone. In 2002, a collection of positive characterizations summarizing what the embodiment thesis entails for cognition were offered. Professor of Cognitive Psychology Margaret Wilson argues that the general outlook of embodied cognition "displays an interesting co-variation of multiple observations and houses a number of different claims: (1) cognition is situated; (2) cognition is time-pressured; (3) we off-load cognitive work onto the environment; (4) the environment is part of the cognitive system; (5) cognition is for action; (6) offline cognition is bodily-based". According to Wilson, the first three and the fifth claim appear to be at least partially true, while the fourth claim is deeply problematic in that all things that have an impact on the elements of a system are not necessarily considered part of the system. The sixth claim has received the least attention in the literature on embodied cognition, yet it might be the most significant of the six claims as it shows how certain human cognitive capabilities, that previously were thought to be highly abstract, now appear to be leaning towards an embodied approach for their explanation. Wilson also describes at least five main (abstract) categories that combine both sensory and motor skills (or sensorimotor functions). The first three are working memory, episodic memory, and implicit memory; the fourth is mental imagery, and finally, the fifth concerns reasoning and problem solving. == History == The theory of embodied cognition, along with the multiple aspects it comprises, can be regarded as the imminent result of an intellectual skepticism towards the flourishment of the disembodied theory of mind put forth by René Descartes in the 17th century. According to Cartesian dualism, the mind is entirely distinct from the body and can be successfully explained and understood without reference to the body or to its processes. Research has been done to identify the set of ideas that would establish what could be considered as the early stages of embodied cognition around inquiries regarding the mind-body-soul rel

    Read more →
  • Yann LeCun

    Yann LeCun

    Yann André Le Cun ( lə-KUN; French: [ləkœ̃]; usually spelled LeCun; born 8 July 1960) is a French-American computer scientist working in the fields of artificial intelligence, machine learning, computer vision, robotics and image compression. He is the Jacob T. Schwartz Professor of Computer Science at the Courant Institute of Mathematical Sciences at New York University. He served as Chief AI Scientist at Meta Platforms before co-founding Advanced Machine Intelligence Labs in December 2025. He is well known for his work on optical character recognition and computer vision using convolutional neural networks (CNNs). He is also one of the main creators of the DjVu image compression technology, alongside Léon Bottou and Patrick Haffner. He co-developed the Lush programming language with Léon Bottou. In 2018, LeCun, Yoshua Bengio, and Geoffrey Hinton received the Turing Award from the Association for Computing Machinery (ACM) for their work on deep learning. LeCun, Bengio, and Hinton, and occasionally Jürgen Schmidhuber, are sometimes referred to as the "Godfathers of AI" and "Godfathers of Deep Learning". == Early life and education == Yann André Le Cun was born on 8 July 1960 at Soisy-sous-Montmorency, in the suburbs of Paris. His surname, Le Cun, derives from the old Breton form Le Cunff and originates from the region of Guingamp in northern Brittany. Yann is the Breton form of Jean, the French form of John. He received a Diplôme d'Ingénieur from the ESIEE Paris in 1983 and a PhD in computer science from Université Pierre et Marie Curie (now Sorbonne University) in 1987, during which he proposed an early form of backpropagation, an algorithm crucial for enabling neural networks to learn. Before joining AT&T, LeCun was a postdoctoral researcher for a year, starting in 1987, supervised by Geoffrey Hinton at the University of Toronto. LeCun has three sons, and his brother is employed by Google. He has American citizenship. == Career and research == LeCun's career has been spent primarily at Bell Labs, New York University and Meta Platforms, Inc. === Bell Labs === In 1988, LeCun joined the Adaptive Systems Research Department at AT&T Bell Laboratories in Holmdel, New Jersey, United States, headed by Lawrence D. Jackel, where he developed a number of new machine learning methods, such as a biologically inspired model of image recognition called convolutional neural networks (LeNet), the "Optimal Brain Damage" regularization methods, and the Graph Transformer Networks method (similar to conditional random field), which he applied to handwriting recognition and Optical character recognition (OCR). The bank check recognition system that he helped develop was widely deployed by NCR and other companies. In 1996, he joined AT&T Labs-Research as head of the Image Processing Research Department, which was part of Lawrence Rabiner's Speech and Image Processing Research Lab, and worked primarily on the DjVu image compression technology, a format designed for efficient distribution of scanned documents, and used by the Internet Archive to provide access to digitized texts. His collaborators at AT&T include Léon Bottou and Vladimir Vapnik. === New York University === After a brief tenure as a fellow of NEC Research Institute, LeCun joined New York University in 2003, where he is Jacob T. Schwartz Chaired Professor of Computer Science and Neural Science at the Courant Institute of Mathematical Sciences and the Center for Neural Science. At NYU, he has worked primarily on energy-based models for supervised and unsupervised learning, feature learning for object recognition in computer vision, and mobile robotics. In 2012, he became the founding director of the NYU Center for Data Science. On 9 December 2013, LeCun became the first director of Meta AI Research in New York City and in early 2014 stepped down from the NYU–CDS directorship. In 2013, he and Yoshua Bengio co-founded the International Conference on Learning Representations, which adopted a post-publication open review process he previously advocated on his website. He was the chair and organiser of the "Learning Workshop" held every year between 1986 and 2012 in Snowbird, Utah. He is a member of the Science Advisory Board of the Institute for Pure and Applied Mathematics at UCLA. He is the co-director of the Learning in Machines and Brain research program (formerly Neural Computation & Adaptive Perception) of CIFAR. In 2016, he was the visiting professor of computer science on the Chaire Annuelle Informatique et Sciences Numériques at Collège de France in Paris, where he presented the leçon inaugurale (inaugural lecture). In 2023, he was named as the inaugural Jacob T. Schwartz Chaired Professor in Computer Science at NYU's Courant Institute. LeCun is also a scientific advisor to French research group Kyutai which is being funded by Xavier Niel, Rodolphe Saadé, Eric Schmidt, and others. === Meta Platforms === LeCun joined Facebook (now Meta Platforms) in 2013 as chief AI scientist and led the company's AI research laboratory, FAIR. === AMI Labs === On 19 November 2025, LeCun confirmed that he would be leaving Meta after ten years to found his own company focused on world-model architectures and human-like artificial intelligence he calls superintelligence. The company he founded, Advanced Machine Intelligence Labs (or AMI Labs), is run by CEO Alex LeBrun, with LeCun serving as Executive Chair. This venture is focused on building AI "world models": systems that learn to understand the physical world's structure and dynamics rather than just predict text like large language models. In March 2026, AMI announced it had raised $1.03 billion in funding at a $3.5 billion pre-money valuation. The funding round was co-led by investors including Cathay Innovation, Greycroft, Hiro Capital, HV Capital and Bezos Expeditions. In January 2026, LeCun became founding chair of the Technical Research Board of Logical Intelligence, an AI company developing energy-based (EBM) reasoning systems. == Honours and awards == LeCun is a member of the US National Academy of Sciences, National Academy of Engineering and the French Académie des Sciences. He has received honorary doctorates from Instituto Politécnico Nacional (IPN) in Mexico City in 2016, from EPFL in 2018, from Université Côte d'Azur in 2021, from Università di Siena in 2023, and from Hong Kong University of Science and Technology in 2023. In 2014, he received the IEEE Neural Network Pioneer Award and in 2015, the PAMI Distinguished Researcher Award. In 2018, LeCun was awarded the IRI Medal, established by the Industrial Research Institute (IRI), and the Harold Pender Award, given by the University of Pennsylvania. In 2019, he received the Golden Plate Award of the American Academy of Achievement. In March 2019, LeCun won the 2018 Turing Award, sharing it with Yoshua Bengio and Geoffrey Hinton. In 2022, he received the Princess of Asturias Award in the category "Scientific Research", along with Yoshua Bengio, Geoffrey Hinton and Demis Hassabis. In 2023, the President of France made him a Chevalier (Knight) of the French Legion of Honour. During the World Economic Forum (WEF) 2024 in Davos, he received the Global Swiss AI Award 2023. The same year, he received the grand prize of the VinFuture Prize alongside Yoshua Bengio, Jensen Huang, Geoffrey Hinton, and Fei-Fei Li for their groundbreaking contributions to neural networks and deep learning algorithms. In 2025 he was awarded the Queen Elizabeth Prize for Engineering jointly with Yoshua Bengio, Bill Dally, Geoffrey E. Hinton, John Hopfield, Jensen Huang and Fei-Fei Li.

    Read more →
  • Darkforest

    Darkforest

    Darkforest is a computer go program developed by Meta Platforms, based on deep learning techniques using a convolutional neural network. Its updated version Darkfores2 combines the techniques of its predecessor with Monte Carlo tree search. The MCTS effectively takes tree search methods commonly seen in computer chess programs and randomizes them. With the update, the system is known as Darkfmcts3. Darkforest is of similar strength to programs like CrazyStone and Zen. It has been tested against a professional human player at the 2016 UEC cup. Google's AlphaGo program won against a professional player in October 2015 using a similar combination of techniques. Darkforest is named after Liu Cixin's science fiction novel The Dark Forest. == Background == Competing with top human players in the ancient game of Go has been a long-term goal of artificial intelligence. Go's high branching factor makes traditional search techniques ineffective, even on cutting-edge hardware, and Go's evaluation function could change drastically with one stone change. However, by using a Deep Convolutional Neural Network designed for long-term predictions, Darkforest has been able to substantially improve the win rate for bots over more traditional Monte Carlo Tree Search based approaches. === Matches === Against human players, Darkfores2 achieves a stable 3d ranking on KGS Go Server, which roughly corresponds to an advanced amateur human player. However, after adding Monte Carlo Tree Search to Darkfores2 to create a much stronger player named darkfmcts3, it can achieve a 5d ranking on the KGS Go Server. ==== Against other AI ==== darkfmcts3 is on par with state-of-the-art Go AIs such as Zen, DolBaram and Crazy Stone, but lags behind AlphaGo. It won 3rd place in January 2016 KGS Bot Tournament against other Go AIs. === News coverage === After Google's AlphaGo won against Fan Hui in 2015, Facebook made its AI's hardware designs public, alongside releasing the code behind DarkForest as open-source, in addition to heavy recruiting to strengthen its team of AI engineers. == Style of play == Darkforest uses a neural network to sort through the 10100 board positions, and find the most powerful next move. However, neural networks alone cannot match the level of good amateur players or the best search-based Go engines, and so Darkfores2 combines the neural network approach with a search-based machine. A database of 250,000 real Go games were used in the development of Darkforest, with 220,000 used as a training set and the rest used to test the neural network's ability to predict the next moves played in the real games. This allows Darkforest to accurately evaluate the global state of the board, but local tactics were still poor. Search-based engines have poor global evaluation, but are good at local tactics. Combining these two approaches is difficult because search-based engines work much faster than neural networks, a problem which was solved in Darkfores2 by running the processes in parallel with frequent communication between the two. === Conventional strategies === Go is generally played by analyzing the position of the stones on the board. Various advanced players have described it as playing in some part subconsciously. Unlike chess and checkers, where AI players can simply look further forward at moves than human players, but with each round of Go having on average 250 possible moves, that approach is ineffective. Instead, neural networks copy human play by training the AI systems on images of successful moves, the AI can effectively learn how to interpret how the board looks, as many grandmasters do. In November 2015, Facebook demonstrated the combination of MCTS with neural networks, which played with a style that "felt human". === Flaws === It has been noted that Darkforest still has flaws in its playstyle. The bot sometimes plays tenuki ("move elsewhere") pointlessly when local powerful moves are required. When the bot is losing, it shows the typical behavior of MCTS, it plays bad moves and loses more. The Facebook AI team has acknowledged these as areas of future improvement. == Program architecture == The family of Darkforest computer go programs is based on convolution neural networks. The most recent advances in Darkfmcts3 combined convolutional neural networks with more traditional Monte Carlo tree search. Darkfmcts3 is the most advanced version of Darkforest, which combines Facebook's most advanced convolutional neural network architecture from Darkfores2 with a Monte Carlo tree search. Darkfmcts3 relies on a convolution neural networks that predicts the next k moves based on the current state of play. It treats the board as a 19x19 image with multiple channels. Each channel represents a different aspect of board information based upon the specific style of play. For standard and extended play, there are 21 and 25 different channels, respectively. In standard play, each players liberties are represented as six binary channels or planes. The respective plane is true if the player one, two, or three or more liberties available. Ko (i.e. illegal moves) is represented as one binary plane. Stone placement for each opponent and empty board positions are represented as three binary planes, and the duration since a stone has been placed is represented as real numbers on two planes, one for each player. Lastly, the opponents rank is represented by nine binary planes, where if all are true, the player is a 9d level, if 8 are true, an 8d level, and so forth. Extended play additionally considers the border (binary plane that is true at the border), position mask (represented as distance from the board center, i.e. x ( − 0.5 ∗ d i s t a n c e 2 ) {\displaystyle x^{(-0.5distance^{2})}} , where x {\displaystyle x} is a real number at a position), and each player's territory (binary, based on which player a location is closer to). Darkfmct3 uses a 12-layer full convolutional network with a width of 384 nodes without weight sharing or pooling. Each convolutional layer is followed by a rectified linear unit, a popular activation function for deep neural networks. A key innovation of Darkfmct3 compared to previous approaches is that it uses only one softmax function to predict the next move, which enables the approach to reduce the overall number of parameters. Darkfmct3 was trained against 300 random selected games from an empirical dataset representing different game stages. The learning rate was determined by vanilla stochastic gradient descent. Darkfmct3 synchronously couples a convolutional neural network with a Monte Carlo tree search. Since the convolutional neural network is computationally taxing, the Monte Carlo tree search focuses computation on the more likely game play trajectories. By running the neural network synchronously with the Monte Carlo tree search, it is possible to guarantee that each node is expanded by the moves predicted by the neural network. == Comparison with other systems == Darkfores2 beats Darkforest, its neural network-only predecessor, around 90% of the time, and Pachi, one of the best search-based engines, around 95% of the time. On the Kyu rating system, Darkforest holds a 1-2d level. Darkfores2 achieves a stable 3d level on KGS Go Server as a ranked bot. With the added Monte Carlo tree search, Darkfmcts3 with 5,000 rollouts beats Pachi with 10k rollouts in all 250 games; with 75k rollouts it achieves a stable 5d level in KGS server, on par with state-of-the-art Go AIs (e.g., Zen, DolBaram, CrazyStone); with 110k rollouts, it won the 3rd place in January KGS Go Tournament.

    Read more →
  • Feed forward (control)

    Feed forward (control)

    A feed forward (sometimes written feedforward) is an element or pathway within a control system that passes a controlling signal from a source in its external environment to a load elsewhere in its external environment. This is often a command signal from an external operator. In control engineering, a feedforward control system is a control system that uses sensors to detect disturbances affecting the system and then applies an additional input to minimize the effect of the disturbance. This requires a mathematical model of the system so that the effect of disturbances can be properly predicted. A control system which has only feed-forward behavior responds to its control signal in a pre-defined way without responding to the way the system reacts; it is in contrast with a system that also has feedback, which adjusts the input to take account of how it affects the system, and how the system itself may vary unpredictably. In a feed-forward system, the control variable adjustment is not error-based. Instead it is based on knowledge about the process in the form of a mathematical model of the process and knowledge about, or measurements of, the process disturbances. Some prerequisites are needed for control scheme to be reliable by pure feed-forward without feedback: the external command or controlling signal must be available, and the effect of the output of the system on the load should be known (that usually means that the load must be predictably unchanging with time). Sometimes pure feed-forward control without feedback is called 'ballistic', because once a control signal has been sent, it cannot be further adjusted; any corrective adjustment must be by way of a new control signal. In contrast, 'cruise control' adjusts the output in response to the load that it encounters, by a feedback mechanism. These systems could relate to control theory, physiology, or computing. == Overview == With feed-forward or feedforward control, the disturbances are measured and accounted for before they have time to affect the system. In the house example, a feed-forward system may measure the fact that the door is opened and automatically turn on the heater before the house can get too cold. The difficulty with feed-forward control is that the effects of the disturbances on the system must be accurately predicted, and there must not be any unmeasured disturbances. For instance, if a window was opened that was not being measured, the feed-forward-controlled thermostat might let the house cool down. The term has specific meaning within the field of CPU-based automatic control. The discipline of feedforward control as it relates to modern, CPU based automatic controls is widely discussed, but is seldom practiced due to the difficulty and expense of developing or providing for the mathematical model required to facilitate this type of control. Open-loop control and feedback control, often based on canned PID control algorithms, are much more widely used. There are three types of control systems: open-loop, feed-forward, and feedback. An example of a pure open-loop control system is manual non-power-assisted steering of a motor car; the steering system does not have access to an auxiliary power source and does not respond to varying resistance to turning of the direction wheels; the driver must make that response without help from the steering system. In comparison, power steering has access to a controlled auxiliary power source, which depends on the engine speed. When the steering wheel is turned, a valve is opened which allows fluid under pressure to turn the wheels. A sensor monitors that pressure so that the valve only opens enough to cause the correct pressure to reach the wheel turning mechanism. This is feed-forward control where the output of the system, the change in direction of travel of the vehicle, plays no part in the system. See Model predictive control. If the driver is included in the system, then they do provide a feedback path by observing the direction of travel and compensating for errors by turning the steering wheel. In that case you have a feedback system, and the block labeled System in Figure(c) is a feed-forward system. In other words, systems of different types can be nested, and the overall system regarded as a black-box. Feedforward control is distinctly different from open-loop control and teleoperator systems. Feedforward control requires a mathematical model of the plant (process and/or machine being controlled) and the plant's relationship to any inputs or feedback the system might receive. Neither open-loop control nor teleoperator systems require the sophistication of a mathematical model of the physical system or plant being controlled. Control based on operator input without integral processing and interpretation through a mathematical model of the system is a teleoperator system and is not considered feedforward control. == History == Historically, the use of the term feedforward is found in works by Harold S. Black in US patent 1686792 (invented 17 March 1923) and D. M. MacKay as early as 1956. While MacKay's work is in the field of biological control theory, he speaks only of feedforward systems. MacKay does not mention feedforward control or allude to the discipline of feedforward controls. MacKay and other early writers who use the term feedforward are generally writing about theories of how human or animal brains work. Black also has US patent 2102671 invented 2 August 1927 on the technique of feedback applied to electronic systems. The discipline of feedforward controls was largely developed by professors and graduate students at Georgia Tech, MIT, Stanford and Carnegie Mellon. Feedforward is not typically hyphenated in scholarly publications. Meckl and Seering of MIT and Book and Dickerson of Georgia Tech began the development of the concepts of Feedforward Control in the mid-1970s. The discipline of Feedforward Controls was well defined in many scholarly papers, articles and books by the late 1980s. == Benefits == The benefits of feedforward control are significant and can often justify the extra cost, time and effort required to implement the technology. Control accuracy can often be improved by as much as an order of magnitude if the mathematical model is of sufficient quality and implementation of the feedforward control law is well thought out. Energy consumption by the feedforward control system and its driver is typically substantially lower than with other controls. Stability is enhanced such that the controlled device can be built of lower cost, lighter weight, springier materials while still being highly accurate and able to operate at high speeds. Other benefits of feedforward control include reduced wear and tear on equipment, lower maintenance costs, higher reliability and a substantial reduction in hysteresis. Feedforward control is often combined with feedback control to optimize performance. == Model == The mathematical model of the plant (machine, process or organism) used by the feedforward control system may be created and input by a control engineer or it may be learned by the control system. Control systems capable of learning and/or adapting their mathematical model have become more practical as microprocessor speeds have increased. The discipline of modern feedforward control was itself made possible by the invention of microprocessors. Feedforward control requires integration of the mathematical model into the control algorithm such that it is used to determine the control actions based on what is known about the state of the system being controlled. In the case of control for a lightweight, flexible robotic arm, this could be as simple as compensating between when the robot arm is carrying a payload and when it is not. The target joint angles are adjusted to place the payload in the desired position based on knowing the deflections in the arm from the mathematical model's interpretation of the disturbance caused by the payload. Systems that plan actions and then pass the plan to a different system for execution do not satisfy the above definition of feedforward control. Unless the system includes a means to detect a disturbance or receive an input and process that input through the mathematical model to determine the required modification to the control action, it is not true feedforward control. === Open system === In control theory, an open system is a feed forward system that does not have any feedback loop to control its output. In contrast, a closed system uses on a feedback loop to control the operation of the system. In an open system, the output of the system is not fed back into the input to the system for control or operation. == Applications == === Physiological feed-forward system === In physiology, feed-forward control is exemplified by the normal anticipatory regulation of heartbeat in advance of actual physical exertion by the central autonomic network. Feed-forward

    Read more →
  • Otterly.ai

    Otterly.ai

    Otterly.ai is an Austrian software company, founded in 2024, that provides tools for generative engine optimization, the practice of monitoring and optimizing results in large language models. == History == Otterly.ai was co-founded in 2024 by Thomas Peham, Klaus-M. Schremser and Josef Trauner. The concept for OtterlyAI was developed in response to the increasing use of generative AI tools in digital search and content discovery. The company announced a technology partnership with SEO platform Semrush in January 2025.

    Read more →
  • Liang Wenfeng

    Liang Wenfeng

    Liang Wenfeng (Chinese: 梁文锋; pinyin: Liáng Wénfēng; born 1985) is a Chinese entrepreneur and businessman who is the co-founder of the quantitative hedge fund High-Flyer, as well as the founder and CEO of its artificial intelligence company DeepSeek. Liang attended Zhejiang University, and began his career by applying machine learning methods to quantitative finance. Through High-Flyer, he built large-scale computing infrastructure that was later used to support artificial intelligence research, leading to the creation of DeepSeek in 2023. DeepSeek gained international attention following the release of DeepSeek-R1, which analysts described as demonstrating high-level performance with comparatively limited compute resources. In 2025, Liang was named to Time magazine's list of 100 Most Influential People in AI and Fortune's list of the Most Powerful People in Business. == Early life == Liang was born in 1985 in the village of Mililing (米历岭村), Qinba town (覃巴镇), Wuchuan city (吴川市), Guangdong. His parents were both primary school teachers. Liang was routinely praised by both locals and teachers alike. Even since middle school, Liang was recalled for being well-known for reading comic books, while also being very proficient in mathematics. == Education == After elementary school, Liang attended Wuchuan No. 1 Middle School. There, he quickly excelled in class and ranked highly amongst his peers. He taught himself high school and university-level mathematics courses. Liang then attended Wuchaun No. 1 High School. In these years, he developed hobbies of mathematical modeling and conducting research projects. Compared to his peers, he was always ranked highly. For every mathematics exam, he always ranked within the top three. He was also the top scorer in the Zhanjiang region of Guangdong for the college entrance exam. Thus, in 2002, Liang left high school early to further pursue his education at the university level at the young age of 17. Attending Zhejiang University at the age of 17, Liang earned a Bachelor of Engineering in Electronic Information Engineering in 2007 and his Master of Engineering in Information & Communication Engineering in 2010. His master's dissertation was titled "Study on Object Tracking Algorithm Based on Low-Cost PTZ camera" (基于低成本PTZ摄像机的目标跟踪算法研究). In his college years, DJI founder Wang Tao asked Liang to join as a co-founder. Liang declined the invitation to pursue artificial intelligence methodologies in financial markets. While he states that those around him had entrepreneurial mindsets, he himself valued academics. == Career == === Early career (2008–2016) === During the 2008 financial crisis, Liang formed a team with his classmates to accumulate data related to financial markets. He also led the team to explore quantitative trading using machine learning and other technologies. After his graduation, Liang moved to a cheap flat in Chengdu, Sichuan, where he experimented with ways to apply AI to various fields. These ventures failed, until he tried applying AI to finance. In 2013, Liang attempted to integrate artificial intelligence with quantitative trading and founded Hangzhou Yakebi Investment Management Co Ltd with Xu Jin, an alumnus of Zhejiang University. In 2015, they co-founded Hangzhou Huanfang Technology Co Ltd, which is today's Zhejiang Jiuzhang Asset Management Co Ltd. === High-Flyer (2016–2023) === In February 2016, Liang and two other engineering classmates co-founded Ningbo High-Flyer Quantitative Investment Management Partnership (Limited Partnership). The team relied on mathematics and AI to make investments. Much of the early startup culture was described by former employees to be "geeky" and "quirky," often seen as contrary to the existing culture in large Chinese tech companies. In 2019, Liang founded High-Flyer AI which was dedicated to research on AI algorithms and its basic applications. By this time, High-Flyer had over 10 billion yuan in assets under management. On 30 August 2019, Liang Wenfeng delivered a keynote speech entitled "The Future of Quantitative Investment in China from a Programmer's Perspective" at the Private Equity Golden Bull Award ceremony held by China Securities Journal, and sparked heated discussions. Liang stated that the criterion for determining what is quantitative or non-quantitative is whether the investment decision is made by quantitative methods or by people. Quantitative funds do not have portfolio managers making the decisions and instead are just servers. He also stated High-Flyer's mission is to improve the effectiveness of China's secondary market. In February 2021, Gregory Zuckerman's book The Man Who Solved the Market: How Jim Simons Launched the Quant Revolution was published. Liang wrote the preface for the Chinese edition of the book where he stated that whenever he encountered difficulties at work, he would think of Simons' words "There must be a way to model prices". In January 2025, Zuckerman wrote in The Wall Street Journal where he acknowledged this fact and stated he has been trying to get in touch with Liang but much like Simons, Liang is very secretive and difficult to contact. During 2021, Liang started buying thousands of Nvidia GPUs for his AI side project while running High-Flyer. Liang wanted to build something and it will be a game changer which his business partners thought was only possible from giants such as ByteDance and Alibaba Group. === DeepSeek (since 2023) === ==== DeepSeek begins ==== In May 2023, Liang announced High-Flyer would pursue the development of artificial general intelligence and launched DeepSeek. During that month in an interview with 36Kr, Liang stated that High-Flyer had acquired 10,000 Nvidia A100 GPUs before the US government imposed AI chip restrictions on China. That laid the foundation for DeepSeek to operate as an LLM developer. Liang also stated DeepSeek gets funding from High-Flyer. This was because when DeepSeek was founded, venture capital firms were reluctant in providing funding as it was unlikely that it would be able to generate an exit in a short period of time. Liang only personally holds 1% of the company, with 99% of the company being held by Ningbo High-Flyer Quantitative Investment Management Partnership (Limited Partnership). With DeepSeek's funding model, it lacks commercial pressure and rigid key performance indicators, enabling the company to deviate from previously established model architectures. ==== Early development ==== In July 2024, Liang was interviewed again by 36Kr. He stated that when DeepSeek-V2 was released and triggered an AI price war in China, it came as a huge surprise as the team did not expect pricing to be so sensitive. Liang's aggressive pricing of the language model forced domestic tech giants including Alibaba and Baidu to cut their own rates by over 95%. He also stated that as China's economy develops, it should gradually become a contributor instead of freeriding. What is lacking in China's innovation is not capital but a lack of confidence and knowledge on organizing talent into it. DeepSeek has not hired anyone particularly special and employees tend to be locally educated. When it comes to disruptive technologies, closed source approaches can only temporarily delay others in catching up. As the goal was long-term, DeepSeek sought employees who had ability and passion rather than experience. To retain a high talent density relative to larger firms like Bytedance or Baidu, DeepSeek aimed to maintain a low-hierarchy corporate culture, with members working in project-based groups, as well as competitive compensation. Liang emphasized his vision for DeepSeek employees to bring their "unique experience and ideas" instead of needing to be explicitly directed, with an overall bottom-up approach to division of labor. Liang noted that a significant outcome of this approach was the multi-head latent attention training architecture, which was attributed directly to a young DeepSeek researcher's personal interest. This advancement played a core role in reducing the cost of training the DeepSeek-V3 model, released in December 2024. ==== Release of DeepSeek-R1 ==== Also on 20 January 2025, DeepSeek, the company Liang founded and served as the CEO, released DeepSeek-R1, a 671-billion-parameter open-source reasoning AI model, alongside the publication of a detailed technical paper explaining its architecture and training methodology. The model was built using just 2,048 Nvidia H800 GPUs at a cost of $5.6 million, showcasing a resource-efficient approach that contrasted sharply with the billion-dollar budgets of Western competitors. The development of DeepSeek-R1 occurred amidst U.S. sanctions where Trump limited sales of Nvidia chips to China. By 27 January, DeepSeek surpassed ChatGPT to become the #1 free app on the United States iOS App Store. U.S. stocks plummeted, as more than $1 trillion was erased in market capitalization amid panic over DeepSeek. Technology journ

    Read more →
  • Daisy Intelligence

    Daisy Intelligence

    Daisy Intelligence is a Canadian artificial intelligence (AI) company that provides data analysis services to help retailers, mainly grocers and supermarkets, to determine optimal pricing and promotional mix. The company also helps insurance companies detect fraudulent claims. The company uses a subset of AI known as reinforcement learning. In October 2019, the company moved from the suburban Vaughan, Ontario, to downtown Toronto, joining other AI and technology startups concentrated in the King Street East area. In 2019, the company was ranked No. 39 on The Globe and Mail's annual list of Canada's "top growing companies by three-year revenue growth."

    Read more →
  • Framebuffer

    Framebuffer

    A framebuffer (frame buffer, or sometimes framestore) is a portion of random-access memory (RAM) containing a bitmap that drives a video display. It is a memory buffer containing data representing all the pixels in a complete video frame. Modern video cards contain framebuffer circuitry in their cores. This circuitry converts an in-memory bitmap into a video signal that can be displayed on a computer monitor. In computing, a screen buffer is a part of computer memory used by a computer application for the representation of the content to be shown on the computer display. The screen buffer may also be called the video buffer, the regeneration buffer, or regen buffer for short. The phrase "screen buffer” refers to a logical function, while video memory refers to a hardware storage location. In particular, the screen buffer may be placed in the main RAM, the video memory, or some other hardware location. To reduce latency and avoid screen tearing, multiple frames can be buffered, and this technique is called multiple buffering. When this is so, at any time, only one frame would be visible, and the others would not be. The currently invisible frames are located in the off-screen buffer. The information in the buffer typically consists of color values for every pixel to be shown on the display. Color values are commonly stored in 1-bit binary (monochrome), 4-bit palettized, 8-bit palettized, 16-bit high color and 24-bit true color formats. An additional alpha channel is sometimes used to retain information about pixel transparency. The total amount of memory required for the framebuffer depends on the resolution of the output signal, and on the color depth or palette size. == History == Computer researchers had long discussed the theoretical advantages of a framebuffer but were unable to produce a machine with sufficient memory at an economically practicable cost. In 1947, the Manchester Baby computer used a Williams tube, later the Williams-Kilburn tube, to store 1024 bits on a cathode-ray tube (CRT) memory and displayed on a second CRT. Other research labs were exploring these techniques with MIT Lincoln Laboratory achieving a 4096 display in 1950. A color-scanned display was implemented in the late 1960s, called the Brookhaven RAster Display (BRAD), which used a drum memory and a television monitor. In 1969, A. Michael Noll of Bell Telephone Laboratories, Inc. implemented a scanned display with a frame buffer, using magnetic-core memory. A year or so later, the Bell Labs system was expanded to display an image with a color depth of three bits on a standard color TV monitor. The vector graphics used in the computer had to be converted for the scanned graphics of a TV display. In the early 1970s, the development of MOS memory (metal–oxide–semiconductor memory) integrated-circuit chips, particularly high-density DRAM (dynamic random-access memory) chips with at least 1 kb memory, made it practical to create, for the first time, a digital memory system with framebuffers capable of holding a standard video image. This led to the development of the SuperPaint system by Richard Shoup at Xerox PARC in 1972. Shoup was able to use the SuperPaint framebuffer to create an early digital video-capture system. By synchronizing the output signal to the input signal, Shoup was able to overwrite each pixel of data as it shifted in. Shoup also experimented with modifying the output signal using color tables. These color tables allowed the SuperPaint system to produce a wide variety of colors outside the range of the limited 8-bit data it contained. This scheme would later become commonplace in computer framebuffers. In 1974, Evans & Sutherland released the first commercial framebuffer, the Picture System, costing about $15,000. It was capable of producing resolutions of up to 512 by 512 pixels in 8-bit grayscale, and became a boon for graphics researchers who did not have the resources to build their own framebuffer. The New York Institute of Technology would later create the first 24-bit color system using three of the Evans & Sutherland framebuffers. Each framebuffer was connected to an RGB color output (one for red, one for green and one for blue), with a Digital Equipment Corporation PDP 11/04 minicomputer controlling the three devices as one. In 1975, the UK company Quantel produced the first commercial full-color broadcast framebuffer, the Quantel DFS 3000. It was first used in TV coverage of the 1976 Montreal Olympics to generate a picture-in-picture inset of the Olympic flaming torch while the rest of the picture featured the runner entering the stadium. The rapid improvement of integrated-circuit technology made it possible for many of the home computers of the late 1970s to contain low-color-depth framebuffers. Today, nearly all computers with graphical capabilities utilize a framebuffer for generating the video signal. Amiga computers, created in the 1980s, featured special design attention to graphics performance and included a unique Hold-And-Modify framebuffer capable of displaying 4096 colors. Framebuffers also became popular in high-end workstations and arcade system boards throughout the 1980s. SGI, Sun Microsystems, HP, DEC and IBM all released framebuffers for their workstation computers in this period. These framebuffers were usually of a much higher quality than could be found in most home computers, and were regularly used in television, printing, computer modeling and 3D graphics. Framebuffers were also used by Sega for its high-end arcade boards, which were also of a higher quality than on home computers. == Display modes == Framebuffers used in personal and home computing often had sets of defined modes under which the framebuffer can operate. These modes reconfigure the hardware to output different resolutions, color depths, memory layouts and refresh rate timings. In the world of Unix machines and operating systems, such conveniences were usually eschewed in favor of directly manipulating the hardware settings. This manipulation was far more flexible in that any resolution, color depth and refresh rate was attainable – limited only by the memory available to the framebuffer. An unfortunate side-effect of this method was that the display device could be driven beyond its capabilities. In some cases, this resulted in hardware damage to the display. More commonly, it simply produced garbled and unusable output. Modern CRT monitors fix this problem through the introduction of protection circuitry. When the display mode is changed, the monitor attempts to obtain a signal lock on the new refresh frequency. If the monitor is unable to obtain a signal lock or if the signal is outside the range of its design limitations, the monitor will ignore the framebuffer signal and possibly present the user with an error message. LCD monitors tend to contain similar protection circuitry, but for different reasons. Since the LCD must digitally sample the display signal (thereby emulating an electron beam), any signal that is out of range cannot be physically displayed on the monitor. == Color palette == Framebuffers have traditionally supported a wide variety of color modes. Due to the expense of memory, most early framebuffers used 1-bit (2 colors per pixel), 2-bit (4 colors), 4-bit (16 colors) or 8-bit (256 colors) color depths. The problem with such small color depths is that a full range of colors cannot be produced. The solution to this problem was indexed color, which adds a lookup table to the framebuffer. Each color stored in framebuffer memory acts as a color index. The lookup table serves as a palette with a limited number of different colors, while the rest is used as an index table. Here is a typical indexed 256-color image and its own palette (shown as a rectangle of swatches): In some designs, it was also possible to write data to the lookup table (or switch between existing palettes) on the fly, allowing dividing the picture into horizontal bars with their own palette and thus rendering an image that had a far wider palette. For example, viewing an outdoor shot photograph, the picture could be divided into four bars: the top one with emphasis on sky tones, the next with foliage tones, the next with skin and clothing tones, and the bottom one with ground colors. This required each palette to have overlapping colors, but, carefully done, allowed great flexibility. == Memory access == While framebuffers are commonly accessed via a memory mapping directly to the CPU memory space, this is not the only method by which they may be accessed. Framebuffers have varied widely in the methods used to access memory. Some of the most common are: Mapping the entire framebuffer to a given memory range. Port commands to set each pixel, range of pixels or palette entry. Mapping a memory range smaller than the framebuffer memory, then bank switching as necessary. The framebuffer organization may be packed pixel or planar. The framebuffer may be all

    Read more →
  • Visual hierarchy

    Visual hierarchy

    Visual hierarchy, in Gestalt psychology, describes how particular elements in a visual field stand out more than others in a pattern, creating a perceived order of importance. Although it can occur naturally, the term is most often used in design—especially graphic design and cartography—where elements are arranged to appear more important than others. This order is created by the visual contrast between forms in a field of perception. Objects with highest contrast to their surroundings are recognized first by the human mind. == Evidence == There is some scientific evidence for visual hierarchy using eye tracking. For example, one study found that when people agree that a graphic design is good, they exhibit more similar eye movements; measured by the Fréchet distance. == Theory == The concept of visual hierarchy is based in Gestalt psychological theory, an early 20th-century German theory that proposes that the human brain has innate organizing tendencies that “structure individual elements, shapes or forms into a coherent, organized whole,” especially when processing visual information. The German word Gestalt translates into “form,” “pattern,” or “shape” in English. When an element in a visual field disconnects from the ‘whole’ created by the brain's perceptual organization, it “stands out” to the viewer. The shapes that disconnect most severely from their surroundings stand out the most. This is commonly encapsulated as the Von Restorff effect, which states that isolation attracts attention. === Physical characteristics === The brain distinguishes objects based on differences in their physical appearances. These characteristics fall into four categories: color, size, alignment, and character. Each type of contrast can be used to construct a visual hierarchy. The same characteristics are also sometimes categorized (especially among cartographers) according to the visual variables of Jacques Bertin. Color encompasses the hue, saturation, value, and perceived texture of forms. Dark figures will stand out on a light background, light figures will stand out on a dark background, brightly colored figures will stand out on a muted background, and so on. The fluorescent colors used for tennis balls and other sports equipment is intended to make them instantly stand out against almost any natural visual field. Size has a strong influence on visual hierarchy. Large elements typically attract attention, provided that they can be recognized as figures. Alignment is the arrangement of forms relative to one another. For example, items in the upper left corner of a page are often seen first (at least for those readers accustomed to western languages), the center of the field has prominence. Negative space can also be employed: a figure isolated among large amounts of white space will stand out more than one amid other figures. Character includes several kinds of contrasts based on shape. For example, complex patterns attract more attention than simple or predictable patterns, intricate shapes attract more attention than generalized ones. Even large-scale patterns can attract attention if they contrast with the pattern in the remainder of the visual field. Camouflage is an example of eliminating contrast in character in color and/or character specifically to reduce visual hierarchy. The "squint test" is often suggested as a simple, if unscientific, method to evaluate the visual hierarchy of a graphical product like a map or web page. When viewed out of focus (or from a great distance), the viewer is not distracted by details, but can only see overall (gestalt) patterns such as visual hierarchy. All of the above patterns, except some aspects of character, are recognizable by this method. == Application == Visual hierarchy is an important concept in the field of graphic design, a field that specializes in visual organization. Designers attempt to control visual hierarchy to guide the eye to information in a specific order for a specific purpose. One could compare visual hierarchy in graphic design to grammatical structure in writing in terms of the importance of each principle to these fields. === Cartography === In cartographic design, visual hierarchy is used to emphasize certain important features on a map over less important features. Typically, a map has a purpose that dictates a conceptual hierarchy of what should be more or less important, so one of the goals of the choice of map symbols is to match the visual hierarchy to the conceptual hierarchy. The Visual hierarchy of a map may apply to individual geographic features (such as making a single country stand out), to map layers of related features (e.g., making lakes stand out more than roads), and to the entire layout of map and non-map elements (e.g., making the title look more important than the scale bar). Like the main map elements, such features have weight, and the properties that apply to visual hierarchy of map layers also apply to other elements on the page. Size and alignment are the two main determinants of the visual hierarchy for these features. Cartographers often utilize principles of negative space and figure-ground contrast to design an appropriate visual hierarchy by employing contrast between unused space and layout features. === User experience design and behavioral design === In user experience design and behavioural design, such as web design, visual hierarchy is used to prioritize navigational structures and content, so that audiences focus on elements that facilitate system usage, or increases the chance that they notice content that contains psychological nudges. Color is one of many factors used in the design of a visual hierarchy, and a key factor due to the high salience of color perception.

    Read more →
  • Capsule neural network

    Capsule neural network

    A capsule neural network (CapsNet) is a machine learning system that is a type of artificial neural network (ANN) that can be used to better model hierarchical relationships. The approach is an attempt to more closely mimic biological neural organization. The idea is to add structures called "capsules" to a convolutional neural network (CNN), and to reuse output from several of those capsules to form more stable (with respect to various perturbations) representations for higher capsules. The output is a vector consisting of the probability of an observation, and a pose for that observation. This vector is similar to what is done for example when doing classification with localization in CNNs. Among other benefits, capsnets address the "Picasso problem" in image recognition: images that have all the right parts but that are not in the correct spatial relationship (e.g., in a "face", the positions of the mouth and one eye are switched). For image recognition, capsnets exploit the fact that while viewpoint changes have nonlinear effects at the pixel level, they have linear effects at the part/object level. This can be compared to inverting the rendering of an object of multiple parts. == History == In 2000, Geoffrey Hinton et al. described an imaging system that combined segmentation and recognition into a single inference process using parse trees. So-called credibility networks described the joint distribution over the latent variables and over the possible parse trees. That system proved useful on the MNIST handwritten digit database. A dynamic routing mechanism for capsule networks was introduced by Hinton and his team in 2017. The approach was claimed to reduce error rates on MNIST and to reduce training set sizes. Results were claimed to be considerably better than a CNN on highly overlapped digits. In Hinton's original idea one minicolumn would represent and detect one multidimensional entity. == Transformations == An invariant is an object property that does not change as a result of some transformation. For example, the area of a circle does not change if the circle is shifted to the left. Informally, an equivariant is a property that changes predictably under transformation. For example, the center of a circle moves by the same amount as the circle when shifted. A nonequivariant is a property whose value does not change predictably under a transformation. For example, transforming a circle into an ellipse means that its perimeter can no longer be computed as π times the diameter. In computer vision, the class of an object is expected to be an invariant over many transformations. I.e., a cat is still a cat if it is shifted, turned upside down or shrunken in size. However, many other properties are instead equivariant. The volume of a cat changes when it is scaled. Equivariant properties such as a spatial relationship are captured in a pose, data that describes an object's translation, rotation, scale and reflection. Translation is a change in location in one or more dimensions. Rotation is a change in orientation. Scale is a change in size. Reflection is a mirror image. Unsupervised capsnets learn a global linear manifold between an object and its pose as a matrix of weights. In other words, capsnets can identify an object independent of its pose, rather than having to learn to recognize the object while including its spatial relationships as part of the object. In capsnets, the pose can incorporate properties other than spatial relationships, e.g., color (cats can be of various colors). Multiplying the object by the manifold poses the object (for an object, in space). == Pooling == Capsnets reject the pooling layer strategy of conventional CNNs that reduces the amount of detail to be processed at the next higher layer. Pooling allows a degree of translational invariance (it can recognize the same object in a somewhat different location) and allows a larger number of feature types to be represented. Capsnet proponents argue that pooling: violates biological shape perception in that it has no intrinsic coordinate frame; provides invariance (discarding positional information) instead of equivariance (disentangling that information); ignores the linear manifold that underlies many variations among images; routes statically instead of communicating a potential "find" to the feature that can appreciate it; damages nearby feature detectors, by deleting the information they rely upon. == Capsules == A capsule is a set of neurons that individually activate for various properties of a type of object, such as position, size and hue. Formally, a capsule is a set of neurons that collectively produce an activity vector with one element for each neuron to hold that neuron's instantiation value (e.g., hue). Graphics programs use instantiation value to draw an object. Capsnets attempt to derive these from their input. The probability of the entity's presence in a specific input is the vector's length, while the vector's orientation quantifies the capsule's properties. Artificial neurons traditionally output a scalar, real-valued activation that loosely represents the probability of an observation. Capsnets replace scalar-output feature detectors with vector-output capsules and max-pooling with routing-by-agreement. Because capsules are independent, when multiple capsules agree, the probability of correct detection is much higher. A minimal cluster of two capsules considering a six-dimensional entity would agree within 10% by chance only once in a million trials. As the number of dimensions increase, the likelihood of a chance agreement across a larger cluster with higher dimensions decreases exponentially. Capsules in higher layers take outputs from capsules at lower layers, and accept those whose outputs cluster. A cluster causes the higher capsule to output a high probability of observation that an entity is present and also output a high-dimensional (20-50+) pose. Higher-level capsules ignore outliers, concentrating on clusters. This is similar to the Hough transform, the RHT and RANSAC from classic digital image processing. == Routing by agreement == The outputs from one capsule (child) are routed to capsules in the next layer (parent) according to the child's ability to predict the parents' outputs. Over the course of a few iterations, each parents' outputs may converge with the predictions of some children and diverge from those of others, meaning that that parent is present or absent from the scene. For each possible parent, each child computes a prediction vector by multiplying its output by a weight matrix (trained by backpropagation). Next the output of the parent is computed as the scalar product of a prediction with a coefficient representing the probability that this child belongs to that parent. A child whose predictions are relatively close to the resulting output successively increases the coefficient between that parent and child and decreases it for parents that it matches less well. This increases the contribution that that child makes to that parent, thus increasing the scalar product of the capsule's prediction with the parent's output. After a few iterations, the coefficients strongly connect a parent to its most likely children, indicating that the presence of the children imply the presence of the parent in the scene. The more children whose predictions are close to a parent's output, the more quickly the coefficients grow, driving convergence. The pose of the parent (reflected in its output) progressively becomes compatible with that of its children. The coefficients' initial logits are the log prior probabilities that a child belongs to a parent. The priors can be trained discriminatively along with the weights. The priors depend on the location and type of the child and parent capsules, but not on the current input. At each iteration, the coefficients are adjusted via a "routing" softmax so that they continue to sum to 1 (to express the probability that a given capsule is the parent of a given child.) Softmax amplifies larger values and diminishes smaller values beyond their proportion of the total. Similarly, the probability that a feature is present in the input is exaggerated by a nonlinear "squashing" function that reduces values (smaller ones drastically and larger ones such that they are less than 1). This dynamic routing mechanism provides the necessary deprecation of alternatives ("explaining away") that is needed for segmenting overlapped objects. This learned routing of signals has no clear biological equivalent. Some operations can be found in cortical layers, but they do not seem to relate this technique. === Math/code === The pose vector u i {\textstyle \mathbf {u} _{i}} is rotated and translated by a matrix W i j {\textstyle \mathbf {W} _{ij}} into a vector u ^ j | i {\textstyle \mathbf {\hat {u}} _{j|i}} that predicts the output of the parent capsule. u ^ j | i = W i j u i {\displaystyle \mathbf {

    Read more →