AI Art Pragmata

AI Art Pragmata — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Automotive security

    Automotive security

    Automotive security refers to the branch of computer security focused on the cyber risks related to the automotive context. The increasingly high number of ECUs in vehicles and, alongside, the implementation of multiple different means of communication from and towards the vehicle in a remote and wireless manner led to the necessity of a branch of cybersecurity dedicated to the threats associated with vehicles. Not to be confused with automotive safety. == Causes == The implementation of multiple ECUs (Electronic Control Units) inside vehicles began in the early '70s thanks to the development of integrated circuits and microprocessors that made it economically feasible to produce the ECUs on a large scale. Since then the number of ECUs has increased to up to 100 per vehicle. These units nowadays control almost everything in the vehicle, from simple tasks such as activating the wipers to more safety-related ones like brake-by-wire or ABS (Anti-lock Braking System). Autonomous driving is also strongly reliant on the implementation of new, complex ECUs such as the ADAS, alongside sensors (lidars and radars) and their control units. Inside the vehicle, the ECUs are connected with each other through cabled or wireless communication networks, such as CAN bus (controller area network), MOST bus (Media Oriented System Transport), FlexRay (Automotive Network Communications Protocol) or RF (radio frequency) as in many implementations of TPMSs (tire-pressure monitoring systems). Many of these ECUs require data received through these networks that arrive from various sensors to operate and use such data to modify the behavior of the vehicle (e.g., the cruise control modifies the vehicle's speed depending on signals arriving from a button usually located on the steering wheel). Since the development of cheap wireless communication technologies such as Bluetooth, LTE, Wi-Fi, RFID and similar, automotive producers and OEMs have designed ECUs that implement such technologies with the goal of improving the experience of the driver and passengers. Safety-related systems such as the OnStar from General Motors, telematic units, communication between smartphones and the vehicle's speakers through Bluetooth, Android Auto and Apple CarPlay. == Threat model == Threat models of the automotive world are based on both real-world and theoretically possible attacks. Most real-world attacks aim at the safety of the people in and around the car, by modifying the cyber-physical capabilities of the vehicle (e.g., steering, braking, accelerating without requiring actions from the driver), while theoretical attacks have been supposed to focus also on privacy-related goals, such as obtaining GPS data on the vehicle, or capturing microphone signals and similar. Regarding the attack surfaces of the vehicle, they are usually divided in long-range, short-range, and local attack surfaces: LTE and DSRC can be considered long-range ones, while Bluetooth and Wi-Fi are usually considered short-range although still wireless. Finally, USB, OBD-II and all the attack surfaces that require physical access to the car are defined as local. An attacker that is able to implement the attack through a long-range surface is considered stronger and more dangerous than the one that requires physical access to the vehicle. In 2015 the possibility of attacks on vehicles already on the market has been proven possible by Miller and Valasek, that managed to disrupt the driving of a Jeep Cherokee while remotely connecting to it through remote wireless communication. === Controller area network attacks === The most common network used in vehicles and the one that is mainly used for safety-related communication is CAN, due to its real-time properties, simplicity, and cheapness. For this reason the majority of real-world attacks have been implemented against ECUs connected through this type of network. The majority of attacks demonstrated either against actual vehicles or in testbeds fall in one or more of the following categories: ==== Sniffing ==== Sniffing in the computer security field generally refers to the possibility of intercepting and logging packets or more generally data from a network. In the case of CAN, since it is a bus network, every node listens to all communication on the network. It is useful for the attacker to read data to learn the behavior of the other nodes of the network before implementing the actual attack. Usually, the final goal of the attacker is not to simply sniff the data on CAN, since the packets passing on this type of network are not usually valuable just to read. ==== Denial of service ==== Denial of service (DoS) in information security is usually described as an attack that has the objective of making a machine or a network unavailable. DoS attacks against ECUs connected to CAN buses can be done both against the network, by abusing the arbitration protocol used by CAN to always win the arbitration, and targeting the single ECU, by abusing the error handling protocol of CAN. In this second case the attacker flags the messages of the victim as faulty to convince the victim of being broken and therefore shut itself off the network. ==== Spoofing ==== Spoofing attacks comprise all cases in which an attacker, by falsifying data, sends messages pretending to be another node of the network. In automotive security usually spoofing attacks are divided into masquerade and replay attacks. Replay attacks are defined as all those where the attacker pretends to be the victim and sends sniffed data that the victim sent in a previous iteration of authentication. Masquerade attacks are, on the contrary, spoofing attacks where the data payload has been created by the attacker. == Real life automotive threat example == Security researchers Charlie Miller and Chris Valasek have successfully demonstrated remote access to a wide variety of vehicle controls using a Jeep Cherokee as the target. They were able to control the radio, environmental controls, windshield wipers, and certain engine and brake functions. The method used to hack the system was implementation of pre-programmed chip into the controller area network (CAN) bus. By inserting this chip into the CAN bus, he was able to send arbitrary message to CAN bus. One other thing that Miller has pointed out is the danger of the CAN bus, as it broadcasts the signal which the message can be caught by the hackers throughout the network. The control of the vehicle was all done remotely, manipulating the system without any physical interaction. Miller states that he could control any of some 1.4 million vehicles in the United States regardless of the location or distance, the only thing needed is for someone to turn on the vehicle to gain access. The work by Miller and Valasek replicated earlier work completed and published by academics in 2010 and 2011 on a different vehicle. The earlier work demonstrated the ability to compromise a vehicle remotely, over multiple wireless channels (including cellular), and the ability to remotely control critical components on the vehicle post-compromise, including the telematics unit and the car's brakes. While the earlier academic work was publicly visible, both in peer-reviewed scholarly publications and in the press, the Miller and Valesek work received even greater public visibility. == Security measures == The increasing complexity of devices and networks in the automotive context requires the application of security measures to limit the capabilities of a potential attacker. Since the early 2000 many different countermeasures have been proposed and, in some cases, applied. Following, a list of the most common security measures: Sub-networks: to limit the attacker capabilities even if he/she manages to access the vehicle from remote through a remotely connected ECU, the networks of the vehicle are divided in multiple sub-networks, and the most critical ECUs are not placed in the same sub-networks of the ECUs that can be accessed from remote. Gateways: the sub-networks are divided by secure gateways or firewalls that block messages from crossing from a sub-network to the other if they were not intended to. Intrusion Detection Systems (IDS): on each critical sub-network, one of the nodes (ECUs) connected to it has the goal of reading all data passing on the sub-network and detect messages that, given some rules, are considered malicious (made by an attacker). The arbitrary messages can be caught by the passenger by using IDS which will notify the owner regarding with unexpected message. Authentication protocols: in order to implement authentication on networks where it is not already implemented (such as CAN), it is possible to design an authentication protocol that works on the higher layers of the ISO OSI model, by using part of the data payload of a message to authenticate the message itself. Hardware Security Modules: since many ECUs are not powerful enough to keep real-time delays whi

    Read more →
  • Kuki AI

    Kuki AI

    Kuki is an embodied AI bot designed for usage in the metaverse. Formerly known as Mitsuku, Kuki is a chatbot created from the Pandorabots framework. The bot has won the Loebner Prize 5 times. == Features == Kuki claims to be an 18-year-old female chatbot from the Metaverse, and the developers have stated she has been worked on since 2005. Early work by one of the company's co-founders inspired the Spike Jonze movie Her. As of 2015, she conversed, on average, in excess of a quarter of a million times daily, and it was estimated 5 million unique users had interacted with her between 2016 and 2020. == Virtual talent, model, and influencer == Kuki has appeared as a Virtual Model in Vogue Business and at Crypto Fashion Week where she modelled NFTs and spoke about the future of digital fashion. In 2021, Kuki modelled five digital looks from emerging Vogue Talents designers for Italian Vogue, that sold out as NFTs in under an hour. Kuki has also modeled for H&M on Instagram in a digital campaign that resulted in an "11x increase in ad recall" per a case study by Meta. == Awards == As of 2019, Kuki had been awarded the Loebner Prize five times, more than any other entrant. In 2020, Kuki competed against Facebook AI's Blenderbot in a 24/7 verbal sparring match called "Bot Battle", winning 79% of the audience vote.

    Read more →
  • Powerset (company)

    Powerset (company)

    Powerset was an American company based in San Francisco, California, that, in 2006, was developing a natural language search engine for the Internet. On July 1, 2008, Powerset was acquired by Microsoft for an estimated $100 million (~$143 million in 2024). Powerset was working on building a natural language search engine that could find targeted answers to user questions (as opposed to keyword based search). For example, when confronted with a question like "Which U.S. state has the highest income tax?", conventional search engines ignore the question phrasing and instead do a search on the keywords "state", "highest", "income", and "tax". Powerset on the other hand, attempts to use natural language processing to understand the nature of the question and return pages containing the answer. The company was in the process of "building a natural language search engine that reads and understands every sentence on the Web". The company has licensed natural language technology from PARC, the former Xerox Palo Alto Research Center. On May 11, 2008, the company unveiled a tool for searching a fixed subset of English Wikipedia using conversational phrases rather than keywords. Acquisition by Microsoft: One significant milestone in Powerset's history was its acquisition by Microsoft on July 1, 2008, for an estimated $100 million. This acquisition was part of Microsoft's broader strategy to enhance its search capabilities and compete more effectively with other search engine providers, particularly Google. Natural Language Search Engine: Powerset's primary focus was on developing a natural language search engine capable of understanding and interpreting user queries in a more human-like manner. Instead of simply matching keywords, Powerset aimed to comprehend the meaning behind the words, allowing for more accurate and contextually relevant search results. Technology and Partnerships: Powerset had licensed natural language technology from PARC, the Xerox Palo Alto Research Center. This technology likely played a crucial role in the development of Powerset's NLP capabilities. Wikipedia Search Tool: In May 2008, Powerset unveiled a search tool that allowed users to search a fixed subset of English Wikipedia using conversational phrases rather than traditional keywords. This demonstrated the potential of Powerset's NLP technology in providing more precise and relevant search results. == Powerlabs == In a form of beta testing, Powerset opened an online community called Powerlabs on September 17, 2007. Business Week said: "The company hopes the site will marshal thousands of people to help build and improve its search engine before it goes public next year." Said The New York Times: "[Powerset Labs] goes far beyond the 'alpha' or 'beta' testing involved in most software projects, when users put a new product through rigorous testing to find its flaws. Powerset doesn’t have a product yet, but rather a collection of promising natural language technologies, which are the fruit of years of research at Xerox PARC." Powerlabs' initial search results are taken from Wikipedia. == Notable people == Barney Pell (born March 18, 1968, in Hollywood, California) was co-founder and CEO of Powerset. Pell received his Bachelor of Science degree in symbolic systems from Stanford University in 1989, where he graduated Phi Beta Kappa and was a National Merit Scholar. Pell received a PhD in computer science from Cambridge University in 1993, where he was a Marshall Scholar. He has worked at NASA, as chief strategist and vice president of business development at StockMaster.com (acquired by Red Herring in March, 2000) and at Whizbang! Labs. Prior to joining Powerset, Pell was an Entrepreneur-in-Residence at Mayfield Fund, a venture capital firm in Silicon Valley. Pell is also a founder of Moon Express, Inc., a U.S. company awarded a $10M commercial lunar contract by NASA and a competitor in the Google Lunar X PRIZE. Steve Newcomb was the COO and co-founder of Powerset. Prior to joining Powerset, he was a co-founder of Loudfire, General Manager at Promptu, and was on the board of directors at Jaxtr. He left Powerset in October 2007 to form Virgance, a social startup incubator. Lorenzo Thione (born in Como, Italy) was the product architect and co-founder of Powerset. Prior to joining Powerset, he worked at FXPAL in natural language processing and related research fields. Thione earned his master's degree in software engineering from the University of Texas at Austin. Ronald Kaplan, former manager of research in Natural Language Theory and Technology at PARC, served as the company's CTO and CSO. Ryan Ferrier is a member of the founding team of Powerset. He managed personnel and internal operations. After 2008 he went on to co-found Serious Business, which made Facebook applications and was later bought by Zynga. Another Powerset alumnus, Alex Le, became CTO of Serious Business and went on to become an executive producer at Zynga when it bought the company. Siqi Chen founded a stealth startup in mobile computing after leaving Powerset. Tom Preston-Werner worked at Powerset and left after the acquisition to found GitHub. == Investors == Powerset attracted a wide range of investors, many of whom had considerable experience in the venture capital field. The company received $12.5 million (~$18.2 million in 2024) in Series A funding during November 2007, co-led by the venture capital firms Foundation Capital and The Founders Fund. Among the better-known investors: Esther Dyson, founding chairman of ICANN, founder of the newsletter Release 1.0 and editor at Cnet Peter Thiel, founder and former CEO of PayPal Luke Nosek, founder of PayPal Todd Parker. Managing Partner, Hidden River Ventures Reid Hoffman, executive vice president of PayPal and founder of LinkedIn First Round Capital, seed-stage venture firm

    Read more →
  • Averbis

    Averbis

    Averbis has a focus on healthcare, pharma, automotive and intellectual property analytics. Averbis is involved in various research projects of the German Federal Ministry of Economics and Energy and the European Union such as DebugIT, EUCases, Mantra and SEMCARE. In addition to these projects, Averbis was also involved in the following projects: Greenpilot is a virtual library, which provides technical information in the fields of nutrition, environment and agriculture. Medpilot is a virtual library, which provides information about medicine and related sciences. In 2013, Averbis has been nominated for the German Founder Prize 2013. Averbis GmbH provides text analytics and text mining software to transform unstructured text into actionable information. It was founded in 2007 by IT experts after years of relevant scientific experience in the field of text mining and multilingual information retrieval. Averbis works in the field of terminology management, natural language processing, machine learning and semantic search. Its text mining software is embedded into the text mining framework UIMA.

    Read more →
  • Tensor glyph

    Tensor glyph

    In scientific visualization a tensor glyph is an object that can visualize all or most of the nine degrees of freedom, such as acceleration, twist, or shear – of a 3 × 3 {\displaystyle 3\times 3} matrix. It is used for tensor field visualization, where a data-matrix is available at every point in the grid. "Glyphs, or icons, depict multiple data values by mapping them onto the shape, size, orientation, and surface appearance of a base geometric primitive." Tensor glyphs are a particular case of multivariate data glyphs. There are certain types of glyphs that are commonly used: Ellipsoid Cuboid Cylindrical Superquadrics According to Thomas Schultz and Gordon Kindlmann, specific types of tensor fields "play a central role in scientific and biomedical studies as well as in image analysis and feature-extraction methods."

    Read more →
  • Cyclodisparity

    Cyclodisparity

    In vision science, cyclodisparity is the difference in the rotation angle of an object or scene viewed by the left and right eyes. Cyclodisparity can result from the eyes' torsional rotation (cyclorotation) or can be created artificially by presenting to the eyes two images that need to be rotated relative to each other for binocular fusion to take place. == Human and animal vision == The eyes and visual system can compensate for cyclodisparity up to a certain point; if the cyclodisparity is larger than a threshold, the images cannot be fused, resulting stereoblindness, and in double vision in subjects who otherwise have full stereo vision. When a human subject is presented with images that have artificial cyclodisparity, cyclovergence is evoked, that is, a motor response of the eye muscles that rotates the two eyes in opposite directions, thereby reducing cyclodisparity. Visually-induced cyclovergence of up to 8 degrees has been observed in normal subjects. Furthermore, up to about 8 degrees can usually be compensated by purely sensory means, that is, without physical eye rotation. This means that the normal human observer can achieve binocular image fusion in presence of cyclodisparity of up to approximately 16 degrees. Cyclodisparity due to images having been rotated inward can be compensated better when the gaze is directed downwards, and cyclodisparity due to an outward rotation can be compensated better when the gaze is directed upwards. A proposed explanation for this phenomenon is that the motor system is coordinated in such a way that the eyes perform a torsional movement to reduce the size of the search zones and thus the computational load required for solving the correspondence problem. The resulting cyclovergence at near gaze is smaller than the cyclovergence predicted by Listing's law. == Video processing and computer vision == Active camera torsion can be used in machine and computer vision for several purposes. For instance, camera torsion can be used to make improved use of the search range over which matching detectors or stereo matching algorithms operate, or to make a 3D slanted surface appear frontoparallel for further stereo processing. For image compression purposes, images with cyclodisparity are advantageously encoded using global motion compensation using a rotational motion model.

    Read more →
  • Mistral Vibe

    Mistral Vibe

    Mistral Vibe or Vibe (Le Chat until May 2026), is a chatbot that uses generative artificial intelligence developed in France by Mistral AI. Mistral Vibe is available in iOS and Android. Its services are operated on a freemium model. == History == In February 2024, Mistral AI released Le Chat. In January 2025, Mistral AI made a content deal with Agence France-Presse (AFP) that lets Le Chat query AFP's entire archive dating back to 1983. On 6 February 2025, a mobile app for Le Chat was released for iOS and Android, and a subscription tier, Pro, was introduced at a cost of $14.99 per month. In July 2025, Mistral AI released Voxtral, an open-source language model that understands and generates audio. Mistral introduced a voice mode for chatting that uses Voxtral, and projects, which allows grouping chats and files. In September 2025, Le Chat introduced the capability to remember previous conversations. In May 2026, Mistral AI announced the rebrand from Le Chat to Mistral Vibe and new features were introduced at the same time.

    Read more →
  • Harris corner detector

    Harris corner detector

    The Harris corner detector is a corner detection operator that is commonly used in computer vision algorithms to extract corners and infer features of an image. It was first introduced by Chris Harris and Mike Stephens in 1988 upon the improvement of Moravec's corner detector. Compared to its predecessor, Harris' corner detector takes the differential of the corner score into account with reference to direction directly, instead of using shifting patches for every 45 degree angles, and has been proved to be more accurate in distinguishing between edges and corners. Since then, it has been improved and adopted in many algorithms to preprocess images for subsequent applications. == Introduction == A corner is a point whose local neighborhood stands in two dominant and different edge directions. In other words, a corner can be interpreted as the junction of two edges, where an edge is a sudden change in image brightness. Corners are the important features in the image, and they are generally termed as interest points which are invariant to translation, rotation and illumination. Although corners are only a small percentage of the image, they contain the most important features in restoring image information, and they can be used to minimize the amount of processed data for motion tracking, image stitching, building 2D mosaics, stereo vision, image representation and other related computer vision areas. In order to capture the corners from the image, researchers have proposed many different corner detectors including the Kanade-Lucas-Tomasi (KLT) operator and the Harris operator which are most simple, efficient and reliable for use in corner detection. These two popular methodologies are both closely associated with and based on the local structure matrix. Compared to the Kanade-Lucas-Tomasi corner detector, the Harris corner detector provides good repeatability under changing illumination and rotation, and therefore, it is more often used in stereo matching and image database retrieval. Although there still exist drawbacks and limitations, the Harris corner detector is still an important and fundamental technique for many computer vision applications. == Development of Harris corner detection algorithm == Source: Without loss of generality, we will assume a grayscale 2-dimensional image is used. Let this image be given by I {\displaystyle I} . Consider taking an image patch ( x , y ) ∈ W {\displaystyle (x,y)\in W} (window) and shifting it by ( Δ x , Δ y ) {\displaystyle (\Delta x,\Delta y)} . The sum of squared differences (SSD) between these two patches, denoted f {\displaystyle f} , is given by: f ( Δ x , Δ y ) = ∑ ( x k , y k ) ∈ W ( I ( x k , y k ) − I ( x k + Δ x , y k + Δ y ) ) 2 {\displaystyle f(\Delta x,\Delta y)={\underset {(x_{k},y_{k})\in W}{\sum }}\left(I(x_{k},y_{k})-I(x_{k}+\Delta x,y_{k}+\Delta y)\right)^{2}} I ( x + Δ x , y + Δ y ) {\displaystyle I(x+\Delta x,y+\Delta y)} can be approximated by a Taylor expansion. Let I x {\displaystyle I_{x}} and I y {\displaystyle I_{y}} be the partial derivatives of I {\displaystyle I} , such that I ( x + Δ x , y + Δ y ) ≈ I ( x , y ) + I x ( x , y ) Δ x + I y ( x , y ) Δ y {\displaystyle I(x+\Delta x,y+\Delta y)\approx I(x,y)+I_{x}(x,y)\Delta x+I_{y}(x,y)\Delta y} This produces the approximation f ( Δ x , Δ y ) ≈ ∑ ( x , y ) ∈ W ( I x ( x , y ) Δ x + I y ( x , y ) Δ y ) 2 , {\displaystyle f(\Delta x,\Delta y)\approx {\underset {(x,y)\in W}{\sum }}\left(I_{x}(x,y)\Delta x+I_{y}(x,y)\Delta y\right)^{2},} which can be written in matrix form: f ( Δ x , Δ y ) ≈ ( Δ x Δ y ) M ( Δ x Δ y ) , {\displaystyle f(\Delta x,\Delta y)\approx {\begin{pmatrix}\Delta x&\Delta y\end{pmatrix}}M{\begin{pmatrix}\Delta x\\\Delta y\end{pmatrix}},} where M is the structure tensor, M = ∑ ( x , y ) ∈ W [ I x 2 I x I y I x I y I y 2 ] = [ ∑ ( x , y ) ∈ W I x 2 ∑ ( x , y ) ∈ W I x I y ∑ ( x , y ) ∈ W I x I y ∑ ( x , y ) ∈ W I y 2 ] {\displaystyle M={\underset {(x,y)\in W}{\sum }}{\begin{bmatrix}I_{x}^{2}&I_{x}I_{y}\\I_{x}I_{y}&I_{y}^{2}\end{bmatrix}}={\begin{bmatrix}{\underset {(x,y)\in W}{\sum }}I_{x}^{2}&{\underset {(x,y)\in W}{\sum }}I_{x}I_{y}\\{\underset {(x,y)\in W}{\sum }}I_{x}I_{y}&{\underset {(x,y)\in W}{\sum }}I_{y}^{2}\end{bmatrix}}} == Process of Harris corner detection algorithm == Commonly, Harris corner detector algorithm can be divided into five steps. Color to grayscale Spatial derivative calculation Structure tensor setup Harris response calculation Non-maximum suppression === Color to grayscale === If we use Harris corner detector in a color image, the first step is to convert it into a grayscale image, which will enhance the processing speed. The value of the gray scale pixel can be computed as a weighted sums of the values R, B and G of the color image, ∑ C ∈ { R , G , B } w C ⋅ C {\displaystyle \sum _{C\,\in \,\{R,G,B\}}w_{C}\cdot C} , where, e.g., w R = 0.299 , w G = 0.587 , w B = 1 − ( w R + w G ) = 0.114. {\displaystyle w_{R}=0.299,\ w_{G}=0.587,\ w_{B}=1-(w_{R}+w_{G})=0.114.} === Spatial derivative calculation === Next, we are going to find the derivative with respect to x and the derivative with respect to y, I x ( x , y ) {\displaystyle I_{x}(x,y)} and I y ( x , y ) {\displaystyle I_{y}(x,y)} . This can be approximated by applying Sobel operators. === Structure tensor setup === With I x ( x , y ) {\displaystyle I_{x}(x,y)} , I y ( x , y ) {\displaystyle I_{y}(x,y)} , we can construct the structure tensor M {\displaystyle M} . === Harris response calculation === For x ≪ y {\displaystyle x\ll y} , one has x ⋅ y x + y = x 1 1 + x / y ≈ x . {\displaystyle {\tfrac {x\cdot y}{x+y}}=x{\tfrac {1}{1+x/y}}\approx x.} In this step, we compute the smallest eigenvalue of the structure tensor using that approximation: λ min ≈ λ 1 λ 2 ( λ 1 + λ 2 ) = det ( M ) tr ⁡ ( M ) {\displaystyle \lambda _{\min }\approx {\frac {\lambda _{1}\lambda _{2}}{(\lambda _{1}+\lambda _{2})}}={\frac {\det(M)}{\operatorname {tr} (M)}}} with the trace t r ( M ) = m 11 + m 22 {\displaystyle \mathrm {tr} (M)=m_{11}+m_{22}} . Another commonly used Harris response calculation is shown as below, R = λ 1 λ 2 − k ( λ 1 + λ 2 ) 2 = det ( M ) − k tr ⁡ ( M ) 2 {\displaystyle R=\lambda _{1}\lambda _{2}-k(\lambda _{1}+\lambda _{2})^{2}=\det(M)-k\operatorname {tr} (M)^{2}} where k {\displaystyle k} is an empirically determined constant; k ∈ [ 0.04 , 0.06 ] {\displaystyle k\in [0.04,0.06]} . === Non-maximum suppression === In order to pick up the optimal values to indicate corners, we find the local maxima as corners within the window which is a 3 by 3 filter. == Improvement == Sources: Harris-Laplace Corner Detector Differential Morphological Decomposition Based Corner Detector Multi-scale Bilateral Structure Tensor Based Corner Detector == Applications == Image Alignment, Stitching and Registration 2D Mosaics Creation 3D Scene Modeling and Reconstruction Motion Detection Object Recognition Image Indexing and Content-based Retrieval Video Tracking

    Read more →
  • Core FTP

    Core FTP

    Core FTP LE is a freeware secure FTP client for Windows, developed by CoreFTP.com. Features include FTP, SSL/TLS, SFTP via SSH, and HTTP/HTTPS support. Secure FTP clients encrypt account information and data transferred across the internet, protecting data from being seen, or sniffed across networks. Core FTP is a traditional FTP client with local files displayed on the left, remote files on the right. Core FTP Server is a secure FTP server for Windows, developed by CoreFTP.com, starting in 2010. == Licensing == CoreFTP LE is free for personal, educational, non-profit, and business use.

    Read more →
  • Residual neural network

    Residual neural network

    A residual neural network (also referred to as a residual network or ResNet) is a deep learning architecture in which the layers learn residual functions with reference to the layer inputs. It was developed in 2015 for image recognition, and won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) of that year. As a point of terminology, "residual connection" refers to the specific architectural motif of x ↦ f ( x ) + x {\displaystyle x\mapsto f(x)+x} , where f {\displaystyle f} is an arbitrary neural network module. The motif had been used previously (see §History for details). However, the publication of ResNet made it widely popular for feedforward networks, appearing in neural networks that are seemingly unrelated to ResNet. The residual connection stabilizes the training and convergence of deep neural networks with hundreds of layers, and is a common motif in deep neural networks, such as transformer models (e.g., BERT, and GPT models such as ChatGPT), the AlphaGo Zero system, the AlphaStar system, and the AlphaFold system. == Mathematics == === Residual connection === In a multilayer neural network model, consider a (non-residual) subnetwork with a certain number of stacked layers (e.g., 2 or 3). Let H ( x ; α ) {\displaystyle H(x;\alpha )} denote the subnetwork. Suppose H ∗ {\displaystyle H^{}} is the desired optimal output of this subnetwork. Residual learning simply adds x {\displaystyle x} directly to the output, such that the optimal learned output now becomes be H ∗ − x {\displaystyle H^{}-x} , which is interpreted as a "residual" with respect to x {\displaystyle x} . The operation of "adding x {\displaystyle x} " is implemented via a "skip connection" that performs an identity mapping to connect the input of the subnetwork with its output. This connection is referred to as a "residual connection" in later work. Let F ( x ; α ) = H ( x ; a ) + x {\displaystyle F(x;\alpha )=H(x;a)+x} . The function F {\displaystyle F} is often represented by matrix multiplication interlaced with activation functions and normalization operations (e.g., batch normalization or layer normalization). As a whole, one of these subnetworks is referred to as a "residual block". A deep residual network is constructed by simply stacking these blocks. Long short-term memory (LSTM) has a memory mechanism that serves as a residual connection. In an LSTM without a forget gate, an input x t {\displaystyle x_{t}} is processed by a function F {\displaystyle F} and added to a memory cell c t {\displaystyle c_{t}} , resulting in c t + 1 = c t + F ( x t ) {\displaystyle c_{t+1}=c_{t}+F(x_{t})} . An LSTM with a forget gate essentially functions as a highway network. To stabilize the variance of the layers' inputs, it is recommended to replace the residual connections x + f ( x ) {\displaystyle x+f(x)} with x / L + f ( x ) {\displaystyle x/L+f(x)} , where L {\displaystyle L} is the total number of residual layers. === Projection connection === If the function F {\displaystyle F} is of type F : R n → R m {\displaystyle F:\mathbb {R} ^{n}\to \mathbb {R} ^{m}} where n ≠ m {\displaystyle n\neq m} , then F ( x ) + x {\displaystyle F(x)+x} is undefined. To handle this special case, a projection connection is used: y = F ( x ) + P ( x ) {\displaystyle y=F(x)+P(x)} where P {\displaystyle P} is typically a linear projection, defined by P ( x ) = M x {\displaystyle P(x)=Mx} where M {\displaystyle M} is a m × n {\displaystyle m\times n} matrix. The matrix is trained via backpropagation, as is any other parameter of the model. === Signal propagation === The introduction of identity mappings facilitates signal propagation in both forward and backward paths. ==== Forward propagation ==== If the output of the ℓ {\displaystyle \ell } -th residual block is the input to the ( ℓ + 1 ) {\displaystyle (\ell +1)} -th residual block (assuming no activation function between blocks), then the ( ℓ + 1 ) {\displaystyle (\ell +1)} -th input is: x ℓ + 1 = F ( x ℓ ) + x ℓ {\displaystyle x_{\ell +1}=F(x_{\ell })+x_{\ell }} Applying this formulation recursively, e.g.: x ℓ + 2 = F ( x ℓ + 1 ) + x ℓ + 1 = F ( x ℓ + 1 ) + F ( x ℓ ) + x ℓ {\displaystyle {\begin{aligned}x_{\ell +2}&=F(x_{\ell +1})+x_{\ell +1}\\&=F(x_{\ell +1})+F(x_{\ell })+x_{\ell }\end{aligned}}} yields the general relationship: x L = x ℓ + ∑ i = ℓ L − 1 F ( x i ) {\displaystyle x_{L}=x_{\ell }+\sum _{i=\ell }^{L-1}F(x_{i})} where L {\textstyle L} is the index of a residual block and ℓ {\textstyle \ell } is the index of some earlier block. This formulation suggests that there is always a signal that is directly sent from a shallower block ℓ {\textstyle \ell } to a deeper block L {\textstyle L} . ==== Backward propagation ==== The residual learning formulation provides the added benefit of mitigating the vanishing gradient problem to some extent. However, it is crucial to acknowledge that the vanishing gradient issue is not the root cause of the degradation problem, which is tackled through the use of normalization. To observe the effect of residual blocks on backpropagation, consider the partial derivative of a loss function E {\displaystyle {\mathcal {E}}} with respect to some residual block input x ℓ {\displaystyle x_{\ell }} . Using the equation above from forward propagation for a later residual block L > ℓ {\displaystyle L>\ell } : ∂ E ∂ x ℓ = ∂ E ∂ x L ∂ x L ∂ x ℓ = ∂ E ∂ x L ( 1 + ∂ ∂ x ℓ ∑ i = ℓ L − 1 F ( x i ) ) = ∂ E ∂ x L + ∂ E ∂ x L ∂ ∂ x ℓ ∑ i = ℓ L − 1 F ( x i ) {\displaystyle {\begin{aligned}{\frac {\partial {\mathcal {E}}}{\partial x_{\ell }}}&={\frac {\partial {\mathcal {E}}}{\partial x_{L}}}{\frac {\partial x_{L}}{\partial x_{\ell }}}\\&={\frac {\partial {\mathcal {E}}}{\partial x_{L}}}\left(1+{\frac {\partial }{\partial x_{\ell }}}\sum _{i=\ell }^{L-1}F(x_{i})\right)\\&={\frac {\partial {\mathcal {E}}}{\partial x_{L}}}+{\frac {\partial {\mathcal {E}}}{\partial x_{L}}}{\frac {\partial }{\partial x_{\ell }}}\sum _{i=\ell }^{L-1}F(x_{i})\end{aligned}}} This formulation suggests that the gradient computation of a shallower layer, ∂ E ∂ x ℓ {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{\ell }}}} , always has a later term ∂ E ∂ x L {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{L}}}} that is directly added. Even if the gradients of the F ( x i ) {\displaystyle F(x_{i})} terms are small, the total gradient ∂ E ∂ x ℓ {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{\ell }}}} resists vanishing due to the added term ∂ E ∂ x L {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{L}}}} . == Variants of residual blocks == === Basic block === A basic block is the simplest building block studied in the original ResNet. This block consists of two sequential 3x3 convolutional layers and a residual connection. The input and output dimensions of both layers are equal. === Bottleneck block === A bottleneck block consists of three sequential convolutional layers and a residual connection. The first layer in this block is a 1×1 convolution for dimension reduction (e.g., to 1/2 of the input dimension); the second layer performs a 3×3 convolution; the last layer is another 1×1 convolution for dimension restoration. The models of ResNet-50, ResNet-101, and ResNet-152 are all based on bottleneck blocks. === Pre-activation block === The pre-activation residual block applies activation functions before applying the residual function F {\displaystyle F} . Formally, the computation of a pre-activation residual block can be written as: x ℓ + 1 = F ( ϕ ( x ℓ ) ) + x ℓ {\displaystyle x_{\ell +1}=F(\phi (x_{\ell }))+x_{\ell }} where ϕ {\displaystyle \phi } can be any activation (e.g. ReLU) or normalization (e.g. LayerNorm) operation. This design reduces the number of non-identity mappings between residual blocks, and allows an identity mapping directly from the input to the output. This design was used to train models with 200 to over 1000 layers, and was found to consistently outperform variants where the residual path is not an identity function. The pre-activation ResNet with 200 layers took 3 weeks to train for ImageNet on 8 GPUs in 2016. Since GPT-2, transformer blocks have been mostly implemented as pre-activation blocks. This is often referred to as "pre-normalization" in the literature of transformer models. == Applications == Originally, ResNet was designed for computer vision. All transformer architectures include residual connections. Indeed, very deep transformers cannot be trained without them. The original ResNet paper made no claim on being inspired by biological systems. However, later research has related ResNet to biologically-plausible algorithms. A study published in Science in 2023 disclosed the complete connectome of an insect brain (specifically that of a fruit fly larva). This study discovered "multilayer shortcuts" that resemble the skip connections in artificial neural networks, including ResNets. == History == === Previous work === Residual connections were noticed in neu

    Read more →
  • Aikuma

    Aikuma

    Aikuma is an Android app for collecting speech recordings with time-aligned translations. The app includes a text-free interface for consecutive interpretation, designed for users who are not literate. The Aikuma won Grand Prize in the Open Source Software World Challenge (2013). == Name == Aikuma means "meeting place" in Usarufa, a Papuan language where this software was first used in 2012. == History == Aikuma was developed with sponsorship from the National Science Foundation, including a $101,501 (US) project, "to use mobile telephones to collect larger amounts of data on undocumented endangered languages than would never be possible through usual fieldwork." Aikuma and its modified version (Lig-Aikuma) have been used for collecting substantial quantities of audio in remote indigenous villages. A modified version of the app, called Lig-Aikuma, has been developed at the Université Grenoble Alpes (LIG laboratory) and implements new features such as elicitation of speech from text, images and videos. == Similar Software == Lingua Libre is an online collaborative project and tool by the Wikimedia France association, which can be used as a tool for Language Preservation. Lingua Libre enables to record words, phrases, or sentences of any language, oral (audio recording) or signed (video recording). It is a highly efficient method to record endangered languages since up to 1000 words can be recorded per hour. All the content is under Free License, and speakers of minority languages are encouraged to record their own dialects.

    Read more →
  • Semantic decomposition (natural language processing)

    Semantic decomposition (natural language processing)

    A semantic decomposition is an algorithm that breaks down the meanings of phrases or concepts into less complex concepts. The result of a semantic decomposition is a representation of meaning. This representation can be used for tasks, such as those related to artificial intelligence or machine learning. Semantic decomposition is common in natural language processing applications. The basic idea of a semantic decomposition is taken from the learning skills of adult humans, where words are explained using other words. It is based on Meaning-text theory. Meaning-text theory is used as a theoretical linguistic framework to describe the meaning of concepts with other concepts. == Background == Given that an AI does not inherently have language, it is unable to think about the meanings behind the words of a language. An artificial notion of meaning needs to be created for a strong AI to emerge. Creating an artificial representation of meaning requires the analysis of what meaning is. Many terms are associated with meaning, including semantics, pragmatics, knowledge and understanding or word sense. Each term describes a particular aspect of meaning, and contributes to a multitude of theories explaining what meaning is. These theories need to be analyzed further to develop an artificial notion of meaning best fit for our current state of knowledge. == Graph representations == Representing meaning as a graph is one of the two ways that both an AI cognition and a linguistic researcher think about meaning (connectionist view). Logicians utilize a formal representation of meaning to build upon the idea of symbolic representation, whereas description logics describe languages and the meaning of symbols. This contention between 'neat' and 'scruffy' techniques has been discussed since the 1970s. Research has so far identified semantic measures and with that word-sense disambiguation (WSD) - the differentiation of meaning of words - as the main problem of language understanding. As an AI-complete environment, WSD is a core problem of natural language understanding. AI approaches that use knowledge-given reasoning creates a notion of meaning combining the state of the art knowledge of natural meaning with the symbolic and connectionist formalization of meaning for AI. The abstract approach is shown in Figure. First, a connectionist knowledge representation is created as a semantic network consisting of concepts and their relations to serve as the basis for the representation of meaning. This graph is built out of different knowledge sources like WordNet, Wiktionary, and BabelNET. The graph is created by lexical decomposition that recursively breaks each concept semantically down into a set of semantic primes. The primes are taken from the theory of Natural Semantic Metalanguage, which has been analyzed for usefulness in formal languages. Upon this graph marker passing is used to create the dynamic part of meaning representing thoughts. The marker passing algorithm, where symbolic information is passed along relations form one concept to another, uses node and edge interpretation to guide its markers. The node and edge interpretation model is the symbolic influence of certain concepts. Future work uses the created representation of meaning to build heuristics and evaluate them through capability matching and agent planning, chatbots or other applications of natural language understanding.

    Read more →
  • Prequel (mobile application)

    Prequel (mobile application)

    Prequel, Inc. is an American technology company and mobile app developer known for developing the Prequel mobile application, which enables editing photos and videos with filters and effects generated using artificial intelligence. Prequel was founded in 2018 by Serge Aliseenko and Timur Khabirov, who currently serves as the company's CEO. It is headquartered in New York City. As of August 2022, it had been downloaded more than 100 million times. == History == In 2016, entrepreneur Timur Khabirov and investor Serge Aliseenko registered a US corporation named AIAR Labs Inc, which was developing AR solutions as an outsourced contractor. Of several proprietary products, Prequel was selected for beta-testing as a product focused on editing photos and videos. In 2018, Prequel was released on the Apple App Store. The launch cost $3 million USD, financed with the founders’ personal funds. The first release included approximately 10 filters for photos and the same amount of effects that augmented images with rose petals, rain and snow, VHS and film reel simulations, glitch, grain, sun puddles, and lomography. By June 2020, the app had also been released for Android. In 2021, Prequel founders Timur Khabirov and Serge Aliseenko launched a venture studio for startups working with artificial, computer vision, and AR-based visual art. In December 2022, Prequel reached the number 14 slot on the global rankings for Apple App Store’s Top Charts and the number 5 slot on the App Store’s U.S. charts. In March 2023, Prequel launched a new app called Artique, which is an AI-powered image editing app for businesses. Artique provides advertising and marketing graphic design using ready-made templates that users can customize, while giving suggestions and visual cues through artificial intelligence. Prequel was also one of the companies participating in discussions about artificial intelligence at SXSW 2023. == Features == Prequel describes its app as an "Aesthetic Pic Editor. The app uses artificial intelligence to create and edit content. Prequel can be used to touch up faces on images and videos and can also tie various decorative elements to certain points on the human body and face. Prequel filters include the "Cartoon" filter, which converts selfies into cartoon-style pictures. Other filters include Kidcore, Dust, Grain, Fisheye, Retro Style, Miami, Disco, and VHS-style filters, as well as the ability to create Renaissance-style pictures. Prequel also gives users the ability to apply color correction tools and to make moving images with 3D effects out of 2D images. Prequel allows users to take photos and videos directly through the app and apply filters and effects in real time. The app also comes with manual editing options for photos, such as adjusting the brightness and/or exposure and cropping photos, as well as an option to automatically apply adjustments. The Prequel app uses the Core ML, MNN, and TFLight frameworks to work with its neural networks. Some AI solutions are launched server-side, and some on the user's mobile device. A resulting photo or video edited with the app is called "a prequel." The app daily generates over 2 million such prequels, which are published by users in Instagram, TikTok, and other social media. As of 2022, the app has more than 800 filters and effects, along with video templates and support for GIFs and stickers. Prequel is free-to-use, but has a premium version that gives users access to more effects, filters, and beauty tools. Since its launch in 2018, Prequel has been downloaded more than 100 million times.

    Read more →
  • Ultra Hal

    Ultra Hal

    Ultra Hal is a chatbot intended to function as a virtual assistant. It was developed by Zabaware, Inc. Ultra Hal uses a natural language interface with animated characters using speech synthesis. Users can communicate with the chatterbot via typing or via a speech recognition engine. It utilizes the WordNet lexical dictionary. Its name is an allusion to HAL 9000, the artificial intelligence from the movie 2001: A Space Odyssey. Ultra Hal won the 2007 Loebner Prize for "most human" chatterbot.

    Read more →
  • Virtual Woman

    Virtual Woman

    Virtual Woman is a software program that has elements of a chatbot, virtual reality, artificial intelligence, a video game, and a virtual human. It claims to be the oldest form of virtual life in existence, as it has been distributed since the late 1980s. Recent releases of the program can update their intelligence by connecting online and downloading newer personalities and histories. == Program play == When Virtual Woman starts, the user is presented with a list of options and then may choose their Virtual Woman's ethnic type, personality, location, clothing, etc. or load a pre-built Virtual Woman from a Digital DNA file. Once the options are determined, the user is presented with a 3-D animated Virtual Woman of their selection and then can engage them in conversation, progressing in a manner similar to that of its predecessor, ELIZA and its successors, the chatbots. In most versions of Virtual Woman, this is done through the keyboard, but some versions also support voice input. == In popular culture == Software sales and usage statistics from private companies are difficult to verify. WinSite, an independent Internet shareware distribution site that does publish public download counts, has for some time now listed some version of Virtual Woman in their top three shareware downloads of all time with well over seven hundred thousand downloads. == Compadre == The group of beta testers and advisers for Virtual Woman are referred to as Compadre and have their own beta testing site and forum. == Criticisms == As Virtual Woman has developed the ability to conduct longer and more realistic interactions, particularly in recent beta releases, criticism has arisen that this may lead some users to social isolation, or to use the program as a substitute for real human interaction. However, these are criticisms that have been leveled at all video games and at the use of the Internet itself. == Release history == Versions of Virtual Woman with rough release dates and PC platforms for which they were designed: Virtual Woman (????) (DOS) Virtual Woman for Windows (1991) (Windows 3.0) Virtual Woman 95 (1995) (Windows 3X, Windows 95) Virtual Woman 98 (1998) (Windows 3X, Windows 95) Virtual Woman 2000 (2000) (Windows 95+) Virtual Woman Millennium (Windows 95, XP) Virtual Woman Net ( Windows XP/Vista specific)

    Read more →