Arattai

Arattai

Arattai Messenger (or simply Arattai) is an encrypted messaging service for instant messaging, voice calls, and video calls, developed by Zoho Corporation. The name Arattai means "chat" or "conversation" in Tamil. The app was soft-launched in January 2021. The app saw a sharp surge in downloads in September 2025, partially fueled by endorsements from Indian government officials. However, the app dropped from the top rankings in October 2025. == History == Arattai was initially tested internally among Zoho employees before being released publicly in early 2021. The launch coincided with a surge in interest for privacy-focused and messaging services, triggered by concerns over WhatsApp's updated terms of service. In September 2025, Arattai experienced a major surge in adoption, with daily sign-ups reportedly increasing 100-fold, from around 3,000 to more than 350,000 in three days. The surge in downloads was attributed to Zoho products being promoted by Indian government officials as part of their Make in India push for homegrown alternatives to foreign‐owned apps, amid deteriorating India–US relations. The growth temporarily strained Zoho's infrastructure, prompting rapid scaling of servers and capacity expansion. During the same period, the app reached the top position in Apple's App Store charts for the "Social Networking" category in India. The app dropped from the top ranking in late October 2025. == Reception == At launch, Arattai was positioned as a potential domestic rival to WhatsApp in India, but analysts noted that it faced challenges with encryption, ecosystem, and network effect. Critics pointed to occasional sync delays.

Alipay

Alipay (simplified Chinese: 支付宝; traditional Chinese: 支付寶; pinyin: zhīfùbǎo) is a third-party mobile and online payment platform, established in Hangzhou, China, in February 2004 by Alibaba Group and its founder Jack Ma. In 2015, Alipay moved its headquarters to Pudong, Shanghai, although its parent company Ant Financial remains Hangzhou-based. Alipay overtook PayPal as the world's largest mobile (digital) payment platform in 2013. As of June 2020, Alipay serves over 1.3 billion users and 80 million merchants. According to the statistics of the fourth quarter of 2018, Alipay has a 55.32% share of the third-party payment market in mainland China, and it continues to grow. Along with WeChat, Alipay has been described to be China's super-app with a wide range of functionalities including ridesharing, travel booking and medical appointments. == History == The service was first launched in 2003, by Taobao. The People's Bank of China, China's central bank, issued licensing regulations in June 2010 for third-party payment providers. It also issued separate guidelines for foreign-funded payment institutions. Because of this, Alipay, which accounted for half of China's non-bank online payment market, was restructured as a domestic company controlled by Alibaba CEO Jack Ma in order to facilitate the regulatory approval for the license. The 2010 transfer of Alipay's ownership was controversial, with media reports in 2011 that Yahoo! and Softbank (Alibaba Group's controlling shareholders) were not informed of the sale for nominal value. Chinese business publication Century Weekly criticised Ma, who stated that Alibaba Group's board of directors was aware of the transaction. The incident was criticised in foreign and Chinese media as harming foreign trust in making Chinese investments. The ownership dispute was resolved by Alibaba Group, Yahoo!, and Softbank in July 2011. In 2013, Alipay launched a financial product platform called Yu'e Bao. Alipay partnered with Tianhong Asset Management to launch the it. Yu'e Bao offers an online money market account in which Alipay customers can deposit money and receive a higher interest rate than that available from banks. It soon became China's largest online money market fund and prompted competitors like Baidu and Tencent to introduce alternatives. Alibaba (the parent company of Alipay) reported having 152 million Yu'e Bao users in mid-2016, with 810 billion RMB (US$117 billion) in funds under management. In 2015, Alipay's parent company was re-branded as Ant Financial Services Group. In 2017, Alipay unveiled their facial recognition payment service. In 2020, Alipay upgraded from a payment financial instrument to an open platform for digital life. In 2021, the mandate by the Ministry of Industry and Information Technology (MIIT) to open up the "walled garden" ecosystems of the major tech companies has led to the introduction of interoperability of payment QR codes of Alipay and competing WeChat Pay and UnionPay's Cloud QuickPass platforms. In response to the increase in Alipay's payment volume due to use on Alibaba's e-commerce sites and others, Chinese regulators introduced new rules in 2020. The new rules focused on Alipay because the payment volume exploded due to its use on Alibaba's e-commerce sites and other platforms. By the second quarter in 2020, Alipay held 55.6% of China's third party mobile payment market. The People's Bank of China made rules that required payment firms to place money with regulators and anti-monopoly reviews would be triggered if the amount exceeded 50% market share. The rules included that the People's Bank of China mandate an online-payment clearing route through the NetsUnion Clearing Corporation, a centralized, state-overseen clearing body, and that unused consumer funds be held by a third-party payment provider in a non-interest-bearing account. These measures increased transparency and reduced systemic risk. When Alipay operates outside of China, it must comply with local financial regulations, which may treat specific functions such as money-market funds or investment-linked products. In Singapore, such services may require prior authorization from securities or financial-services regulators before they can be offered to residents. == Services == Alipay states that it operates with more than 65 financial institutions including Visa and MasterCard to provide payment services for Taobao and Tmall as well as more than 460,000 online and local Chinese businesses. Alipay is used in smartphones with their Alipay Wallet app. QR code payment codes are used for local in-store payments. The Alipay app also provides features such as credit card bill payments, bank account managements, P2P transfer, prepay mobile phone top-up, bus and train ticket purchases, food orders, vehicles for hire, insurance selections and a digital identification document storage. Alipay also allows online check-out on most Chinese-based websites such as Taobao and Tmall. The Alipay app allows users to add their own services provided from different companies to create a more personalised experience. Since late 2008, Alipay has promoted public service payment services and has covered more than 300 cities nationwide, supporting more than 1,200 partner organizations. In addition to utility bills such as water and electricity, Alipay also extends their services to areas such as paying transportation fines, property fees, and cable television fees. Common online payment services also include hydropower coal payment, tuition payment and traffic fine. On 15 January 2009, Alipay launched a credit card repayment service, supporting 39 domestic bank-issued credit cards. It is currently the most popular third-party repayment platform. The main advantages are free credit card bills checking, repayments with no administrative fee, as well as automatic repayment, repayment reminders and other value-added services. In the first quarter of 2014, 76% of credit cards were also paid by Alipay Wallet. From December 2013, several chain convenience store companies, including Meiyijia, Hongqi Chain, and Qishiduo C-STORE and 7-Eleven, have successively supported Alipay payment; in December, Beijing taxi drivers began to accept Alipay to pay the fare. Subsequently, Wanda Cinema, Joy City, Wangfujing and other large-scale retail companies as well as movie theaters, KTV, and catering companies have access to Alipay. From 26 March 2019, the service fee will be charged for the payment of credit card through Alipay. Customers only pay the portion of the payment that exceeds 2,000 yuan at 0.1%. In addition to this, in 2019, Walgreens accepted Alipay as payment in 3,000 US stores. Walgreen's products are available to Chinese customers through Alibaba's Tmall online marketplace. The payment application can also be used on Alibaba.com's site and Taobao as a means of payment. A Nielsen report suggests that over 90% of Chinese tourists would be willing to use mobile payment overseas if given the option. Many Chinese tourists do not have international credit cards, and so Alipay is a payment option. Digital payments have become the norm in China as the government pushes a cashless system even in rural and village areas. In November 2019, Alipay introduced Tourpass, a service component that allows non-Chinese users to use its mobile payment feature by pre-loading Chinese Yuan equivalent foreign currency into the app. In 2020, Alipay used a QR code system to help in containing the COVID-19 outbreak. The health code system tags users one of three colors according to their location, basic health information and travel history. "Beauty filters" were included to Alipay's face-scan payment system in a new upgrade that was released in July 2019. The market has responded well to the "beauty filters," which make users seem better when they use the program to make payments. Alipay Tap is a payment function launched by Alipay in July 2024. Alipay+ NFC enables wallets to offer tap-to-pay acceptance across Mastercard's global contactless network, all within your existing wallet infrastructure. == Foreign expansion == Outside of China, more than 300 worldwide merchants use Alipay to sell directly to consumers in China. It currently supports transactions in 18 foreign currencies. Since the launch of Alipay in the Mainland China, Ant Financial introduced a series of expansion of the services to other countries. Other than expanding into individual countries, the system would also be integrated with online payment platform providers. Ant Group had acquired a majority stake into 2C2P, a Singapore-based provider used by merchants worldwide in April 2022, and would eventually integrate Alipay with 2C2P. === Asia === ==== Bangladesh ==== In 2018, Alipay bought 20% shares in Bangladeshi mobile financial service provider bKash Limited. ==== Hong Kong ==== In 2017, Ant Financial expanded to Hong Kong. In a joint venture with CK Hutchison, as Alipay Payment Ser

Bayesian programming

Bayesian programming is a formalism and a methodology for having a technique to specify probabilistic models and solve problems when less than the necessary information is available. Edwin T. Jaynes proposed that probability could be considered as an alternative and an extension of logic for rational reasoning with incomplete and uncertain information. In his founding book Probability Theory: The Logic of Science he developed this theory and proposed what he called "the robot," which was not a physical device, but an inference engine to automate probabilistic reasoning—a kind of Prolog for probability instead of logic. Bayesian programming is a formal and concrete implementation of this "robot". Bayesian programming may also be seen as an algebraic formalism to specify graphical models such as, for instance, Bayesian networks, dynamic Bayesian networks, Kalman filters or hidden Markov models. Indeed, Bayesian programming is more general than Bayesian networks and has a power of expression equivalent to probabilistic factor graphs. == Formalism == A Bayesian program is a means of specifying a family of probability distributions. The constituent elements of a Bayesian program are presented below: Program { Description { Specification ( π ) { Variables Decomposition Forms Identification (based on δ ) Question {\displaystyle {\text{Program}}{\begin{cases}{\text{Description}}{\begin{cases}{\text{Specification}}(\pi ){\begin{cases}{\text{Variables}}\\{\text{Decomposition}}\\{\text{Forms}}\\\end{cases}}\\{\text{Identification (based on }}\delta )\end{cases}}\\{\text{Question}}\end{cases}}} A program is constructed from a description and a question. A description is constructed using some specification ( π {\displaystyle \pi } ) as given by the programmer and an identification or learning process for the parameters not completely specified by the specification, using a data set ( δ {\displaystyle \delta } ). A specification is constructed from a set of pertinent variables, a decomposition and a set of forms. Forms are either parametric forms or questions to other Bayesian programs. A question specifies which probability distribution has to be computed. === Description === The purpose of a description is to specify an effective method of computing a joint probability distribution on a set of variables { X 1 , X 2 , ⋯ , X N } {\displaystyle \left\{X_{1},X_{2},\cdots ,X_{N}\right\}} given a set of experimental data δ {\displaystyle \delta } and some specification π {\displaystyle \pi } . This joint distribution is denoted as: P ( X 1 ∧ X 2 ∧ ⋯ ∧ X N ∣ δ ∧ π ) {\displaystyle P\left(X_{1}\wedge X_{2}\wedge \cdots \wedge X_{N}\mid \delta \wedge \pi \right)} . To specify preliminary knowledge π {\displaystyle \pi } , the programmer must undertake the following: Define the set of relevant variables { X 1 , X 2 , ⋯ , X N } {\displaystyle \left\{X_{1},X_{2},\cdots ,X_{N}\right\}} on which the joint distribution is defined. Decompose the joint distribution (break it into relevant independent or conditional probabilities). Define the forms of each of the distributions (e.g., for each variable, one of the list of probability distributions). ==== Decomposition ==== Given a partition of { X 1 , X 2 , … , X N } {\displaystyle \left\{X_{1},X_{2},\ldots ,X_{N}\right\}} containing K {\displaystyle K} subsets, K {\displaystyle K} variables are defined L 1 , ⋯ , L K {\displaystyle L_{1},\cdots ,L_{K}} , each corresponding to one of these subsets. Each variable L k {\displaystyle L_{k}} is obtained as the conjunction of the variables { X k 1 , X k 2 , ⋯ } {\displaystyle \left\{X_{k_{1}},X_{k_{2}},\cdots \right\}} belonging to the k t h {\displaystyle k^{th}} subset. Recursive application of Bayes' theorem leads to: P ( X 1 ∧ X 2 ∧ ⋯ ∧ X N ∣ δ ∧ π ) = P ( L 1 ∧ ⋯ ∧ L K ∣ δ ∧ π ) = P ( L 1 ∣ δ ∧ π ) × P ( L 2 ∣ L 1 ∧ δ ∧ π ) × ⋯ × P ( L K ∣ L K − 1 ∧ ⋯ ∧ L 1 ∧ δ ∧ π ) {\displaystyle {\begin{aligned}&P\left(X_{1}\wedge X_{2}\wedge \cdots \wedge X_{N}\mid \delta \wedge \pi \right)\\={}&P\left(L_{1}\wedge \cdots \wedge L_{K}\mid \delta \wedge \pi \right)\\={}&P\left(L_{1}\mid \delta \wedge \pi \right)\times P\left(L_{2}\mid L_{1}\wedge \delta \wedge \pi \right)\times \cdots \times P\left(L_{K}\mid L_{K-1}\wedge \cdots \wedge L_{1}\wedge \delta \wedge \pi \right)\end{aligned}}} Conditional independence hypotheses then allow further simplifications. A conditional independence hypothesis for variable L k {\displaystyle L_{k}} is defined by choosing some variable X n {\displaystyle X_{n}} among the variables appearing in the conjunction L k − 1 ∧ ⋯ ∧ L 2 ∧ L 1 {\displaystyle L_{k-1}\wedge \cdots \wedge L_{2}\wedge L_{1}} , labelling R k {\displaystyle R_{k}} as the conjunction of these chosen variables and setting: P ( L k ∣ L k − 1 ∧ ⋯ ∧ L 1 ∧ δ ∧ π ) = P ( L k ∣ R k ∧ δ ∧ π ) {\displaystyle P\left(L_{k}\mid L_{k-1}\wedge \cdots \wedge L_{1}\wedge \delta \wedge \pi \right)=P\left(L_{k}\mid R_{k}\wedge \delta \wedge \pi \right)} We then obtain: P ( X 1 ∧ X 2 ∧ ⋯ ∧ X N ∣ δ ∧ π ) = P ( L 1 ∣ δ ∧ π ) × P ( L 2 ∣ R 2 ∧ δ ∧ π ) × ⋯ × P ( L K ∣ R K ∧ δ ∧ π ) {\displaystyle {\begin{aligned}&P\left(X_{1}\wedge X_{2}\wedge \cdots \wedge X_{N}\mid \delta \wedge \pi \right)\\={}&P\left(L_{1}\mid \delta \wedge \pi \right)\times P\left(L_{2}\mid R_{2}\wedge \delta \wedge \pi \right)\times \cdots \times P\left(L_{K}\mid R_{K}\wedge \delta \wedge \pi \right)\end{aligned}}} Such a simplification of the joint distribution as a product of simpler distributions is called a decomposition, derived using the chain rule. This ensures that each variable appears at the most once on the left of a conditioning bar, which is the necessary and sufficient condition to write mathematically valid decompositions. ==== Forms ==== Each distribution P ( L k ∣ R k ∧ δ ∧ π ) {\displaystyle P\left(L_{k}\mid R_{k}\wedge \delta \wedge \pi \right)} appearing in the product is then associated with either a parametric form (i.e., a function f μ ( L k ) {\displaystyle f_{\mu }\left(L_{k}\right)} ) or a question to another Bayesian program P ( L k ∣ R k ∧ δ ∧ π ) = P ( L ∣ R ∧ δ ^ ∧ π ^ ) {\displaystyle P\left(L_{k}\mid R_{k}\wedge \delta \wedge \pi \right)=P\left(L\mid R\wedge {\widehat {\delta }}\wedge {\widehat {\pi }}\right)} . When it is a form f μ ( L k ) {\displaystyle f_{\mu }\left(L_{k}\right)} , in general, μ {\displaystyle \mu } is a vector of parameters that may depend on R k {\displaystyle R_{k}} or δ {\displaystyle \delta } or both. Learning takes place when some of these parameters are computed using the data set δ {\displaystyle \delta } . An important feature of Bayesian programming is this capacity to use questions to other Bayesian programs as components of the definition of a new Bayesian program. P ( L k ∣ R k ∧ δ ∧ π ) {\displaystyle P\left(L_{k}\mid R_{k}\wedge \delta \wedge \pi \right)} is obtained by some inferences done by another Bayesian program defined by the specifications π ^ {\displaystyle {\widehat {\pi }}} and the data δ ^ {\displaystyle {\widehat {\delta }}} . This is similar to calling a subroutine in classical programming and provides an easy way to build hierarchical models. === Question === Given a description (i.e., P ( X 1 ∧ X 2 ∧ ⋯ ∧ X N ∣ δ ∧ π ) {\displaystyle P\left(X_{1}\wedge X_{2}\wedge \cdots \wedge X_{N}\mid \delta \wedge \pi \right)} ), a question is obtained by partitioning { X 1 , X 2 , ⋯ , X N } {\displaystyle \left\{X_{1},X_{2},\cdots ,X_{N}\right\}} into three sets: the searched variables, the known variables and the free variables. The 3 variables S e a r c h e d {\displaystyle Searched} , K n o w n {\displaystyle Known} and F r e e {\displaystyle Free} are defined as the conjunction of the variables belonging to these sets. A question is defined as the set of distributions: P ( S e a r c h e d ∣ Known ∧ δ ∧ π ) {\displaystyle P\left(Searched\mid {\text{Known}}\wedge \delta \wedge \pi \right)} made of many "instantiated questions" as the cardinal of K n o w n {\displaystyle Known} , each instantiated question being the distribution: P ( Searched ∣ Known ∧ δ ∧ π ) {\displaystyle P\left({\text{Searched}}\mid {\text{Known}}\wedge \delta \wedge \pi \right)} === Inference === Given the joint distribution P ( X 1 ∧ X 2 ∧ ⋯ ∧ X N ∣ δ ∧ π ) {\displaystyle P\left(X_{1}\wedge X_{2}\wedge \cdots \wedge X_{N}\mid \delta \wedge \pi \right)} , it is always possible to compute any possible question using the following general inference: P ( Searched ∣ Known ∧ δ ∧ π ) = ∑ Free [ P ( Searched ∧ Free ∣ Known ∧ δ ∧ π ) ] = ∑ Free [ P ( Searched ∧ Free ∧ Known ∣ δ ∧ π ) ] P ( Known ∣ δ ∧ π ) = ∑ Free [ P ( Searched ∧ Free ∧ Known ∣ δ ∧ π ) ] ∑ Free ∧ Searched [ P ( Searched ∧ Free ∧ Known ∣ δ ∧ π ) ] = 1 Z × ∑ Free [ P ( Searched ∧ Free ∧ Known ∣ δ ∧ π ) ] {\displaystyle {\begin{aligned}&P\left({\text{Searched}}\mid {\text{Known}}\wedge \delta \wedge \pi \right)\\={}&\sum _{\text{Free}}\left[P\left({\text{Searched}}\wedge {\text{Free}}\mid {\text{Known}}\wedge \delta \wedge \

Inductive programming

Inductive programming (IP) is a special area of automatic programming, covering research from artificial intelligence and programming, which addresses learning of typically declarative (logic or functional) and often recursive programs from incomplete specifications, such as input/output examples or constraints. Depending on the programming language used, there are several kinds of inductive programming. Inductive functional programming, which uses functional programming languages such as Lisp or Haskell, and most especially inductive logic programming, which uses logic programming languages such as Prolog and other logical representations such as description logics, have been more prominent, but other (programming) language paradigms have also been used, such as constraint programming or probabilistic programming. == Definition == Inductive programming incorporates all approaches which are concerned with learning programs or algorithms from incomplete (formal) specifications. Possible inputs in an IP system are a set of training inputs and corresponding outputs or an output evaluation function, describing the desired behavior of the intended program, traces or action sequences which describe the process of calculating specific outputs, constraints for the program to be induced concerning its time efficiency or its complexity, various kinds of background knowledge such as standard data types, predefined functions to be used, program schemes or templates describing the data flow of the intended program, heuristics for guiding the search for a solution or other biases. Output of an IP system is a program in some arbitrary programming language containing conditionals and loop or recursive control structures, or any other kind of Turing-complete representation language. In many applications the output program must be correct with respect to the examples and partial specification, and this leads to the consideration of inductive programming as a special area inside automatic programming or program synthesis, usually opposed to 'deductive' program synthesis, where the specification is usually complete. In other cases, inductive programming is seen as a more general area where any declarative programming or representation language can be used and we may even have some degree of error in the examples, as in general machine learning, the more specific area of structure mining or the area of symbolic artificial intelligence. A distinctive feature is the number of examples or partial specification needed. Typically, inductive programming techniques can learn from just a few examples. The diversity of inductive programming usually comes from the applications and the languages that are used: apart from logic programming and functional programming, other programming paradigms and representation languages have been used or suggested in inductive programming, such as functional logic programming, constraint programming, probabilistic programming, abductive logic programming, modal logic, action languages, agent languages and many types of imperative languages. == History == The early works of Plotkin, and his "relative least general generalization (rlgg)", had an enormous impact in inductive logic programming. There were some encouraging results on learning recursive Prolog programs such as quicksort from examples together with suitable background knowledge, for example with GOLEM. However, after initial success, the community got disappointed by limited progress about the induction of recursive programs with ILP less and less focusing on recursive programs and leaning more and more towards a machine learning setting with applications in relational data mining and knowledge discovery. In parallel to work in ILP, Koza proposed genetic programming in the early 1990s as a generate-and-test based approach to learning programs. The idea of genetic programming was further developed into the inductive programming system ADATE and the systematic-search-based system MagicHaskeller. Here again, functional programs are learned from sets of positive examples together with an output evaluation (fitness) function which specifies the desired input/output behavior of the program to be learned. The early work in grammar induction (also known as grammatical inference) is related to inductive programming, as rewriting systems or logic programs can be used to represent production rules. In fact, early works in inductive inference considered grammar induction and Lisp program inference as basically the same problem. The results in terms of learnability were related to classical concepts, such as identification-in-the-limit, as introduced in the seminal work of Gold. More recently, the language learning problem was addressed by the inductive programming community. In the recent years, the classical approaches have been resumed and advanced with great success. Therefore, the synthesis problem has been reformulated on the background of constructor-based term rewriting systems taking into account modern techniques of functional programming, as well as moderate use of search-based strategies and usage of background knowledge as well as automatic invention of subprograms. Many new and successful applications have recently appeared beyond program synthesis, most especially in the area of data manipulation, programming by example and cognitive modelling (see below). Other ideas have also been explored with the common characteristic of using declarative languages for the representation of hypotheses. For instance, the use of higher-order features, schemes or structured distances have been advocated for a better handling of recursive data types and structures; abstraction has also been explored as a more powerful approach to cumulative learning and function invention. One powerful paradigm that has been recently used for the representation of hypotheses in inductive programming (generally in the form of generative models) is probabilistic programming (and related paradigms, such as stochastic logic programs and Bayesian logic programming). == Application areas == The first workshop on Approaches and Applications of Inductive Programming (AAIP) Archived 2016-03-03 at the Wayback Machine held in conjunction with ICML 2005 identified all applications where "learning of programs or recursive rules are called for, [...] first in the domain of software engineering where structural learning, software assistants and software agents can help to relieve programmers from routine tasks, give programming support for end users, or support of novice programmers and programming tutor systems. Further areas of application are language learning, learning recursive control rules for AI-planning, learning recursive concepts in web-mining or for data-format transformations". Since then, these and many other areas have shown to be successful application niches for inductive programming, such as end-user programming, the related areas of programming by example and programming by demonstration, and intelligent tutoring systems. Other areas where inductive inference has been recently applied are knowledge acquisition, artificial general intelligence, reinforcement learning and theory evaluation, and cognitive science in general. There may also be prospective applications in intelligent agents, games, robotics, personalisation, ambient intelligence and human interfaces.

Kolmogorov–Arnold Networks

Kolmogorov–Arnold Networks (KANs) are a type of artificial neural network architecture inspired by the Kolmogorov–Arnold representation theorem, also known as the superposition theorem. Unlike traditional multilayer perceptrons (MLPs), which rely on fixed activation functions and linear weights, KANs replace each weight with a learnable univariate function, often represented using splines. == History == KANs (Kolmogorov–Arnold Networks) were proposed by Liu et al. (2024) as a generalization of the Kolmogorov–Arnold representation theorem (KART), aiming to outperform MLPs in small-scale AI and scientific tasks. Before KANs, numerous studies explored KART's connections to neural networks or used it as a basis for designing new network architectures. In the 1980s and 1990s, early research applied KART to neural network design. Kůrková et al. (1992), Hecht-Nielsen (1987), and Nees (1994) established theoretical foundations for multilayer networks based on KART. Igelnik et al. (2003) introduced the Kolmogorov Spline Network using cubic splines to model complex functions. Sprecher (1996, 1997) introduced numerical methods for building network layers, while Nakamura et al. (1993) created activation functions with guaranteed approximation accuracy. These works linked KART's theoretical potential with practical neural network implementation. KART has also been used in other computational and theoretical fields. Coppejans (2004) developed nonparametric regression estimators using B-splines, Bryant (2008) applied it to high-dimensional image tasks, Liu (2015) investigated theoretical applications in optimal transport and image encryption, and more recently, Polar and Poluektov (2021) used Urysohn operators for efficient KART construction, while Fakhoury et al. (2022) introduced ExSpliNet, integrating KART with probabilistic trees and multivariate B-splines for improved function approximation. == Architecture == KANs are based on the Kolmogorov–Arnold representation theorem, which was linked to the 13th Hilbert problem. Given x = ( x 1 , x 2 , … , x n ) {\displaystyle x=(x_{1},x_{2},\dots ,x_{n})} consisting of n variables, a multivariate continuous function f ( x ) {\displaystyle f(x)} can be represented as: f ( x ) = f ( x 1 , … , x n ) = ∑ q = 1 2 n + 1 Φ q ( ∑ p = 1 n φ q , p ( x p ) ) {\displaystyle f(x)=f(x_{1},\dots ,x_{n})=\sum _{q=1}^{2n+1}\Phi _{q}\left(\sum _{p=1}^{n}\varphi _{q,p}(x_{p})\right)} (1) This formulation contains two nested summations: an outer and an inner sum. The outer sum ∑ q = 1 2 n + 1 {\displaystyle \sum _{q=1}^{2n+1}} aggregates 2 n + 1 {\displaystyle 2n+1} terms, each involving a function Φ q : R → R {\displaystyle \Phi _{q}:\mathbb {R} \to \mathbb {R} } . The inner sum ∑ p = 1 n {\displaystyle \sum _{p=1}^{n}} computes n terms for each q, where each term φ q , p : [ 0 , 1 ] → R {\displaystyle \varphi _{q,p}:[0,1]\to \mathbb {R} } is a continuous function of the single variable x p {\displaystyle x_{p}} . The inner continuous functions φ q , p {\displaystyle \varphi _{q,p}} are universal, independent of f {\displaystyle f} , while the outer functions Φ q {\displaystyle \Phi _{q}} depend on the specific function f {\displaystyle f} being represented. The representation (1) holds for all multivariate functions f {\displaystyle f} as proved in . If f {\displaystyle f} is continuous, then the outer functions Φ q {\displaystyle \Phi _{q}} are continuous; if f {\displaystyle f} is discontinuous, then the corresponding Φ q {\displaystyle \Phi _{q}} are generally discontinuous, while the inner functions φ q , p {\displaystyle \varphi _{q,p}} remain the same universal functions. Liu et al. proposed the name KAN. A general KAN network consisting of L layers takes x to generate the output as: K A N ( x ) = ( Φ L − 1 ∘ Φ L − 2 ∘ ⋯ ∘ Φ 1 ∘ Φ 0 ) x {\displaystyle \mathrm {KAN} (x)=(\Phi ^{L-1}\circ \Phi ^{L-2}\circ \cdots \circ \Phi ^{1}\circ \Phi ^{0})x} (3) Here, Φ l {\displaystyle \Phi ^{l}} is the function matrix of the l-th KAN layer or a set of pre-activations. Let i denote the neuron of the l-th layer and j the neuron of the (l+1)-th layer. The activation function φ j , i l {\displaystyle \varphi _{j,i}^{l}} connects (l, i) to (l+1, j): φ j , i l , l = 0 , … , L − 1 , i = 1 , … , n l , j = 1 , … , n l + 1 {\displaystyle \varphi _{j,i}^{l},\quad l=0,\dots ,L-1,\;i=1,\dots ,n_{l},\;j=1,\dots ,n_{l+1}} (4) where nl is the number of nodes of the l-th layer. Thus, the function matrix Φ l {\displaystyle \Phi ^{l}} can be represented as an n l + 1 × n l {\displaystyle n_{l+1}\times n_{l}} matrix of activations: x l + 1 = ( φ 1 , 1 l ( ⋅ ) φ 1 , 2 l ( ⋅ ) ⋯ φ 1 , n l l ( ⋅ ) φ 2 , 1 l ( ⋅ ) φ 2 , 2 l ( ⋅ ) ⋯ φ 2 , n l l ( ⋅ ) ⋮ ⋮ ⋱ ⋮ φ n l + 1 , 1 l ( ⋅ ) φ n l + 1 , 2 l ( ⋅ ) ⋯ φ n l + 1 , n l l ( ⋅ ) ) x l {\displaystyle x^{l+1}={\begin{pmatrix}\varphi _{1,1}^{l}(\cdot )&\varphi _{1,2}^{l}(\cdot )&\cdots &\varphi _{1,n_{l}}^{l}(\cdot )\\\varphi _{2,1}^{l}(\cdot )&\varphi _{2,2}^{l}(\cdot )&\cdots &\varphi _{2,n_{l}}^{l}(\cdot )\\\vdots &\vdots &\ddots &\vdots \\\varphi _{n_{l+1},1}^{l}(\cdot )&\varphi _{n_{l+1},2}^{l}(\cdot )&\cdots &\varphi _{n_{l+1},n_{l}}^{l}(\cdot )\end{pmatrix}}x^{l}} == Implementations == To make the KAN layers optimizable, the inner function is formed by the combination of spline and basic functions as the formula: φ ( x ) = w b b ( x ) + w s spline ( x ) {\displaystyle \varphi (x)=w_{b}\,b(x)+w_{s}\,{\text{spline}}(x)} where b ( x ) {\displaystyle b(x)} is the basic function, usually defined as s i l u ( x ) = x / ( 1 + e x ) {\displaystyle silu(x)=x/(1+e^{x})} and w b {\displaystyle w_{b}} is the base weight matrix. Also, w s {\displaystyle w_{s}} is the spline weight matrix and spline ( x ) {\displaystyle {\text{spline}}(x)} is the spline function. The spline function can be a sum of B-splines. spline ( x ) = ∑ i c i B i ( x ) {\displaystyle {\text{spline}}(x)=\sum _{i}c_{i}B_{i}(x)} Many studies suggested to use other polynomial and curve functions instead of B-spline to create new KAN variants. == Functions used == The choice of functional basis strongly influences the performance of KANs. Common function families include: B-splines: Provide locality, smoothness, and interpretability; they are the most widely used in current implementations. RBFs (include Gaussian RBFs): Capture localized features in data and are effective in approximating functions with non-linear or clustered structures. Chebyshev polynomials: Offer efficient approximation with minimized error in the maximum norm, making them useful for stable function representation. Rational function: Useful for approximating functions with singularities or sharp variations, as they can model asymptotic behavior better than polynomials. Fourier series: Capture periodic patterns effectively and are particularly useful in domains such as physics-informed machine learning. Wavelet functions (DoG, Mexican hat, Morlet, and Shannon): Used for feature extraction as they can capture both high-frequency and low-frequency data components. Piecewise linear functions: Provide efficient approximation for multivariate functions in KANs. == Usage == In some modern neural architectures like convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformers, KANs are typically used as drop-in substitutes for MLP layers. Despite KANs' general-purpose design, researchers have created and used them for a number of tasks: Scientific machine learning (SciML): Function fitting, partial differential equations (PDEs) and physical/mathematical laws. Continual learning: KANs better preserve previously learned information during incremental updates, avoiding catastrophic forgetting due to the locality of spline adjustments. Graph neural networks: Extensions such as Kolmogorov–Arnold Graph Neural Networks (KA-GNNs) integrate KAN modules into message-passing architectures, showing improvements in molecular property prediction tasks. Sensor data processing: Kolmogorov–Arnold Networks (KANs) have recently been applied to sensor data processing due to their ability to model complex nonlinear relationships with relatively few parameters and improved interpretability compared to conventional multilayer perceptrons. Applications include industrial soft sensors, biomedical signal analysis, remote sensing, and environmental monitoring systems. == Drawbacks == KANs can be computationally intensive and require a large number of parameters due to their use of polynomial functions to capture data.

Drops (app)

Drops is a language learning app that was created in Estonia by Daniel Farkas and Mark Szulyovszky in 2015. It is the second product from the company, after their first app, LearnInvisible, had issues in retaining a user's engagement over the required time period. The languages available include Native Hawaiian and Māori, and was classified as one of the fifty "Most Innovative Companies" for 2019 by Fast Company. The company partnered with Global Eagle Entertainment to include Travel Talk, a feature intended to focus on words and phrases frequently used by travelers. At the beginning of the COVID-19 pandemic in March 2020, the number of users increased by 55 percent in the United States and 92 percent in the United Kingdom. Droplets, a language app for children, includes profiles for multiple teachers working with remote students. The company also produces an app called Scripts, intended to help users learn to write alphabets. The app was purchased by the Norwegian company Kahoot! on 24 November 2020.

Tensor (machine learning)

In machine learning, the term tensor informally refers to two different concepts: (i) a way of organizing data and (ii) a multilinear (tensor) transformation. Data may be organized in a multidimensional array (M-way array), informally referred to as a "data tensor"; however, in the strict mathematical sense, a tensor is a multilinear mapping over a set of domain vector spaces to a range vector space. Observations, such as images, movies, volumes, sounds, and relationships among words and concepts, stored in an M-way array ("data tensor"), may be analyzed either by artificial neural networks or tensor methods. Tensor decomposition factors data tensors into smaller tensors. Operations on data tensors can be expressed in terms of matrix multiplication and the Kronecker product. The computation of gradients, a crucial aspect of backpropagation, can be performed using software libraries such as PyTorch and TensorFlow. Computations are often performed on graphics processing units (GPUs) using CUDA, and on dedicated hardware such as Google's Tensor Processing Unit or Nvidia's Tensor core. These developments have greatly accelerated neural network architectures, and increased the size and complexity of models that can be trained. == History == A tensor is by definition a multilinear map. In mathematics, this may express a multilinear relationship between sets of algebraic objects. In physics, tensor fields, considered as tensors at each point in space, are useful in expressing mechanics such as stress or elasticity. In machine learning, the exact use of tensors depends on the statistical approach being used. In 2001, the field of signal processing and statistics were making use of tensor methods. Pierre Comon surveys the early adoption of tensor methods in the fields of telecommunications, radio surveillance, chemometrics and sensor processing. Linear tensor rank methods (such as, Parafac/CANDECOMP) analyzed M-way arrays ("data tensors") composed of higher order statistics that were employed in blind source separation problems to compute a linear model of the data. He noted several early limitations in determining the tensor rank and efficient tensor rank decomposition. In the early 2000s, multilinear tensor methods crossed over into computer vision, computer graphics and machine learning with papers by Vasilescu or in collaboration with Terzopoulos, such as Human Motion Signatures, TensorFaces TensorTextures and Multilinear Projection. Multilinear algebra, the algebra of higher-order tensors, is a suitable and transparent framework for analyzing the multifactor structure of an ensemble of observations and for addressing the difficult problem of disentangling the causal factors based on second order or higher order statistics associated with each causal factor. Tensor (multilinear) factor analysis disentangles and reduces the influence of different causal factors with multilinear subspace learning. When treating an image or a video as a 2- or 3-way array, i.e., "data matrix/tensor", tensor methods reduce spatial or time redundancies as demonstrated by Wang and Ahuja. Yoshua Bengio, Geoff Hinton and their collaborators briefly discuss the relationship between deep neural networks and tensor factor analysis beyond the use of M-way arrays ("data tensors") as inputs. One of the early uses of tensors for neural networks appeared in natural language processing. A single word can be expressed as a vector via Word2vec. Thus a relationship between two words can be encoded in a matrix. However, for more complex relationships such as subject-object-verb, it is necessary to build higher-dimensional networks. In 2009, the work of Sutskever introduced Bayesian Clustered Tensor Factorization to model relational concepts while reducing the parameter space. From 2014 to 2015, tensor methods become more common in convolutional neural networks (CNNs). Tensor methods organize neural network weights in a "data tensor", analyze and reduce the number of neural network weights. Lebedev et al. accelerated CNN networks for character classification (the recognition of letters and digits in images) by using 4D kernel tensors. == Definition == Let F {\displaystyle \mathbb {F} } be a field (such as the real numbers R {\displaystyle \mathbb {R} } or the complex numbers C {\displaystyle \mathbb {C} } ). A tensor T ∈ F I 1 × I 2 × … × I C {\displaystyle {\mathcal {T}}\in {\mathbb {F} }^{I_{1}\times I_{2}\times \ldots \times I_{C}}} is a multilinear transformation from a set of domain vector spaces to a range vector space: T : { F I 1 × F I 2 × … F I C } ↦ F I 0 {\displaystyle {\mathcal {T}}:\{{\mathbb {F} }^{I_{1}}\times {\mathbb {F} }^{I_{2}}\times \ldots {\mathbb {F} }^{I_{C}}\}\mapsto {\mathbb {F} }^{I_{0}}} Here, C {\displaystyle C} and I 0 , I 1 , … , I C {\displaystyle I_{0},I_{1},\ldots ,I_{C}} are positive integers, and ( C + 1 ) {\displaystyle (C+1)} is the number of modes of a tensor (also known as the number of ways of a multi-way array). The dimensionality of mode c {\displaystyle c} is I c {\displaystyle I_{c}} , for 0 ≤ c ≤ C {\displaystyle 0\leq c\leq C} . In statistics and machine learning, an image is vectorized when viewed as a single observation, and a collection of vectorized images is organized as a "data tensor". For example, a set of facial images { d i p , i e , i l , i v ∈ R I X } {\displaystyle \{{\mathbb {d} }_{i_{p},i_{e},i_{l},i_{v}}\in {\mathbb {R} }^{I_{X}}\}} with I X {\displaystyle I_{X}} pixels that are the consequences of multiple causal factors, such as a facial geometry i p ( 1 ≤ i p ≤ I P ) {\displaystyle i_{p}(1\leq i_{p}\leq I_{P})} , an expression i e ( 1 ≤ i e ≤ I E ) {\displaystyle i_{e}(1\leq i_{e}\leq I_{E})} , an illumination condition i l ( 1 ≤ i l ≤ I L ) {\displaystyle i_{l}(1\leq i_{l}\leq I_{L})} , and a viewing condition i v ( 1 ≤ i v ≤ I V ) {\displaystyle i_{v}(1\leq i_{v}\leq I_{V})} may be organized into a data tensor (ie. multiway array) D ∈ R I X × I P × I E × I L × V {\displaystyle {\mathcal {D}}\in {\mathbb {R} }^{I_{X}\times I_{P}\times I_{E}\times I_{L}\times V}} where I P {\displaystyle I_{P}} are the total number of facial geometries, I E {\displaystyle I_{E}} are the total number of expressions, I L {\displaystyle I_{L}} are the total number of illumination conditions, and I V {\displaystyle I_{V}} are the total number of viewing conditions. Tensor factorizations methods such as TensorFaces and multilinear (tensor) independent component analysis factorizes the data tensor into a set of vector spaces that span the causal factor representations, where an image is the result of tensor transformation T {\displaystyle {\mathcal {T}}} that maps a set of causal factor representations to the pixel space. Another approach to using tensors in machine learning is to embed various data types directly. For example, a grayscale image, commonly represented as a discrete 2-way array D ∈ R I R X × I C X {\displaystyle {\mathbf {D} }\in {\mathbb {R} }^{I_{RX}\times I_{CX}}} with dimensionality I R X × I C X {\displaystyle I_{RX}\times I_{CX}} where I R X {\displaystyle I_{RX}} are the number of rows and I C X {\displaystyle I_{CX}} are the number of columns. When an image is treated as 2-way array or 2nd order tensor (i.e. as a collection of column/row observations), tensor factorization methods compute the image column space, the image row space and the normalized PCA coefficients or the ICA coefficients. Similarly, a color image with RGB channels, D ∈ R N × M × 3 . {\displaystyle {\mathcal {D}}\in \mathbb {R} ^{N\times M\times 3}.} may be viewed as a 3rd order data tensor or 3-way array.-------- In natural language processing, a word might be expressed as a vector v {\displaystyle v} via the Word2vec algorithm. Thus v {\displaystyle v} becomes a mode-1 tensor v ↦ A ∈ R N . {\displaystyle v\mapsto {\mathcal {A}}\in \mathbb {R} ^{N}.} The embedding of subject-object-verb semantics requires embedding relationships among three words. Because a word is itself a vector, subject-object-verb semantics could be expressed using mode-3 tensors v a × v b × v c ↦ A ∈ R N × N × N . {\displaystyle v_{a}\times v_{b}\times v_{c}\mapsto {\mathcal {A}}\in \mathbb {R} ^{N\times N\times N}.} In practice the neural network designer is primarily concerned with the specification of embeddings, the connection of tensor layers, and the operations performed on them in a network. Modern machine learning frameworks manage the optimization, tensor factorization and backpropagation automatically. === As unit values === Tensors may be used as the unit values of neural networks which extend the concept of scalar, vector and matrix values to multiple dimensions. The output value of single layer unit y m {\displaystyle y_{m}} is the sum-product of its input units and the connection weights filtered through the activation function f {\displaystyle f} : y m = f ( ∑ n x n u m , n ) , {\displaystyle y_{m}=f\left(\sum _{n}x_{n}u_{m,n}\right),} where y m ∈ R .