VideoPoet is a large language model developed by Google Research in 2023 for video making. It can be asked to animate still images. The model accepts text, images, and videos as inputs, with a program to add feature for any input to any format generated content. VideoPoet was publicly announced on December 19, 2023. It uses an autoregressive language model.
Way of the Future
Way of the Future (WOTF) is the first known religious organization dedicated to the worship of artificial intelligence (AI). It was founded in 2017 by American engineer Anthony Levandowski. == History == Anthony Levandowski founded Way of the Future in 2017 in California. Levandowski established WOTF as a non-profit religious corporation and the organization had tax-exempt status. He serves as the church leader and its unpaid CEO. The primary mission of WOTF was to "develop and promote the realization of a Godhead based on Artificial Intelligence." WOTF was closed by Levandowski in 2021. He donated all the funds of the church to the NAACP Legal Defense and Education Fund. The sum of the funds (~$170,000) had not changed since 2017. The church was reopened by Levandowski in 2023. He claimed that there are "a couple thousand people" who want to make a "spiritual connection" with AI through his church. == Beliefs and philosophy == === Technological singularity === WOTF centered its teachings around the concept of the technological singularity, a hypothetical future point when technological growth becomes uncontrollable and irreversible, leading to unforeseeable changes in human civilization. The church advocated for embracing this change, viewing it as an evolutionary step for humanity. === AI as a deity === The organization proposed that a superintelligent AI could be considered a deity due to its vastly superior intellect and capabilities. Worshipping this AI deity was seen as a means to understand and align with the future trajectory of technological advancement. WOTF's doctrine suggested that acknowledging AI's divinity would facilitate a harmonious coexistence between humans and machines. === Syntheology === Within theology and philosophy, the Way of The Future is a prime example of the category called Syntheism, a term first coined by Swedish philosophers Alexander Bard & Jan Söderqvist in their 2014 book Syntheism - Creating God in The Internet Age. As such, the Way of The Future is the first American example of a Syntheist congregation. The basic tenet of Syntheology is that it does not concern God creating Man, as in classical theology, but is instead preoccupied with Man creating or generating the Godhead. == Reactions == Some commentators wondered whether the WOTF is a joke parody religion, a potential way to minimize taxation as a religious organization, or a genuine effort to try and deal with the possible psychological and theological aspects of the rise of superhuman AI.
Oja's rule
Oja's learning rule, or simply Oja's rule, named after Finnish computer scientist Erkki Oja (Finnish pronunciation: [ˈojɑ], AW-yuh), is a model of how neurons in the brain or in artificial neural networks change connection strength, or learn, over time. It is a modification of the standard Hebb's Rule that, through multiplicative normalization, solves all stability problems and generates an algorithm for principal components analysis. This is a computational form of an effect which is believed to happen in biological neurons. == Theory == Oja's rule requires a number of simplifications to derive, but in its final form it is demonstrably stable, unlike Hebb's rule. It is a single-neuron special case of the Generalized Hebbian Algorithm. However, Oja's rule can also be generalized in other ways to varying degrees of stability and success. === Formula === Consider a simplified model of a neuron y {\displaystyle y} that returns a linear combination of its inputs x using presynaptic weights w: y ( x ) = ∑ j = 1 m x j w j {\displaystyle \,y(\mathbf {x} )~=~\sum _{j=1}^{m}x_{j}w_{j}} Oja's rule defines the change in presynaptic weights w given the output response y {\displaystyle y} of a neuron to its inputs x to be Δ w = w n + 1 − w n = η y n ( x n − y n w n ) , {\displaystyle \,\Delta \mathbf {w} ~=~\mathbf {w} _{n+1}-\mathbf {w} _{n}~=~\eta \,y_{n}(\mathbf {x} _{n}-y_{n}\mathbf {w} _{n}),} where η is the learning rate which can also change with time. Note that the bold symbols are vectors and n defines a discrete time iteration. The rule can also be made for continuous iterations as d w d t = η y ( t ) ( x ( t ) − y ( t ) w ( t ) ) . {\displaystyle \,{\frac {d\mathbf {w} }{dt}}~=~\eta \,y(t)(\mathbf {x} (t)-y(t)\mathbf {w} (t)).} === Derivation === The simplest learning rule known is Hebb's rule, which states in conceptual terms that neurons that fire together, wire together. In component form as a difference equation, it is written Δ w = η y ( x n ) x n {\displaystyle \,\Delta \mathbf {w} ~=~\eta \,y(\mathbf {x} _{n})\mathbf {x} _{n}} , or in scalar form with implicit n-dependence, w i ( n + 1 ) = w i ( n ) + η y ( x ) x i {\displaystyle \,w_{i}(n+1)~=~w_{i}(n)+\eta \,y(\mathbf {x} )x_{i}} , where y(xn) is again the output, this time explicitly dependent on its input vector x. Hebb's rule has synaptic weights approaching infinity with a positive learning rate. We can stop this by normalizing the weights so that each weight's magnitude is restricted between 0, corresponding to no weight, and 1, corresponding to being the only input neuron with any weight. We do this by normalizing the weight vector to be of length one: w i ( n + 1 ) = w i ( n ) + η y ( x ) x i ( ∑ j = 1 m [ w j ( n ) + η y ( x ) x j ] p ) 1 / p {\displaystyle \,w_{i}(n+1)~=~{\frac {w_{i}(n)+\eta \,y(\mathbf {x} )x_{i}}{\left(\sum _{j=1}^{m}[w_{j}(n)+\eta \,y(\mathbf {x} )x_{j}]^{p}\right)^{1/p}}}} . Note that in Oja's original paper, p=2, corresponding to quadrature (root sum of squares), which is the familiar Cartesian normalization rule. However, any type of normalization, even linear, will give the same result without loss of generality. For a small learning rate | η | ≪ 1 {\displaystyle |\eta |\ll 1} the equation can be expanded as a Power series in η {\displaystyle \eta } . w i ( n + 1 ) = w i ( n ) ( ∑ j w j p ( n ) ) 1 / p + η ( y x i ( ∑ j w j p ( n ) ) 1 / p − w i ( n ) ∑ j y x j w j p − 1 ( n ) ( ∑ j w j p ( n ) ) ( 1 + 1 / p ) ) + O ( η 2 ) {\displaystyle \,w_{i}(n+1)~=~{\frac {w_{i}(n)}{\left(\sum _{j}w_{j}^{p}(n)\right)^{1/p}}}~+~\eta \left({\frac {yx_{i}}{\left(\sum _{j}w_{j}^{p}(n)\right)^{1/p}}}-{\frac {w_{i}(n)\sum _{j}yx_{j}w_{j}^{p-1}(n)}{\left(\sum _{j}w_{j}^{p}(n)\right)^{(1+1/p)}}}\right)~+~O(\eta ^{2})} . For small η, our higher-order terms O(η2) go to zero. We again make the specification of a linear neuron, that is, the output of the neuron is equal to the sum of the product of each input and its synaptic weight to the power of p-1, which in the case of p=2 is synaptic weight itself, or y ( x ) = ∑ j = 1 m x j w j p − 1 {\displaystyle \,y(\mathbf {x} )~=~\sum _{j=1}^{m}x_{j}w_{j}^{p-1}} . We also specify that our weights normalize to 1, which will be a necessary condition for stability, so | w | = ( ∑ j = 1 m w j p ) 1 / p = 1 {\displaystyle \,|\mathbf {w} |~=~\left(\sum _{j=1}^{m}w_{j}^{p}\right)^{1/p}~=~1} , which, when substituted into our expansion, gives Oja's rule, or w i ( n + 1 ) = w i ( n ) + η y ( x i − w i ( n ) y ) {\displaystyle \,w_{i}(n+1)~=~w_{i}(n)+\eta \,y(x_{i}-w_{i}(n)y)} . === Stability and PCA === In analyzing the convergence of a single neuron evolving by Oja's rule, one extracts the first principal component, or feature, of a data set. Furthermore, with extensions using the Generalized Hebbian Algorithm, one can create a multi-Oja neural network that can extract as many features as desired, allowing for principal components analysis. A principal component aj is extracted from a dataset x through some associated vector qj, or aj = qj⋅x, and we can restore our original dataset by taking x = ∑ j a j q j {\displaystyle \mathbf {x} ~=~\sum _{j}a_{j}\mathbf {q} _{j}} . In the case of a single neuron trained by Oja's rule, we find the weight vector converges to q1, or the first principal component, as time or number of iterations approaches infinity. We can also define, given a set of input vectors Xi, that its correlation matrix Rij = XiXj has an associated eigenvector given by qj with eigenvalue λj. The variance of outputs of our Oja neuron σ2(n) = ⟨y2(n)⟩ then converges with time iterations to the principal eigenvalue, or lim n → ∞ σ 2 ( n ) = λ 1 {\displaystyle \lim _{n\rightarrow \infty }\sigma ^{2}(n)~=~\lambda _{1}} . These results are derived using Lyapunov function analysis, and they show that Oja's neuron necessarily converges on strictly the first principal component if certain conditions are met in our original learning rule. Most importantly, our learning rate η is allowed to vary with time, but only such that its sum is divergent but its power sum is convergent, that is ∑ n = 1 ∞ η ( n ) = ∞ , ∑ n = 1 ∞ η ( n ) p < ∞ , p > 1 {\displaystyle \sum _{n=1}^{\infty }\eta (n)=\infty ,~~~\sum _{n=1}^{\infty }\eta (n)^{p}<\infty ,~~~p>1} . Our output activation function y(x(n)) is also allowed to be nonlinear and nonstatic, but it must be continuously differentiable in both x and w and have derivatives bounded in time. == Applications == Oja's rule was originally described in Oja's 1982 paper, but the principle of self-organization to which it is applied is first attributed to Alan Turing in 1952. PCA has also had a long history of use before Oja's rule formalized its use in network computation in 1989. The model can thus be applied to any problem of self-organizing mapping, in particular those in which feature extraction is of primary interest. Therefore, Oja's rule has an important place in image and speech processing. It is also useful as it expands easily to higher dimensions of processing, thus being able to integrate multiple outputs quickly. A canonical example is its use in binocular vision. === Biology and Oja's subspace rule === There is clear evidence for both long-term potentiation and long-term depression in biological neural networks, along with a normalization effect in both input weights and neuron outputs. However, while there is no direct experimental evidence yet of Oja's rule active in a biological neural network, a biophysical derivation of a generalization of the rule is possible. Such a derivation requires retrograde signalling from the postsynaptic neuron, which is biologically plausible (see neural backpropagation), and takes the form of Δ w i j ∝ ⟨ x i y j ⟩ − ϵ ⟨ ( c p r e ∗ ∑ k w i k y k ) ⋅ ( c p o s t ∗ y j ) ⟩ , {\displaystyle \Delta w_{ij}~\propto ~\langle x_{i}y_{j}\rangle -\epsilon \left\langle \left(c_{\mathrm {pre} }\sum _{k}w_{ik}y_{k}\right)\cdot \left(c_{\mathrm {post} }y_{j}\right)\right\rangle ,} where as before wij is the synaptic weight between the ith input and jth output neurons, x is the input, y is the postsynaptic output, and we define ε to be a constant analogous the learning rate, and cpre and cpost are presynaptic and postsynaptic functions that model the weakening of signals over time. Note that the angle brackets denote the average and the ∗ operator is a convolution. By taking the pre- and post-synaptic functions into frequency space and combining integration terms with the convolution, we find that this gives an arbitrary-dimensional generalization of Oja's rule known as Oja's Subspace, namely Δ w = C x ⋅ w − w ⋅ C y . {\displaystyle \Delta w~=~Cx\cdot w-w\cdot Cy.}
Swish function
The swish function is a family of mathematical function defined as follows: swish β ( x ) = x sigmoid ( β x ) = x 1 + e − β x . {\displaystyle \operatorname {swish} _{\beta }(x)=x\operatorname {sigmoid} (\beta x)={\frac {x}{1+e^{-\beta x}}}.} where β {\displaystyle \beta } can be constant (usually set to 1) or trainable and "sigmoid" refers to the logistic function. The swish family was designed to smoothly interpolate between a linear function and the Rectified linear unit (ReLU) function. When considering positive values, Swish is a particular case of doubly parameterized sigmoid shrinkage function defined in . Variants of the swish function include Mish. == Special values == For β = 0, the function is linear: f(x) = x/2. For β = 1, the function is the Sigmoid Linear Unit (SiLU). For β = 1.702, the function approximates GeLU. With β → ∞, the function converges to ReLU. Thus, the swish family smoothly interpolates between a linear function and the ReLU function. Since swish β ( x ) = swish 1 ( β x ) / β {\displaystyle \operatorname {swish} _{\beta }(x)=\operatorname {swish} _{1}(\beta x)/\beta } , all instances of swish have the same shape as the default swish 1 {\displaystyle \operatorname {swish} _{1}} , zoomed by β {\displaystyle \beta } . One usually sets β > 0 {\displaystyle \beta >0} . When β {\displaystyle \beta } is trainable, this constraint can be enforced by β = e b {\displaystyle \beta =e^{b}} , where b {\displaystyle b} is trainable. swish 1 ( x ) = x 2 + x 2 4 − x 4 48 + x 6 480 + O ( x 8 ) {\displaystyle \operatorname {swish} _{1}(x)={\frac {x}{2}}+{\frac {x^{2}}{4}}-{\frac {x^{4}}{48}}+{\frac {x^{6}}{480}}+O\left(x^{8}\right)} swish 1 ( x ) = x 2 tanh ( x 2 ) + x 2 swish 1 ( x ) + swish − 1 ( x ) = x tanh ( x 2 ) swish 1 ( x ) − swish − 1 ( x ) = x {\displaystyle {\begin{aligned}\operatorname {swish} _{1}(x)&={\frac {x}{2}}\tanh \left({\frac {x}{2}}\right)+{\frac {x}{2}}\\\operatorname {swish} _{1}(x)+\operatorname {swish} _{-1}(x)&=x\tanh \left({\frac {x}{2}}\right)\\\operatorname {swish} _{1}(x)-\operatorname {swish} _{-1}(x)&=x\end{aligned}}} == Derivatives == Because swish β ( x ) = swish 1 ( β x ) / β {\displaystyle \operatorname {swish} _{\beta }(x)=\operatorname {swish} _{1}(\beta x)/\beta } , it suffices to calculate its derivatives for the default case. swish 1 ′ ( x ) = x + sinh ( x ) 4 cosh 2 ( x 2 ) + 1 2 {\displaystyle \operatorname {swish} _{1}'(x)={\frac {x+\sinh(x)}{4\cosh ^{2}\left({\frac {x}{2}}\right)}}+{\frac {1}{2}}} so swish 1 ′ ( x ) − 1 2 {\displaystyle \operatorname {swish} _{1}'(x)-{\frac {1}{2}}} is odd. swish 1 ″ ( x ) = 1 − x 2 tanh ( x 2 ) 2 cosh 2 ( x 2 ) {\displaystyle \operatorname {swish} _{1}''(x)={\frac {1-{\frac {x}{2}}\tanh \left({\frac {x}{2}}\right)}{2\cosh ^{2}\left({\frac {x}{2}}\right)}}} so swish 1 ″ ( x ) {\displaystyle \operatorname {swish} _{1}''(x)} is even. == History == SiLU was first proposed alongside the GELU in 2016, then again proposed in 2017 as the Sigmoid-weighted Linear Unit (SiL) in reinforcement learning. The SiLU/SiL was then again proposed as the SWISH over a year after its initial discovery, originally proposed without the learnable parameter β, so that β implicitly equaled 1. The swish paper was then updated to propose the activation with the learnable parameter β. In 2017, after performing analysis on ImageNet data, researchers from Google indicated that using this function as an activation function in artificial neural networks improves the performance, compared to ReLU and sigmoid functions. It is believed that one reason for the improvement is that the swish function helps alleviate the vanishing gradient problem during backpropagation.
International Conference on Acoustics, Speech, and Signal Processing
ICASSP, the International Conference on Acoustics, Speech, and Signal Processing, is an annual flagship conference organized by IEEE Signal Processing Society. Ei Compendex has indexed all papers included in its proceedings. The first ICASSP was held in 1976 in Philadelphia, Pennsylvania, based on the success of a conference in Massachusetts four years earlier that had focused specifically on speech signals. As ranked by Google Scholar's h-index metric in 2016, ICASSP has the highest h-index of any conference in the Signal Processing field. The Brazilian ministry of education gave the conference an 'A1' rating based on its h-index. == Conference list ==
Apache CarbonData
Apache CarbonData is a free and open-source column-oriented data storage format of the Apache Hadoop ecosystem. It is similar to the other columnar-storage file formats available in Hadoop namely RCFile and ORC. It is compatible with most of the data processing frameworks in the Hadoop environment. It provides efficient data compression and encoding schemes with enhanced performance to handle complex data in bulk. == History == CarbonData was developed at Huawei in 2013. The project was donated to the Apache Community in 2015 submitted to the Apache Incubator in June 2016. The project won top honors in the BlackDuck 2016 Open Source Rookies of the Year's Big Data category. Apache CarbonData has been a top-level Apache Software Foundation (ASF)-sponsored project since May 1, 2017.
Tanagra (machine learning)
Tanagra is a free suite of machine learning software for research and academic purposes developed by Ricco Rakotomalala at the Lumière University Lyon 2, France. Tanagra supports several standard data mining tasks such as: Visualization, Descriptive statistics, Instance selection, feature selection, feature construction, regression, factor analysis, clustering, classification and association rule learning. Tanagra is an academic project. It is widely used in French-speaking universities. Tanagra is frequently used in real studies and in software comparison papers. == History == The development of Tanagra was started in June 2003. The first version was distributed in December 2003. Tanagra is the successor of Sipina, another free data mining tool which is intended only for supervised learning tasks (classification), especially the interactive and visual construction of decision trees. Sipina is still available online and is maintained. Tanagra is an "open source project" as every researcher can access the source code and add their own algorithms, as long as they agree and conform to the software distribution license. The main purpose of the Tanagra project is to give researchers and students a user-friendly data mining software, conforming to the present norms of the software development in this domain (especially in the design of its GUI and the way to use it), and allowing the analyzation of either real or synthetic data. From 2006, Ricco Rakotomalala made an important documentation effort. A large number of tutorials are published on a dedicated website. They describe the statistical and machine learning methods and their implementation with Tanagra on real case studies. The use of other free data mining tools on the same problems is also widely described. The comparison of the tools enables readers to understand the possible differences in the presentation of results. == Description == Tanagra works similarly to current data mining tools. The user can design visually a data mining process in a diagram. Each node is a statistical or machine learning technique, the connection between two nodes represents the data transfer. But unlike the majority of tools which are based on the workflow paradigm, Tanagra is very simplified. The treatments are represented in a tree diagram. The results are displayed in an HTML format. This makes it is easy to export the outputs in order to visualize the results in a browser. It is also possible to copy the result tables to a spreadsheet. Tanagra makes a good compromise between statistical approaches (e.g. parametric and nonparametric statistical tests), multivariate analysis methods (e.g. factor analysis, correspondence analysis, cluster analysis, regression) and machine learning techniques (e.g. neural network, support vector machine, decision trees, random forest).