Hi uTandem, also known as uTandem, is a free language exchange mobile app. It helps people connect with other language learners to carry out face-to-face language exchange sessions, and it also offers learners lists of businesses in the field of language learning or language exchange.

== Use ==
Hi uTandem is built around the concept of language exchange, a method of language learning based on mutual oral linguistic exchange between partners. Ideally, each partner is a native speaker of the language they are helping their counterpart to learn. The app is designed for users to chat with other users and translate messages, to find suitable language partners, and to locate language schools, bars, cafés and language exchange groups around them.

== Team and development ==
Hi uTandem was released in January 2016. The initial idea was conceived by Alberto Rodríguez as part of a team of eight Spanish youngsters. Hi uTandem belongs to the company Velvor Tech S.L., founded by the same members and registered in Ronda (Spain).

== Reception ==
Hi uTandem was listed on the Top 4 Apps to Learn Languages list by ElPlural.com, and since its launch it has been featured in numerous online and physical sources, including 20 minutos, Europapress, ABC Andalucía and Telefónica's Think Big Blog.
Gas (app)
Gas (sometimes stylized in all caps), formerly known as Melt as well as Crush, was an American anonymous social media app. Launched in August 2022, the app is oriented towards high schoolers. The app was developed by Nikita Bier, Isaiah Turner, and former Facebook engineer Dave Schatz.

Gas was largely based upon the prior tbh app, developed by co-founder Nikita Bier along with Erik Hazzard, Kyle Zaragoza, and Nicolas Ducdodon in September 2017. tbh was acquired by Facebook, Inc. (now Meta Platforms) on October 16, 2017, and was dissolved nearly a year later, in July 2018, owing to low usage. Gas follows a similar purpose to tbh in being a social media app oriented towards high schoolers. In the app, users participate in anonymous polls regarding pre-written complimentary statements about their peers, such as "I'd say yes if (blank) asked me out on a date," "I think (blank) is the coolest kid in school," or "would make an ugly face and still look pretty." Winners of said polls receive a "flame." The name of the app is derived from this, with "gassing someone up" being Gen Z slang for complimenting someone. Users can pay a $6.99 subscription that enables "God Mode," which shows hints regarding who voted for them in a poll.

Gas overtook TikTok and BeReal as the most downloaded app on the Apple App Store in October 2022 (the app is currently not available for Android). The app has over 5.1 million downloads as of early November 2022, and over a million active users and 300 thousand daily downloads as of October 2022. Currently, the app is available in Canada and the majority of the United States.

On January 17, 2023, Gas was acquired by Discord; it would remain a standalone app, and its developers became Discord staff members. On October 18, 2023, Discord announced that service for Gas would be permanently ending effective November 7, 2023, due to a steep decline in users. Effective November 7, the app became completely unusable.
== Controversy regarding human-trafficking ==
Beginning in October 2022, rumors spread largely through TikTok and Snapchat alleged that the app was linked to human trafficking (in particular sex trafficking). According to Bier, the rumor originated with a single user review from China on October 5, and then was disseminated through TikTok accounts with "few to no US teen followers." Although largely dismissed as a hoax by experts, who cite the fact that the app does not log user locations and its general anonymity, the hoax became pervasive to the extent that various police departments, school systems, and local news outlets began issuing warnings regarding the app. For instance, on October 31, 2022, the police department of Piedmont, Oklahoma issued a warning to parents, encouraging them to check their children's phones, while on November 3, the Oklahoma Oktaha Public School system stated in a Facebook post that "Children are being kidnapped in other towns and this new app is thought to be the source of predators finding their location" (both statements have since been retracted by Police Chief Scott Singer and Superintendent Jerry Needham respectively). Additionally, local media outlets such as KOCO in Oklahoma City ran stories making similar statements.

The rumor had a negative impact on the app, with downloads plateauing for a two-week period in late October and with 3% of users reportedly uninstalling the app in a single day. Revenue and ratings have also reportedly dropped, and the company's social media accounts have been bombarded with comments labeling them as sex traffickers. Additionally, the four-person development team has reportedly been bombarded with various death threats as a result.
Quickprop
Quickprop is an iterative method for determining the minimum of the loss function of an artificial neural network, following an algorithm inspired by Newton's method. The algorithm is sometimes classified among the second-order learning methods. It fits a quadratic approximation from the previous gradient step and the current gradient, under the assumption that the loss function is locally approximately quadratic and can be described by an upwardly open parabola. The minimum is sought at the vertex of the parabola, which is expected to be close to the minimum of the loss function. The procedure requires only local information of the artificial neuron to which it is applied. The $k$-th approximation step is given by:

$$\Delta^{(k)}\,w_{ij} = \Delta^{(k-1)}\,w_{ij} \left( \frac{\nabla_{ij}\,E^{(k)}}{\nabla_{ij}\,E^{(k-1)} - \nabla_{ij}\,E^{(k)}} \right)$$

where $w_{ij}$ is the weight of input $i$ of neuron $j$, and $E$ is the loss function. The Quickprop algorithm is an implementation of the error backpropagation algorithm, but the network can behave chaotically during the learning phase due to large step sizes.
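The update rule above can be illustrated with a minimal Python sketch for a single weight. The loss $E(w) = (w - 3)^2$, the initial gradient-descent step used to obtain a first $\Delta w$, and the zero-step fallback when the denominator vanishes are all assumptions made for this illustration; they are not part of the description above.

```python
def quickprop_step(prev_step, grad, prev_grad, eps=1e-12):
    """Weight change for step k, from the change at step k-1 and both gradients."""
    denom = prev_grad - grad
    if abs(denom) < eps:
        # Assumption: the formula is undefined here, so this sketch takes no step.
        return 0.0
    return prev_step * grad / denom


def loss_gradient(w):
    # Gradient of the hypothetical loss E(w) = (w - 3)^2.
    return 2.0 * (w - 3.0)


def minimise(w=0.0, learning_rate=0.1, iterations=5):
    # Quickprop needs a previous step, so bootstrap with one gradient-descent step.
    prev_grad = loss_gradient(w)
    step = -learning_rate * prev_grad
    w += step
    for _ in range(iterations):
        grad = loss_gradient(w)
        step = quickprop_step(step, grad, prev_grad)
        w += step
        prev_grad = grad
    return w


if __name__ == "__main__":
    print(minimise())
```

Because the gradient of a quadratic loss is linear in the weight, the first Quickprop step already lands on the vertex of the parabola (up to rounding). The sketch places no limit on the step size, which is where the large steps mentioned above can arise on losses that are not close to quadratic.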
Linear genetic programming
"Linear genetic programming" is unrelated to "linear programming". Linear genetic programming (LGP) is a particular method of genetic programming wherein computer programs in a population are represented as a sequence of register-based instructions from an imperative programming language or machine language. The adjective "linear" stems from the fact that each LGP program is a sequence of instructions and the sequence of instructions is normally executed sequentially. Like in other programs, the data flow in LGP can be modeled as a graph that will visualize the potential multiple usage of register contents and the existence of structurally noneffective code (introns) which are two main differences of this genetic representation from the more common tree-based genetic programming (TGP) variant. Like other Genetic Programming methods, Linear genetic programming requires the input of data to run the program population on. Then, the output of the program (its behaviour) is judged against some target behaviour, using a fitness function. However, LGP is generally more efficient than tree genetic programming due to its two main differences mentioned above: Intermediate results (stored in registers) can be reused and a simple intron removal algorithm exists that can be executed to remove all non-effective code prior to programs being run on the intended data. These two differences often result in compact solutions and substantial computational savings compared to the highly constrained data flow in trees and the common method of executing all tree nodes in TGP. Furthermore, LGP naturally has multiple outputs by defining multiple output registers and easily cooperates with control flow operations. Linear genetic programming has been applied in many domains, including system modeling and system control with considerable success. 
Linear genetic programming should not be confused with linear tree programs in tree genetic programming, which are programs composed of a variable number of unary functions and a single terminal. Note that linear tree GP differs from bit string genetic algorithms since a population may contain programs of different lengths and there may be more than two types of functions or more than two types of terminals.

== Examples of LGP programs ==
Because LGP programs are basically represented by a linear sequence of instructions, they are simpler to read and to operate on than their tree-based counterparts. For example, a simple program written to solve a Boolean function problem with 3 inputs (in R1, R2, R3) and one output (in R0), could read like this:

R1, R2, R3 have to be declared as input (read-only) registers, while R0 and R4 are declared as calculation (read-write) registers. This program is very simple, having just 5 instructions. But mutation and crossover operators could work to increase the length of the program, as well as the content of each of its instructions. Note that one instruction is non-effective, that is, an intron (marked), since it does not impact the output register R0. Recognition of those instructions is the basis for the intron removal algorithm, which is used to analyze code prior to execution. Technically, this happens by copying an individual and then running the intron removal once. The copy with removed introns is then executed as many times as dictated by the number of training cases. Notably, the original individual is left intact, so as to continue participating in the evolutionary process. It is only the executed copy that is compressed by removing these "structural" introns.

Another simple program, this one written in the LGP language Slash/A, looks like a series of instructions separated by a slash:

By representing such code in bytecode format, i.e. as an array of bytes each representing a different instruction, one can make mutation operations simply by changing an element of such an array.
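The intron removal step described in this section can be sketched in Python as a single backward pass over the instruction list. The five-instruction program below is a hypothetical stand-in and is not the listing from the original example; the tuple layout `(destination, operator, source, source)` and the operator names are likewise assumptions. The sketch covers only straight-line programs without branching.

```python
from itertools import product

# Boolean operators available to the hypothetical instruction set.
OPS = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "XOR": lambda a, b: a != b,
}

# Hypothetical program: R1..R3 are inputs, R0 is the output, R4 is scratch.
PROGRAM = [
    (4, "AND", 1, 2),  # R4 = R1 AND R2
    (0, "OR", 4, 3),   # R0 = R4 OR R3
    (4, "OR", 1, 2),   # R4 = R1 OR R2   (intron: overwritten before being read)
    (4, "XOR", 2, 3),  # R4 = R2 XOR R3
    (0, "AND", 0, 4),  # R0 = R0 AND R4
]


def remove_introns(program, output_registers=(0,)):
    """Return a new list holding only instructions that can affect the outputs."""
    effective = set(output_registers)  # registers whose value still matters
    kept = []
    for dest, op, src_a, src_b in reversed(program):
        if dest in effective:
            effective.discard(dest)
            effective.update((src_a, src_b))
            kept.append((dest, op, src_a, src_b))
    kept.reverse()
    return kept


def run(program, inputs):
    """Execute a program on three Boolean inputs and return register R0."""
    registers = [False, *inputs, False]  # R0, R1, R2, R3, R4
    for dest, op, src_a, src_b in program:
        registers[dest] = OPS[op](registers[src_a], registers[src_b])
    return registers[0]


def same_behaviour(program_a, program_b):
    """Compare two programs on all eight input combinations."""
    return all(
        run(program_a, bits) == run(program_b, bits)
        for bits in product([False, True], repeat=3)
    )


if __name__ == "__main__":
    compact = remove_introns(PROGRAM)
    print(len(PROGRAM), len(compact), same_behaviour(PROGRAM, compact))
```

`remove_introns` builds a separate list, so the original individual stays intact for the evolutionary process while only the compact copy is executed on the training cases.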
Relationship square
In statistics, the relationship square is a graphical representation for use in the factorial analysis of an individuals × variables table. This representation completes classical representations provided by principal component analysis (PCA) or multiple correspondence analysis (MCA), namely those of individuals, of quantitative variables (correlation circle) and of the categories of qualitative variables (at the centroid of the individuals who possess them). It is especially important in factor analysis of mixed data (FAMD) and in multiple factor analysis (MFA).

== Definition of relationship square in the MCA frame ==
The first interest of the relationship square is to represent the variables themselves, not their categories, which is all the more valuable as there are many variables. For this, we calculate, for each qualitative variable $j$ and each factor $F_s$ (the rank-$s$ factor $F_s$ is the vector of coordinates of the individuals along the axis of rank $s$; in PCA, $F_s$ is called the principal component of rank $s$), the square of the correlation ratio between $F_s$ and the variable $j$, usually denoted:

$$\eta^2(j, F_s)$$

Thus, to each factorial plane, we can associate a representation of the qualitative variables themselves. Their coordinates being between 0 and 1, the variables appear in the square having as vertices the points (0,0), (0,1), (1,0) and (1,1).

== Example in MCA ==
Six individuals $(i_1, \ldots, i_6)$ are described by three variables $(q_1, q_2, q_3)$ having respectively 3, 2 and 3 categories.
Example: the individual $i_1$ possesses the category $a$ of $q_1$, $d$ of $q_2$ and $f$ of $q_3$. Applied to these data, the MCA function included in the R package FactoMineR provides the classical graph in Figure 1. The relationship square (Figure 2) makes the classic factorial plane easier to read. It indicates that the first factor is related to the three variables, but especially to $q_3$ (which has a very high coordinate along the first axis) and then to $q_2$; and that the second factor is related only to $q_1$ and $q_3$ (and not to $q_2$, which has a coordinate along axis 2 equal to 0), strongly and equally. All this is visible on the classic graphic, but not so clearly. The role of the relationship square is first to assist in reading a conventional graphic. This is valuable when the variables are numerous and possess numerous coordinates.

== Extensions ==
This representation may be supplemented with those of quantitative variables, the coordinates of the latter being the squares of correlation coefficients (and not of correlation ratios). Thus, the second advantage of the relationship square lies in the ability to represent quantitative and qualitative variables simultaneously. The relationship square can be constructed from any factorial analysis of an individuals × variables table. In particular, it is (or should be) used systematically: in multiple correspondence analysis (MCA); in principal component analysis (PCA) when there are many supplementary variables; in factor analysis of mixed data (FAMD). An extension of this graphic to groups of variables (how to represent a group of variables by a single point?) is used in multiple factor analysis (MFA).

== History ==
The idea of representing the qualitative variables themselves by a point (and not the categories) is due to Brigitte Escofier. The graphic as it is used now was introduced by Brigitte Escofier and Jérôme Pagès in the framework of multiple factor analysis.

== Conclusion ==
In MCA, the relationship square provides a synthetic view of the connections between mixed variables, all the more valuable as there are many variables having many categories. This representation can be useful in any factorial analysis when there are numerous mixed variables, active and/or supplementary.
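The coordinate of a qualitative variable in the relationship square is the squared correlation ratio $\eta^2(j, F_s)$, which is the share of the variance of the factor explained by the categories of the variable. The Python sketch below computes it as the between-category sum of squares divided by the total sum of squares. The category labels and factor scores are hypothetical; they are not the data of the example above, whose table is not reproduced here.

```python
def correlation_ratio_squared(categories, scores):
    """Squared correlation ratio between a qualitative variable and a factor.

    categories: one category label per individual
    scores:     coordinate of each individual on the factor F_s
    """
    n = len(scores)
    grand_mean = sum(scores) / n
    total_ss = sum((s - grand_mean) ** 2 for s in scores)
    if total_ss == 0:
        raise ValueError("the factor has zero variance")

    # Group the factor scores by category.
    groups = {}
    for label, score in zip(categories, scores):
        groups.setdefault(label, []).append(score)

    # Between-category sum of squares, weighted by category size.
    between_ss = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return between_ss / total_ss


# Hypothetical variable with two categories observed on six individuals.
q = ["d", "d", "d", "e", "e", "e"]
factor_1 = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # separates the two categories
factor_2 = [1.0, -1.0, 0.0, 1.0, -1.0, 0.0]   # same mean in both categories

if __name__ == "__main__":
    print(correlation_ratio_squared(q, factor_1), correlation_ratio_squared(q, factor_2))
```

With these made-up scores the variable would be plotted at (1, 0) in the relationship square of the plane formed by the two factors: fully related to the first factor and unrelated to the second.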
Line detection
In image processing, line detection is an algorithm that takes a collection of n edge points and finds all the lines on which these edge points lie. The most popular line detectors are the Hough transform and convolution-based techniques.

== Hough transform ==
The Hough transform can be used to detect lines; its output is a parametric description of the lines in an image, for example ρ = r cos(θ) + c sin(θ). A line in a row-and-column-based image space can be defined by ρ, the distance from the origin to the line along a perpendicular to the line, and θ, the angle of the perpendicular projection from the origin to the line, measured in degrees clockwise from the positive row axis. Therefore, a line in the image corresponds to a point in the Hough space. The Hough space for lines thus has the two dimensions θ and ρ, and a line is represented by a single point corresponding to a unique set of these parameters. The Hough transform can then be implemented by choosing a set of values of ρ and θ to use. For each pixel (r, c) in the image, compute ρ = r cos(θ) + c sin(θ) for each value of θ, and increment the entry at the corresponding position in the (ρ, θ) array. At the end, the values of (ρ, θ) with the highest values in the array will correspond to the strongest lines in the image.

== Convolution-based technique ==
In a convolution-based technique, the line detector operator consists of convolution masks tuned to detect the presence of lines of a particular width n and orientation θ. Here are the four convolution masks to detect horizontal, vertical, oblique (+45 degrees), and oblique (−45 degrees) lines in an image.
(a) Horizontal mask (R1); (b) vertical mask (R3); (c) oblique mask, +45 degrees (R2); (d) oblique mask, −45 degrees (R4).

In practice, the masks are run over the image and the responses are combined according to the following equation:

R(x, y) = max(|R1(x, y)|, |R2(x, y)|, |R3(x, y)|, |R4(x, y)|)

If R(x, y) > T, then a discontinuity is detected at (x, y).

If the mask is overlaid on the image (horizontal line), the coincident values are multiplied, and all these results are summed, the output is the convolved image. For example, (−1)(0)+(−1)(0)+(−1)(0) + (2)(1)+(2)(1)+(2)(1) + (−1)(0)+(−1)(0)+(−1)(0) = 6 is the value at the second row, second column of the convolved image, counting from the upper left corner of the horizontal lines.

== Example ==
The masks above are tuned for light lines against a dark background, and would give a big negative response to dark lines against a light background.

== Code example ==
The code was used to detect only the vertical lines in an image using Matlab. The original image is the one on the top and the result is below it. As can be seen in the result image, only the vertical lines were detected.
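The mask arithmetic and the combined response R(x, y) can be reproduced with a short Python sketch (Python is used here for illustration; this is not the Matlab code referred to above). The horizontal mask follows from the worked sum in the text. The other three masks are the commonly used 3×3 line masks and are an assumption, since the original mask figures are not reproduced; which diagonal mask is labelled +45 degrees depends on the axis convention. The 5×5 test image and the threshold value are hypothetical.

```python
# 3x3 line-detection masks, keyed by orientation (R1..R4 in the text).
MASKS = {
    "horizontal": [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]],  # R1
    "plus_45": [[-1, -1, 2], [-1, 2, -1], [2, -1, -1]],     # R2 (assumed layout)
    "vertical": [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]],    # R3 (assumed layout)
    "minus_45": [[2, -1, -1], [-1, 2, -1], [-1, -1, 2]],    # R4 (assumed layout)
}

# Hypothetical image: a light horizontal line (1s) on a dark background (0s).
IMAGE = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]


def mask_response(image, mask, row, col):
    """Overlay the mask centred on (row, col), multiply coincident values, and sum."""
    return sum(
        mask[i][j] * image[row - 1 + i][col - 1 + j]
        for i in range(3)
        for j in range(3)
    )


def line_response(image, row, col):
    """R = max(|R1|, |R2|, |R3|, |R4|) at an interior pixel."""
    return max(abs(mask_response(image, m, row, col)) for m in MASKS.values())


def is_line_pixel(image, row, col, threshold):
    """A pixel is flagged as a discontinuity when R exceeds the threshold T."""
    return line_response(image, row, col) > threshold


if __name__ == "__main__":
    print(mask_response(IMAGE, MASKS["horizontal"], 2, 2))  # the worked sum from the text
    print(is_line_pixel(IMAGE, 2, 2, threshold=4))
```

All four masks are unchanged by a 180-degree rotation, so overlaying them without flipping gives the same result as a true convolution. The sketch evaluates interior pixels only and leaves border handling out.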
Variational autoencoder
In machine learning, a variational autoencoder (VAE) is an artificial neural network architecture introduced by Diederik P. Kingma and Max Welling in 2013. It is part of the families of probabilistic graphical models and variational Bayesian methods.

In addition to being seen as an autoencoder neural network architecture, variational autoencoders can also be studied within the mathematical formulation of variational Bayesian methods, connecting a neural encoder network to its decoder through a probabilistic latent space (for example, as a multivariate Gaussian distribution) that corresponds to the parameters of a variational distribution. Thus, the encoder maps each point (such as an image) from a large complex dataset into a distribution within the latent space, rather than to a single point in that space. The decoder has the opposite function, which is to map from the latent space to the input space, again according to a distribution (although in practice, noise is rarely added during the decoding stage). By mapping a point to a distribution instead of a single point, the network can avoid overfitting the training data. Both networks are typically trained together with the usage of the reparameterization trick, although the variance of the noise model can be learned separately.

Although this type of model was initially designed for unsupervised learning, its effectiveness has been proven for semi-supervised learning and supervised learning.

== Overview of architecture and operation ==
A variational autoencoder is a generative model with a prior and noise distribution respectively. Usually such models are trained using the expectation-maximization meta-algorithm (e.g. probabilistic PCA, (spike & slab) sparse coding). Such a scheme optimizes a lower bound of the data likelihood, which is usually computationally intractable, and in doing so requires the discovery of q-distributions, or variational posteriors.
These q-distributions are normally parameterized for each individual data point in a separate optimization process. However, variational autoencoders use a neural network as an amortized approach to jointly optimize across data points. In that way, the same parameters are reused for multiple data points, which can result in massive memory savings. The first neural network takes as input the data points themselves, and outputs parameters for the variational distribution. As it maps from a known input space to the low-dimensional latent space, it is called the encoder.

The decoder is the second neural network of this model. It is a function that maps from the latent space to the input space, e.g. as the means of the noise distribution. It is possible to use another neural network that maps to the variance; however, this can be omitted for simplicity. In such a case, the variance can be optimized with gradient descent.

To optimize this model, one needs to know two terms: the "reconstruction error", and the Kullback–Leibler divergence (KL-D). Both terms are derived from the free energy expression of the probabilistic model, and therefore differ depending on the noise distribution and the assumed prior of the data, here referred to as the p-distribution. For example, a standard VAE task such as ImageNet is typically assumed to have Gaussian-distributed noise; however, tasks such as binarized MNIST require Bernoulli noise. The KL-D from the free energy expression maximizes the probability mass of the q-distribution that overlaps with the p-distribution, which unfortunately can result in mode-seeking behaviour. The "reconstruction" term is the remainder of the free energy expression, and requires a sampling approximation to compute its expectation value. More recent approaches replace the Kullback–Leibler divergence (KL-D) with various statistical distances; see "Statistical distance VAE variants" below.
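The two terms can be made concrete with a minimal Python sketch for one common special case. It assumes a diagonal Gaussian q-distribution given by a mean and a log-variance per latent dimension, a standard normal prior, and Gaussian noise with unit variance, so that the reconstruction term reduces to half the squared error (additive constants are dropped). The `decode` argument is a hypothetical stand-in for the decoder network, and all function names are illustrative.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1): the reparameterization trick."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL divergence from N(mu, diag(exp(log_var))) to N(0, I)."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def squared_error(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def negative_elbo_estimate(x, mu, log_var, decode, rng):
    """Single-sample estimate: reconstruction error plus KL term."""
    z = reparameterize(mu, log_var, rng)  # sampling approximation of the expectation
    x_hat = decode(z)                     # decoder output, used as the noise mean
    return 0.5 * squared_error(x, x_hat) + kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    identity_decoder = lambda z: list(z)  # toy decoder, for illustration only
    print(negative_elbo_estimate([0.5, -0.5], [0.4, -0.6], [-2.0, -2.0], identity_decoder, rng))
```

In a real model `mu` and `log_var` would be outputs of the encoder network for the given `x`, and the estimate would be minimized by gradient descent over the parameters of both networks.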
== Formulation ==
From the point of view of probabilistic modeling, one wants to maximize the likelihood of the data $x$ by their chosen parameterized probability distribution $p_\theta(x) = p(x|\theta)$. This distribution is usually chosen to be a Gaussian $N(x|\mu,\sigma)$, which is parameterized by $\mu$ and $\sigma$ respectively, and as a member of the exponential family it is easy to work with as a noise distribution. Simple distributions are easy enough to maximize; however, distributions where a prior is assumed over the latents $z$ result in intractable integrals. Let us find $p_\theta(x)$ via marginalizing over $z$:

$$p_\theta(x) = \int_z p_\theta(x,z)\,dz,$$

where $p_\theta(x,z)$ represents the joint distribution under $p_\theta$ of the observable data $x$ and its latent representation or encoding $z$. According to the chain rule, the equation can be rewritten as

$$p_\theta(x) = \int_z p_\theta(x|z)\,p_\theta(z)\,dz$$

In the vanilla variational autoencoder, $z$ is usually taken to be a finite-dimensional vector of real numbers, and $p_\theta(x|z)$ to be a Gaussian distribution. Then $p_\theta(x)$ is a mixture of Gaussian distributions.
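The marginalization over $z$ can be checked numerically in a toy case where the integral happens to be tractable. The Python sketch below assumes a one-dimensional latent with prior $z \sim N(0, 1)$ and a likelihood $p_\theta(x|z) = N(x|z, s^2)$, that is, a hypothetical decoder whose mean is $z$ itself. Under these assumptions the marginal is $N(x|0, 1 + s^2)$, and averaging $p_\theta(x|z_i)$ over prior samples $z_i$ approximates it as an equally weighted mixture of Gaussians.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def marginal_likelihood_mc(x, noise_std, num_samples, rng):
    """Monte Carlo estimate of p(x): the average of p(x|z_i) over z_i ~ p(z)."""
    total = 0.0
    for _ in range(num_samples):
        z = rng.gauss(0.0, 1.0)                        # sample from the prior p(z)
        total += normal_pdf(x, mean=z, std=noise_std)  # likelihood p(x|z)
    return total / num_samples


def marginal_likelihood_exact(x, noise_std):
    """Closed form for this toy model only: x ~ N(0, 1 + noise_std^2)."""
    return normal_pdf(x, mean=0.0, std=math.sqrt(1.0 + noise_std ** 2))


if __name__ == "__main__":
    rng = random.Random(0)
    print(marginal_likelihood_mc(0.5, 1.0, 20000, rng), marginal_likelihood_exact(0.5, 1.0))
```

With a neural network decoder no closed form is available, and sampling from the prior becomes inefficient as the latent dimension grows.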
It is now possible to define the set of relationships between the input data and its latent representation as: the prior $p_\theta(z)$, the likelihood $p_\theta(x|z)$, and the posterior $p_\theta(z|x)$.

Unfortunately, the computation of $p_\theta(z|x)$ is expensive and in most cases intractable. To speed up the calculation and make it feasible, it is necessary to introduce a further function to approximate the posterior distribution as

$$q_\phi(z|x) \approx p_\theta(z|x)$$

with $\phi$ defined as the set of real values that parametrize $q$. This is sometimes called amortized inference, since by "investing" in finding a good $q_\phi$, one can later infer $z$ from $x$ quickly without doing any integrals.

In this way, the problem is to find a good probabilistic autoencoder, in which the conditional likelihood distribution $p_\theta(x|z)$ is computed by the probabilistic decoder, and the approximated posterior distribution $q_\phi(z|x)$ is computed by the probabilistic encoder. Parametrize the encoder as $E_\phi$, and the decoder as $D_\theta$.

== Evidence lower bound (ELBO) ==
Like many deep learning approaches that use gradient-based optimization, VAEs require a differentiable loss function to update the network weights through backpropagation. For variational autoencoders, the idea is to jointly optimize the generative model parameters $\theta$ to reduce the reconstruction error between the input and the output, and $\phi$ to make $q_\phi(z|x)$ as close as possible to $p_\theta(z|x)$.
As reconstruction loss, mean squared error and cross entropy are often used. The Kullback–Leibler divergence $D_{KL}(q_\phi(z|x) \parallel p_\theta(z|x))$ can be used as a loss function to squeeze $q_\phi(z|x)$ under $p_\theta(z|x)$. This divergence loss expands to

$$\begin{aligned}
D_{KL}(q_\phi(z|x) \parallel p_\theta(z|x)) &= \mathbb{E}_{z \sim q_\phi(\cdot|x)}\left[\ln \frac{q_\phi(z|x)}{p_\theta(z|x)}\right] \\
&= \mathbb{E}_{z \sim q_\phi(\cdot|x)}\left[\ln \frac{q_\phi(z|x)\,p_\theta(x)}{p_\theta(x,z)}\right] \\
&= \ln p_\theta(x) + \mathbb{E}_{z \sim q_\phi(\cdot|x)}\left[\ln \frac{q_\phi(z|x)}{p_\theta(x,z)}\right].
\end{aligned}$$

Now, define the evidence lower bound (ELBO):

$$L_{\theta,\phi}(x) := \mathbb{E}_{z \sim q_\phi(\cdot|x)}\left[\ln \frac{p_\theta(x,z)}{q_\phi(z|x)}\right] = \ln p_\theta(x) - D_{KL}(q_\phi(\cdot|x) \parallel p_\theta(\cdot|x))$$

Maximizing the ELBO θ ∗ , ϕ ∗ = argmax θ , ϕ L θ , ϕ ( x ) {\dis