AI Chatbot Ethics

AI Chatbot Ethics — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Gradient vector flow

    Gradient vector flow

    Gradient vector flow (GVF), a computer vision framework introduced by Chenyang Xu and Jerry L. Prince, is the vector field that is produced by a process that smooths and diffuses an input vector field. It is usually used to create a vector field from images that points to object edges from a distance. It is widely used in image analysis and computer vision applications for object tracking, shape recognition, segmentation, and edge detection. In particular, it is commonly used in conjunction with active contour model. == Background == Finding objects or homogeneous regions in images is a process known as image segmentation. In many applications, the locations of object edges can be estimated using local operators that yield a new image called an edge map. The edge map can then be used to guide a deformable model, sometimes called an active contour or a snake, so that it passes through the edge map in a smooth way, therefore defining the object itself. A common way to encourage a deformable model to move toward the edge map is to take the spatial gradient of the edge map, yielding a vector field. Since the edge map has its highest intensities directly on the edge and drops to zero away from the edge, these gradient vectors provide directions for the active contour to move. When the gradient vectors are zero, the active contour will not move, and this is the correct behavior when the contour rests on the peak of the edge map itself. However, because the edge itself is defined by local operators, these gradient vectors will also be zero far away from the edge and therefore the active contour will not move toward the edge when initialized far away from the edge. Gradient vector flow (GVF) is the process that spatially extends the edge map gradient vectors, yielding a new vector field that contains information about the location of object edges throughout the entire image domain. GVF is defined as a diffusion process operating on the components of the input vector field. It is designed to balance the fidelity of the original vector field, so it is not changed too much, with a regularization that is intended to produce a smooth field on its output. Although GVF was designed originally for the purpose of segmenting objects using active contours attracted to edges, it has been since adapted and used for many alternative purposes. Some newer purposes including defining a continuous medial axis representation, regularizing image anisotropic diffusion algorithms, finding the centers of ribbon-like objects, constructing graphs for optimal surface segmentations, creating a shape prior, and much more. == Theory == The theory of GVF was originally described by Xu and Prince. Let f ( x , y ) {\displaystyle \textstyle f(x,y)} be an edge map defined on the image domain. For uniformity of results, it is important to restrict the edge map intensities to lie between 0 and 1, and by convention f ( x , y ) {\displaystyle \textstyle f(x,y)} takes on larger values (close to 1) on the object edges. The gradient vector flow (GVF) field is given by the vector field v ( x , y ) = [ u ( x , y ) , v ( x , y ) ] {\displaystyle \textstyle \mathbf {v} (x,y)=[u(x,y),v(x,y)]} that minimizes the energy functional In this equation, subscripts denote partial derivatives and the gradient of the edge map is given by the vector field ∇ f = ( f x , f y ) {\displaystyle \textstyle \nabla f=(f_{x},f_{y})} . Figure 1 shows an edge map, the gradient of the (slightly blurred) edge map, and the GVF field generated by minimizing E {\displaystyle \textstyle {\mathcal {E}}} . Equation 1 is a variational formulation that has both a data term and a regularization term. The first term in the integrand is the data term. It encourages the solution v {\displaystyle \textstyle \mathbf {v} } to closely agree with the gradients of the edge map since that will make v − ∇ f {\displaystyle \textstyle \mathbf {v} -\nabla f} small. However, this only needs to happen when the edge map gradients are large since v − ∇ f {\displaystyle \textstyle \mathbf {v} -\nabla f} is multiplied by the square of the length of these gradients. The second term in the integrand is a regularization term. It encourages the spatial variations in the components of the solution to be small by penalizing the sum of all the partial derivatives of v {\displaystyle \textstyle \mathbf {v} } . As is customary in these types of variational formulations, there is a regularization parameter μ > 0 {\displaystyle \textstyle \mu >0} that must be specified by the user in order to trade off the influence of each of the two terms. If μ {\displaystyle \textstyle \mu } is large, for example, then the resulting field will be very smooth and may not agree as well with the underlying edge gradients. Theoretical Solution. Finding v ( x , y ) {\displaystyle \textstyle \mathbf {v} (x,y)} to minimize Equation 1 requires the use of calculus of variations since v ( x , y ) {\displaystyle \textstyle \mathbf {v} (x,y)} is a function, not a variable. Accordingly, the Euler equations, which provide the necessary conditions for v {\displaystyle \textstyle \mathbf {v} } to be a solution can be found by calculus of variations, yielding where ∇ 2 {\displaystyle \textstyle \nabla ^{2}} is the Laplacian operator. It is instructive to examine the form of the equations in (2). Each is a partial differential equation that the components u {\displaystyle u} and v {\displaystyle v} of v {\displaystyle \mathbf {v} } must satisfy. If the magnitude of the edge gradient is small, then the solution of each equation is guided entirely by Laplace's equation, for example ∇ 2 u = 0 {\displaystyle \textstyle \nabla ^{2}u=0} , which will produce a smooth scalar field entirely dependent on its boundary conditions. The boundary conditions are effectively provided by the locations in the image where the magnitude of the edge gradient is large, where the solution is driven to agree more with the edge gradients. Computational Solutions. There are two fundamental ways to compute GVF. First, the energy function E {\displaystyle {\mathcal {E}}} itself (1) can be directly discretized and minimized, for example, by gradient descent. Second, the partial differential equations in (2) can be discretized and solved iteratively. The original GVF paper used an iterative approach, while later papers introduced considerably faster implementations such as an octree-based method, a multi-grid method, and an augmented Lagrangian method. In addition, very fast GPU implementations have been developed in Extensions and Advances. GVF is easily extended to higher dimensions. The energy function is readily written in a vector form as which can be solved by gradient descent or by finding and solving its Euler equation. Figure 2 shows an illustration of a three-dimensional GVF field on the edge map of a simple object (see ). The data and regularization terms in the integrand of the GVF functional can also be modified. A modification described in , called generalized gradient vector flow (GGVF) defines two scalar functions and reformulates the energy as While the choices g ( ∇ f | ) = μ {\displaystyle \textstyle g(\nabla f|)=\mu } and h ( | ∇ f | ) = | ∇ f | 2 {\displaystyle \textstyle h(|\nabla f|)=|\nabla f|^{2}} reduce GGVF to GVF, the alternative choices g ( | ∇ f | ) = exp ⁡ { − | ∇ f | / K } {\displaystyle \textstyle g(|\nabla f|)=\exp\{-|\nabla f|/K\}} and h ( ∇ f | ) = 1 − g ( | ∇ f | ) {\displaystyle \textstyle h(\nabla f|)=1-g(|\nabla f|)} , for K {\displaystyle K} a user-selected constant, can improve the tradeoff between the data term and its regularization in some applications. The GVF formulation has been further extended to vector-valued images in where a weighted structure tensor of a vector-valued image is used. A learning based probabilistic weighted GVF extension was proposed in to further improve the segmentation for images with severely cluttered textures or high levels of noise. The variational formulation of GVF has also been modified in motion GVF (MGVF) to incorporate object motion in an image sequence. Whereas the diffusion of GVF vectors from a conventional edge map acts in an isotropic manner, the formulation of MGVF incorporates the expected object motion between image frames. An alternative to GVF called vector field convolution (VFC) provides many of the advantages of GVF, has superior noise robustness, and can be computed very fast. The VFC field v V F C {\displaystyle \textstyle \mathbf {v} _{\mathrm {VFC} }} is defined as the convolution of the edge map f {\displaystyle f} with a vector field kernel k {\displaystyle \mathbf {k} } where The vector field kernel k {\displaystyle \textstyle \mathbf {k} } has vectors that always point toward the origin but their magnitudes, determined in detail by the function m {\displaystyle m} , decrease to zero with increasing distance from the origin. The beauty of VFC is that it can be computed very rapidly using a fast Fourier tra

    Read more →
  • Residual neural network

    Residual neural network

    A residual neural network (also referred to as a residual network or ResNet) is a deep learning architecture in which the layers learn residual functions with reference to the layer inputs. It was developed in 2015 for image recognition, and won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) of that year. As a point of terminology, "residual connection" refers to the specific architectural motif of x ↦ f ( x ) + x {\displaystyle x\mapsto f(x)+x} , where f {\displaystyle f} is an arbitrary neural network module. The motif had been used previously (see §History for details). However, the publication of ResNet made it widely popular for feedforward networks, appearing in neural networks that are seemingly unrelated to ResNet. The residual connection stabilizes the training and convergence of deep neural networks with hundreds of layers, and is a common motif in deep neural networks, such as transformer models (e.g., BERT, and GPT models such as ChatGPT), the AlphaGo Zero system, the AlphaStar system, and the AlphaFold system. == Mathematics == === Residual connection === In a multilayer neural network model, consider a (non-residual) subnetwork with a certain number of stacked layers (e.g., 2 or 3). Let H ( x ; α ) {\displaystyle H(x;\alpha )} denote the subnetwork. Suppose H ∗ {\displaystyle H^{}} is the desired optimal output of this subnetwork. Residual learning simply adds x {\displaystyle x} directly to the output, such that the optimal learned output now becomes be H ∗ − x {\displaystyle H^{}-x} , which is interpreted as a "residual" with respect to x {\displaystyle x} . The operation of "adding x {\displaystyle x} " is implemented via a "skip connection" that performs an identity mapping to connect the input of the subnetwork with its output. This connection is referred to as a "residual connection" in later work. Let F ( x ; α ) = H ( x ; a ) + x {\displaystyle F(x;\alpha )=H(x;a)+x} . The function F {\displaystyle F} is often represented by matrix multiplication interlaced with activation functions and normalization operations (e.g., batch normalization or layer normalization). As a whole, one of these subnetworks is referred to as a "residual block". A deep residual network is constructed by simply stacking these blocks. Long short-term memory (LSTM) has a memory mechanism that serves as a residual connection. In an LSTM without a forget gate, an input x t {\displaystyle x_{t}} is processed by a function F {\displaystyle F} and added to a memory cell c t {\displaystyle c_{t}} , resulting in c t + 1 = c t + F ( x t ) {\displaystyle c_{t+1}=c_{t}+F(x_{t})} . An LSTM with a forget gate essentially functions as a highway network. To stabilize the variance of the layers' inputs, it is recommended to replace the residual connections x + f ( x ) {\displaystyle x+f(x)} with x / L + f ( x ) {\displaystyle x/L+f(x)} , where L {\displaystyle L} is the total number of residual layers. === Projection connection === If the function F {\displaystyle F} is of type F : R n → R m {\displaystyle F:\mathbb {R} ^{n}\to \mathbb {R} ^{m}} where n ≠ m {\displaystyle n\neq m} , then F ( x ) + x {\displaystyle F(x)+x} is undefined. To handle this special case, a projection connection is used: y = F ( x ) + P ( x ) {\displaystyle y=F(x)+P(x)} where P {\displaystyle P} is typically a linear projection, defined by P ( x ) = M x {\displaystyle P(x)=Mx} where M {\displaystyle M} is a m × n {\displaystyle m\times n} matrix. The matrix is trained via backpropagation, as is any other parameter of the model. === Signal propagation === The introduction of identity mappings facilitates signal propagation in both forward and backward paths. ==== Forward propagation ==== If the output of the ℓ {\displaystyle \ell } -th residual block is the input to the ( ℓ + 1 ) {\displaystyle (\ell +1)} -th residual block (assuming no activation function between blocks), then the ( ℓ + 1 ) {\displaystyle (\ell +1)} -th input is: x ℓ + 1 = F ( x ℓ ) + x ℓ {\displaystyle x_{\ell +1}=F(x_{\ell })+x_{\ell }} Applying this formulation recursively, e.g.: x ℓ + 2 = F ( x ℓ + 1 ) + x ℓ + 1 = F ( x ℓ + 1 ) + F ( x ℓ ) + x ℓ {\displaystyle {\begin{aligned}x_{\ell +2}&=F(x_{\ell +1})+x_{\ell +1}\\&=F(x_{\ell +1})+F(x_{\ell })+x_{\ell }\end{aligned}}} yields the general relationship: x L = x ℓ + ∑ i = ℓ L − 1 F ( x i ) {\displaystyle x_{L}=x_{\ell }+\sum _{i=\ell }^{L-1}F(x_{i})} where L {\textstyle L} is the index of a residual block and ℓ {\textstyle \ell } is the index of some earlier block. This formulation suggests that there is always a signal that is directly sent from a shallower block ℓ {\textstyle \ell } to a deeper block L {\textstyle L} . ==== Backward propagation ==== The residual learning formulation provides the added benefit of mitigating the vanishing gradient problem to some extent. However, it is crucial to acknowledge that the vanishing gradient issue is not the root cause of the degradation problem, which is tackled through the use of normalization. To observe the effect of residual blocks on backpropagation, consider the partial derivative of a loss function E {\displaystyle {\mathcal {E}}} with respect to some residual block input x ℓ {\displaystyle x_{\ell }} . Using the equation above from forward propagation for a later residual block L > ℓ {\displaystyle L>\ell } : ∂ E ∂ x ℓ = ∂ E ∂ x L ∂ x L ∂ x ℓ = ∂ E ∂ x L ( 1 + ∂ ∂ x ℓ ∑ i = ℓ L − 1 F ( x i ) ) = ∂ E ∂ x L + ∂ E ∂ x L ∂ ∂ x ℓ ∑ i = ℓ L − 1 F ( x i ) {\displaystyle {\begin{aligned}{\frac {\partial {\mathcal {E}}}{\partial x_{\ell }}}&={\frac {\partial {\mathcal {E}}}{\partial x_{L}}}{\frac {\partial x_{L}}{\partial x_{\ell }}}\\&={\frac {\partial {\mathcal {E}}}{\partial x_{L}}}\left(1+{\frac {\partial }{\partial x_{\ell }}}\sum _{i=\ell }^{L-1}F(x_{i})\right)\\&={\frac {\partial {\mathcal {E}}}{\partial x_{L}}}+{\frac {\partial {\mathcal {E}}}{\partial x_{L}}}{\frac {\partial }{\partial x_{\ell }}}\sum _{i=\ell }^{L-1}F(x_{i})\end{aligned}}} This formulation suggests that the gradient computation of a shallower layer, ∂ E ∂ x ℓ {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{\ell }}}} , always has a later term ∂ E ∂ x L {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{L}}}} that is directly added. Even if the gradients of the F ( x i ) {\displaystyle F(x_{i})} terms are small, the total gradient ∂ E ∂ x ℓ {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{\ell }}}} resists vanishing due to the added term ∂ E ∂ x L {\textstyle {\frac {\partial {\mathcal {E}}}{\partial x_{L}}}} . == Variants of residual blocks == === Basic block === A basic block is the simplest building block studied in the original ResNet. This block consists of two sequential 3x3 convolutional layers and a residual connection. The input and output dimensions of both layers are equal. === Bottleneck block === A bottleneck block consists of three sequential convolutional layers and a residual connection. The first layer in this block is a 1×1 convolution for dimension reduction (e.g., to 1/2 of the input dimension); the second layer performs a 3×3 convolution; the last layer is another 1×1 convolution for dimension restoration. The models of ResNet-50, ResNet-101, and ResNet-152 are all based on bottleneck blocks. === Pre-activation block === The pre-activation residual block applies activation functions before applying the residual function F {\displaystyle F} . Formally, the computation of a pre-activation residual block can be written as: x ℓ + 1 = F ( ϕ ( x ℓ ) ) + x ℓ {\displaystyle x_{\ell +1}=F(\phi (x_{\ell }))+x_{\ell }} where ϕ {\displaystyle \phi } can be any activation (e.g. ReLU) or normalization (e.g. LayerNorm) operation. This design reduces the number of non-identity mappings between residual blocks, and allows an identity mapping directly from the input to the output. This design was used to train models with 200 to over 1000 layers, and was found to consistently outperform variants where the residual path is not an identity function. The pre-activation ResNet with 200 layers took 3 weeks to train for ImageNet on 8 GPUs in 2016. Since GPT-2, transformer blocks have been mostly implemented as pre-activation blocks. This is often referred to as "pre-normalization" in the literature of transformer models. == Applications == Originally, ResNet was designed for computer vision. All transformer architectures include residual connections. Indeed, very deep transformers cannot be trained without them. The original ResNet paper made no claim on being inspired by biological systems. However, later research has related ResNet to biologically-plausible algorithms. A study published in Science in 2023 disclosed the complete connectome of an insect brain (specifically that of a fruit fly larva). This study discovered "multilayer shortcuts" that resemble the skip connections in artificial neural networks, including ResNets. == History == === Previous work === Residual connections were noticed in neu

    Read more →
  • Neighborhood operation

    Neighborhood operation

    In computer vision and image processing a neighborhood operation is a commonly used class of computations on image data which implies that it is processed according to the following pseudo code: Visit each point p in the image data and do { N = a neighborhood or region of the image data around the point p result(p) = f(N) } This general procedure can be applied to image data of arbitrary dimensionality. Also, the image data on which the operation is applied does not have to be defined in terms of intensity or color, it can be any type of information which is organized as a function of spatial (and possibly temporal) variables in p. The result of applying a neighborhood operation on an image is again something which can be interpreted as an image, it has the same dimension as the original data. The value at each image point, however, does not have to be directly related to intensity or color. Instead it is an element in the range of the function f, which can be of arbitrary type. Normally the neighborhood N is of fixed size and is a square (or a cube, depending on the dimensionality of the image data) centered on the point p. Also the function f is fixed, but may in some cases have parameters which can vary with p, see below. In the simplest case, the neighborhood N may be only a single point. This type of operation is often referred to as a point-wise operation. == Examples == The most common examples of a neighborhood operation use a fixed function f which in addition is linear, that is, the computation consists of a linear shift invariant operation. In this case, the neighborhood operation corresponds to the convolution operation. A typical example is convolution with a low-pass filter, where the result can be interpreted in terms of local averages of the image data around each image point. Other examples are computation of local derivatives of the image data. It is also rather common to use a fixed but non-linear function f. This includes median filtering, and computation of local variances. The Nagao-Matsuyama filter is an example of a complex local neighbourhood operation that uses variance as an indicator of the uniformity within a pixel group. The result is similar to a convolution with a low-pass filter with the added effect of preserving sharp edges. There is also a class of neighborhood operations in which the function f has additional parameters which can vary with p: Visit each point p in the image data and do { N = a neighborhood or region of the image data around the point p result(p) = f(N, parameters(p)) } This implies that the result is not shift invariant. Examples are adaptive Wiener filters. == Implementation aspects == The pseudo code given above suggests that a neighborhood operation is implemented in terms of an outer loop over all image points. However, since the results are independent, the image points can be visited in arbitrary order, or can even be processed in parallel. Furthermore, in the case of linear shift-invariant operations, the computation of f at each point implies a summation of products between the image data and the filter coefficients. The implementation of this neighborhood operation can then be made by having the summation loop outside the loop over all image points. An important issue related to neighborhood operation is how to deal with the fact that the neighborhood N becomes more or less undefined for points p close to the edge or border of the image data. Several strategies have been proposed: Compute result only for points p for which the corresponding neighborhood is well-defined. This implies that the output image will be somewhat smaller than the input image. Zero padding: Extend the input image sufficiently by adding extra points outside the original image which are set to zero. The loops over the image points described above visit only the original image points. Border extension: Extend the input image sufficiently by adding extra points outside the original image which are set to the image value at the closest image point. The loops over the image points described above visit only the original image points. Mirror extension: Extend the image sufficiently much by mirroring the image at the image boundaries. This method is less sensitive to local variations at the image boundary than border extension. Wrapping: The image is tiled, so that going off one edge wraps around to the opposite side of the image. This method assumes that the image is largely homogeneous, for example a stochastic image texture without large textons.

    Read more →
  • Landmark point

    Landmark point

    In morphometrics, landmark point or shortly landmark is a point in a shape object in which correspondences between and within the populations of the object are preserved. In other disciplines, landmarks may be known as vertices, anchor points, control points, sites, profile points, 'sampling' points, nodes, markers, fiducial markers, etc. Landmarks can be defined either manually by experts or automatically by a computer program. There are three basic types of landmarks: anatomical landmarks, mathematical landmarks or pseudo-landmarks. An anatomical landmark is a biologically-meaningful point in an organism. Usually experts define anatomical points to ensure their correspondences within the same species. Examples of anatomical landmark in shape of a skull are the eye corner, tip of the nose, jaw, etc. Anatomical landmarks determine homologous parts of an organism, which share a common ancestry. Mathematical landmarks are points in a shape that are located according to some mathematical or geometrical property, for instance, a high curvature point or an extreme point. A computer program usually determines mathematical landmarks used for an automatic pattern recognition. Pseudo-landmarks are constructed points located between anatomical or mathematical landmarks. A typical example is an equally spaced set of points between two anatomical landmarks to get more sample points from a shape. Pseudo-landmarks are useful during shape matching, when the matching process requires a large number of points.

    Read more →
  • Cygwin

    Cygwin

    Cygwin ( SIG-win) is a free and open-source Unix-like environment and command-line interface (CLI) for Microsoft Windows. The project also provides a software repository containing open-source packages. Cygwin allows source code for Unix-like operating systems to be compiled and run on Windows. Cygwin provides native integration of Windows-based applications. The terminal emulator mintty is the default command-line interface provided to interact with the environment. The Cygwin installation directory layout mimics the root file system of Unix-like systems, with directories such as /bin, /home, /etc, /usr, and /var. Cygwin is released under the GNU Lesser General Public License version 3. It was originally developed by Cygnus Solutions, which was later acquired by Red Hat (now part of IBM), to port the GNU toolchain to Win32, including the GNU Compiler Suite. Rather than rewrite the tools to use the Win32 runtime environment, Cygwin implemented a POSIX-compatible environment in the form of a DLL. The brand motto is "Get that Linux feeling – on Windows", although Cygwin doesn't have Linux in it. == History == Cygwin began in 1995 as a project of Steve Chamberlain, a Cygnus engineer who observed that Windows NT and 95 used COFF as their object file format, and that GNU already included support for x86 and COFF, and the C library newlib. He thought that it would be possible to retarget GCC and produce a cross compiler generating executables that could run on Windows. A prototype was later developed. Chamberlain bootstrapped the compiler on a Windows system, to emulate Unix to let the GNU configure shell script run. Initially, Cygwin was called Cygwin32. When Microsoft registered the trademark Win32, the "32" was dropped to simply become Cygwin. In 1999, Cygnus offered Cygwin 1.0 as a commercial product. Subsequent versions have not been released, instead relying on continued open source releases. Geoffrey Noer was the project lead from 1996 to 1999. Christopher Faylor was lead from 1999 to 2004; he left Red Hat and became co-lead with Corinna Vinschen. Corinna Vinschen has been the project lead from mid-2014 to date (as of September, 2024). From June 23, 2016, the Cygwin library version 2.5.2 was licensed under the GNU Lesser General Public License (LGPL) version 3. == Description == Cygwin is provided in two versions: the full 64-bit version and a stripped-down 32-bit version, whose final version was released in 2022. Cygwin consists of a library that implements the POSIX system call API in terms of Windows system calls to enable the running of a large number of application programs equivalent to those on Unix systems, and a GNU development toolchain (including GCC and GDB). Programmers have ported the X Window System, K Desktop Environment 3, GNOME, Apache, and TeX. Cygwin permits installing inetd, syslogd, sshd, Apache, and other daemons as standard Windows services. Cygwin programs have full access to the Windows API and other Windows libraries. Cygwin programs are installed by running Cygwin's "setup" program, which downloads them from repositories on the Internet. The Cygwin API library is licensed under the GNU Lesser General Public License version 3 (or later), with an exception to allow linking to any free and open-source software whose license conforms to the Open Source Definition. Cygwin consists of two parts: A dynamic-link library in the form of a C standard library that acts as a compatibility layer for the POSIX API and A collection of software tools and applications that provide a Unix-like look and feel. Cygwin supports POSIX symbolic links, representing them as plain-text files with the system attribute set. Cygwin 1.5 represented them as Windows Explorer shortcuts, but this was changed for reasons of performance and POSIX correctness. Cygwin also recognises NTFS junction points and symbolic links and treats them as POSIX symbolic links, but it does not create them. The POSIX API for handling access control lists (ACLs) is supported. === Technical details === A Cygwin-specific version of the Unix mount command allows mounting Windows paths as "filesystems" in the Unix file space. Initial mount points can be configured in /etc/fstab, which has a format very similar to Unix systems, except that Windows paths appear in place of devices. Filesystems can be mounted in binary mode (by default), or in text mode, which enables automatic conversion between LF and CRLF endings (which only affects programs that open files without explicitly specifying text or binary mode). Cygwin 1.7 introduced comprehensive support for POSIX locales, and the UTF-8 Unicode encoding became the default. The fork system call for duplicating a process is fully implemented, but the copy-on-write optimization strategy could not be used. Cygwin's default user interface is the bash shell running in the mintty terminal emulator. The DLL also implements pseudo terminal (pty) devices, and Cygwin ships with a number of terminal emulators that are based on them, including rxvt/urxvt and xterm. The version of GCC that comes with Cygwin has various extensions for creating Windows DLLs, such as specifying whether a program is a windowing or console-mode program. Support for compiling programs that do not require the POSIX compatibility layer provided by the Cygwin DLL used to be included in the default GCC, but as of 2014, it is provided by cross-compilers contributed by the MinGW-w64 project. == Software packages == Cygwin's base package selection is approximately 100MB, containing the bash (interactive user) and dash (installation) shells and the core file and text manipulation utilities. Additional packages are available as optional installs from within the Cygwin "setup" program and package manager ("setup-x86_64.exe" – 64 bit). The Cygwin Ports project provided additional packages that were not available in the Cygwin distribution itself. Examples included GNOME, K Desktop Environment 3, MySQL database, and the PHP scripting language. Most ports have been adopted by volunteer maintainers as Cygwin packages, and Cygwin Ports are no longer maintained. Cygwin ships with GTK+ and Qt. The Cygwin/X project allows graphical Unix programs to display their user interfaces on the Windows desktop for both local and remote programs.

    Read more →
  • Stochastic parrot

    Stochastic parrot

    In machine learning, the term stochastic parrot is a metaphor that frames large language models as systems that statistically mimic text without real understanding. The word "stochastic" – from the ancient Greek "στοχαστικός" (stokhastikos, 'based on guesswork') – is a term from probability theory meaning "randomly determined". The word "parrot" refers to parrots' ability to mimic human speech. The term was introduced in a 2021 paper on AI ethics titled "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜" and authored by Timnit Gebru, Emily M. Bender, Angelina McMillan-Major, and Margaret Mitchell. The paper outlined possible risks associated with large language models (LLMs). In December 2020, it was the subject of a workplace dispute between Gebru (then co-leader of Google's Ethical Artificial Intelligence Team) and Google, which had requested the retraction of the paper. The incident culminated in Gebru's controversial departure from the company. The paper was later presented at the 2021 ACM Conference, and the term "stochastic parrot" has seen widespread use in academic research concerning generative AI and LLMs. The term has been interpreted negatively as an insult towards AI. == Background == Timnit Gebru is an AI ethics researcher, Emily M. Bender is a linguist specializing in computational linguistics, and Margaret Mitchell is a computer scientist specializing in algorithmic bias. Gebru had joined Google in 2018, where she co-led a team on the ethics of artificial intelligence with Mitchell. In late 2020, the paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜" was co-written by Gebru and five other researchers, four of whom were Google employees. The paper argues that large language models (LLMs) present significant risks such as environmental and financial costs, inscrutability leading to unknown dangerous biases, and potential for deception as LLMs do not understand the concepts underlying what they learn. The paper states that LLMs are "stitching together sequences of linguistic forms ... observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning." Therefore, they are labeled "stochastic parrots". === Dismissal of Gebru by Google === After the paper was submitted for consideration to the 2021 ACM Conference, Google requested that Gebru either retract the paper from the conference or remove the names of Google employees from it. Gebru refused to do so without further discussion, and emailed Google Research vice president Megan Kacholia that if the company could not explain the request for retraction and address other concerns regarding similar projects, she would plan to resign after a transition period, stating that they could "work on a last date". The following day, on December 2, 2020, Gebru received an email saying that Google was "accepting her resignation". Her abrupt firing sparked protests by Google employees and negative publicity for the company. == Usage == The phrase has been used by AI skeptics to signify that LLMs lack understanding of the meaning of their outputs. Sam Altman, CEO of OpenAI, used the term shortly after the release of ChatGPT in December 2022, tweeting "i am a stochastic parrot, and so r u". The term was nominated as the 2023 AI-related Word of the Year by the American Dialect Society. == Debate == Some LLMs, such as ChatGPT, have become capable of interacting with users in convincingly human-like conversations. The development of these new systems has deepened the discussion of the extent to which LLMs understand or are simply "parroting". According to machine learning researchers Lindholm, Wahlström, Lindsten, and Schön, the term "stochastic parrot" highlights two vital limitations of LLMs: LLMs are limited by the data they are trained on and are simply stochastically repeating contents of datasets. Because they are just making up outputs based on training data, LLMs do not understand if they are saying something incorrect or inappropriate. Lindholm et al. noted that, with poor quality datasets and other limitations, a learning machine might produce results that are "dangerously wrong". === Subjective experience === In the mind of a human being, words and language correspond to things one has experienced. For LLMs, according to proponents of the theory, words correspond only to other words and patterns of usage fed into their training data. Proponents of the idea of stochastic parrots thus conclude that statements about LLMs are due to "the human tendency to attribute meaning to text", and claim this occurs despite the LLMs not actually understanding language. === Fine-tuning === Kelsey Piper argued that the claim that LLMs are stochastic parrots or mere "next-token predictors" focuses on pre-training, ignoring that modern LLMs are also fine-tuned to follow instructions and to prefer accurate answers. === Hallucinations and mistakes === The tendency of LLMs to pass off false information as fact is held as support. Called hallucinations or confabulations, LLMs will occasionally synthesize information that matches some pattern. LLMs may fail to distinguish fact and fiction, which leads to the claim that they can't connect words to a comprehension of the world, as humans do. Furthermore, LLMs may fail to decipher complex or ambiguous grammar cases that rely on understanding the meaning of language. For example: The wet newspaper that fell down off the table is my favorite newspaper. But now that my favorite newspaper fired the editor I might not like reading it anymore. Can I replace 'my favorite newspaper' by 'the wet newspaper that fell down off the table' in the second sentence? GPT-4, an LLM released in March 2023, responded yes, not understanding that the meaning of "newspaper" is different in these two contexts; it is first an object and second an institution. === Benchmarks and experiments === One argument against the hypothesis that LLMs are stochastic parrot is their results on benchmarks for reasoning, common sense and language understanding. In 2023, some LLMs have shown good results on many language understanding tests, such as the Super General Language Understanding Evaluation (SuperGLUE). GPT-4 scored in the >90th-percentile on the Uniform Bar Examination and achieved 93% accuracy on the MATH benchmark of high-school Olympiad problems, results that exceed rote pattern-matching expectations. Such tests, and the smoothness of many LLM responses, help as many as 51% of AI professionals believe they can truly understand language with enough data, according to a 2022 survey. === Expert rebuttals === Some AI researchers dispute the notion that LLMs merely "parrot" their training data. Geoffrey Hinton, a pioneering figure in neural networks, counters that the metaphor misunderstands the prerequisite for accurate language prediction. He argues that "to predict the next word accurately, you have to understand the sentence", a view he presented on 60 Minutes in 2023. From this perspective, understanding is not an alternative to statistical prediction, but rather an emergent property required to perform it effectively at scale. Hinton also uses logical puzzles to demonstrate that LLMs actually understand language. A 2024 Scientific American investigation described a closed Berkeley workshop where state-of-the-art models solved novel tier-4 mathematics problems and produced coherent proofs, indicating reasoning abilities beyond memorization. The GPT-4 Technical Report showed human-level results on professional and academic exams (e.g., the Uniform Bar Exam and USMLE), challenging the "parrot" characterization. Anthropic conducted mechanistic interpretability research on Claude, using attribution graphs to identify circuits. The research showed how the LLM processes information via chains of fuzzy logical inference, and indicated an ability to plan ahead. They found that Claude 3.5 Haiku "employs remarkably general abstractions", forms "internally generated plans for its future outputs" and "works backwards from its longer-term goals". They noted that "The mechanisms of the model can apparently only be faithfully described using an overwhelmingly large causal graph." They also found that the model includes "mechanisms that could underlie a simple form of metacognition", in that it "thinks about" the level of its own knowledge before reaching its answer. === Interpretability === Another line of evidence against the 'stochastic parrot' claim comes from mechanistic interpretability, a research field dedicated to reverse-engineering LLMs to understand their internal workings. Rather than only observing the model's input-output behavior, these techniques probe the model's internal activations, which can be used to determine if they contain structured representations of the world. The goal is to investigate whether LLMs are merely manipulating surface statistics or if t

    Read more →
  • Talking Angela

    Talking Angela

    Talking Angela is a mobile game (formerly a chatbot), developed by Slovenian studio Outfit7 as part of the Talking Tom & Friends series. It was released on 13 November 2012 and December 2012 for iPhone, iPod and iPad, January 2013 for Android, and January 2014 for Google Play. The game's successor, the My Talking Angela game, was released in December 2014. The game takes place in a café in Paris and allows players to interact with Angela, an anthropomorphic white cat in different ways. Players can use coins to purchase makeup, accessories and items, as well as drinks that will trigger different visual effects. The fortune cookie button causes Angela to read out a fortune cookie, while the bird icon will prompt birds to fly around the screen, or have Angela feed them. Players can also pet or poke Angela, as well the café's sign. Prior to their removal, the game featured a chat system and a camera button. Users can engage in conversations with Angela, ask for quizzes or initiate a short snippet of the song "That's Falling In Love". If the player was to type in "Who is an idiot?", Angela would respond with a random swear word. Additionally, inquiring Angela about sexual topics would cause her to reply with "Do you want to talk about sex?", though she will quickly change the topic regardless of what the player writes next. A hoax claiming that Angela's eyes were hidden cameras that enabled hackers or paedophiles to watch children was spread. Despite the claims, Snopes and The Guardian found no evidence. Due to the hoax, Angela received a blue dress, as well as an altered eye asset with a different reflection, and later the chat and camera functions were removed altogether. == Hoaxes == In February 2014, Talking Angela was the subject of an Internet hoax alleging that the application was a front for child predators to exploit children. The rumor, which was widely circulated on Facebook and various websites claiming to be dedicated to parenting, claims that a sinister sexual predator or hacker, asked children for private personal information using the game's text-chat feature. Other versions of the rumour even attributed the disappearance of a child to the game; one news report claimed that a seven year old boy disappeared after downloading the app. Another variation included that it was run by a paedophile ring, citing a man that could be seen in Angela's eyes. The app's developers, Outfit7, later gave a statement refuting the hoaxes. The hoax was eventually debunked by Snopes, a fact-checking website. The site's owners, Barbara and David Mikkelson, reported that they had tried to "prompt" it to give responses asking for private information, but were unsuccessful, even when asking it explicitly sexual questions. While it is true that, in the game with child mode off, Angela does ask for the user's name, age and personal preferences to determine conversation topics, Outfit7 has said that this information is all "anonymized" and all personal information is removed from it. It is also impossible for a person to take control of what Angela says in the game, since the game is based on chatbot software. When the mode was turned on, the chat feature was disabled, meaning no personal questions could be asked. In 2015, the hoax was revived on Facebook, which prompted online security company Sophos and The Guardian to debunk it again. Sophos employee Paul Ducklin wrote that the message being posted on Facebook promoting the hoax was "close to 600 rambling, repetitious words, despite claiming at the start that it didn't have words to describe the situation. It's ill-written, and borders on being illiterate and incomprehensible." Bruce Wilcox, one of the game's programmers, attributed the hoax's popularity to the fact that the chatbot program in Talking Angela aimed to sound realistic. Concern was raised that the game's child mode may have been too easy for children to turn off. It allowed them to purchase "coins", premium currency in the game, via iTunes, and enabled the chat feature. While not "connecting your children to paedophiles", this still raised concerns according to The Guardian. === Impact === The scare significantly boosted the game's popularity, and was credited with helping the app enter the top 10 free iPhone apps soon after the hoax became widely known in February 2015,In the truth the reason there is a man in Angela’s eyes is because of pareidoila, the ability to see through diamonds and other minerals and water bodies and shiny objects,which is the reason why players notice a man in her eyes,The truth is that being Angela’s eyes simply serve as a reflective surface,Because of the low quality of this reflection the reflection was mistaken for a humanoid figure. oref>Smith, Josh (19 February 2014). "Talking Angela App Scare Skyrockets App to Top of Charts". GottaBeMobile.com. Archived from the original on 2 April 2016. Retrieved 10 May 2014. and third most popular for all iPhone apps at the start of the following month. In 2016, Outfit7 removed the chat feature along with the camera function from the app due to this controversy, though this decision was met with criticism.

    Read more →
  • Night Sky (app)

    Night Sky (app)

    Night Sky (app) is an application developed and published by indie studio iCandi Apps Ltd. from the UK. Night Sky is a stargazing reference app, where the user can explore a virtual representation of the night sky to identify stars, planets, constellations and satellites. The app is developed specifically for iOS, tvOS and watchOS devices. Night Sky was first released on November 1, 2011 for iOS, and has had multiple updates since launch. Night Sky was mentioned in the September 2016 Apple Keynote during the Apple Watch Series 2 announcement. In October 2016, Night Sky was featured as the Free App of The Week on the Apple App Store. == Reception == Night Sky was featured in Apple's 'Best of 2012' and has also been pre-installed onto iPads in Apple retail stores worldwide.

    Read more →
  • WikiTok

    WikiTok

    WikiTok is a web application that shows random Wikipedia articles in an infinite scrolling format similar to TikTok. Created by New York-based civil and software engineer Isaac Gemal in February 2025, the application aims to provide an "anti-algorithmic" alternative to traditional social media platforms while combating doomscrolling. == Development == WikiTok was developed on February 3, 2025, after Gemal saw a viral tweet from developer Tyler Angert proposing "all of wikipedia on a single, scrollable page". Gemal created the initial prototype in approximately two hours, using the Claude LLM and Cursor coding editor, completing it by 2 a.m. that same night. The application was built with React 18, TypeScript, Tailwind CSS, and Vite, consisting of only a few hundred lines of code with no backend infrastructure. Gemal has made the project open-source, with the code publicly available on GitHub. == Functionality == WikiTok serves users completely random Wikipedia article snippets from Wikipedia's nearly 9.5 million entries. Each article preview includes a full-screen image from Wikipedia and a short text excerpt. Users can click "Read More" to access the full Wikipedia article, or continue scrolling to see the next random entry. The application functions as a progressive web app that can be downloaded as a hybrid between an app and a website. It works on both mobile and desktop browsers, and is designed to be responsive whether users scroll with their thumb on mobile devices or use a cursor on desktop computers. As of 2025, WikiTok supports 14 different languages for article translation. == Philosophy == Gemal has explicitly positioned WikiTok as "anti-algorithmic," resisting numerous requests to implement personalized content algorithms. In an interview with Business Insider, Gemal stated: "I have had plenty of people message me and even make issues on my GitHub asking for some insane crazy WikiTok algorithm... we're already ruled by ruthless, opaque algorithms in our everyday life; why can't we just have one little corner in the world without them?" The application aims to combat "doomscrolling" – the practice of mindlessly consuming negative news online that can cause anxiety, helplessness, and anger. Unlike traditional social media platforms, WikiTok deliberately excludes algorithms, advertisements, and user tracking.

    Read more →
  • Natural Language Toolkit

    Natural Language Toolkit

    The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for English written in the Python programming language. It supports classification, tokenization, stemming, tagging, parsing, and semantic reasoning functionalities. It was developed by Steven Bird and Edward Loper in the Department of Computer and Information Science at the University of Pennsylvania. NLTK includes graphical demonstrations and sample data. It is accompanied by a book that explains the underlying concepts behind the language processing tasks supported by the toolkit, plus a cookbook. NLTK is intended to support research and teaching in NLP or closely related areas, including empirical linguistics, cognitive science, artificial intelligence, information retrieval, and machine learning. NLTK has been used successfully as a teaching tool, as an individual study tool, and as a platform for prototyping and building research systems. == Library highlights == Discourse representation Lexical analysis: Word and text tokenizer n-gram and collocations Part-of-speech tagger Tree model and Text chunker for capturing Named-entity recognition

    Read more →
  • OrCam device

    OrCam device

    OrCam devices such as OrCam MyEye are portable, artificial vision devices that allow visually impaired people to understand text and identify objects through audio feedback, describing what they are unable to see. Reuters described an important part of how it works as "a wireless smartcamera" which, when attached outside eyeglass frames, can read and verbalize text, and also supermarket barcodes. This information is converted to spoken words and entered "into the user’s ear." Face-recognition is also part of OrCam's feature set. == Devices == OrCam Technologies Ltd has created three devices; OrCam MyEye 2.0, OrCam MyEye 1, and OrCam MyReader. OrCam My Eye 2.0: OrCam debuted the second-generation model, the OrCam MyEye 2.0 in December 2017. About the size of a finger, the MyEye 2.0 is battery-powered, and has been compressed into a self-contained device. The device snaps onto any eyeglass frame magnetically. Orcam 2.0 is small and light (22.5 grams/0.8 ounces) with functionality to restore independence to the visually impaired. It comes in two versions. The basic model can read text, and a more advanced one adds features such as face recognition and barcode reading. As of July 2023, the retail cost is between $4000 and $6000 (USD). == Clinical Studies == JAMA Ophthalmology: In 2016 JAMA Ophthalmology conducted a study involving 12 legally blind participants to evaluate the usefulness of a portable artificial vision device (OrCam) for patients with low vision. The results showed that the OrCam device improved the patient's ability to perform tasks simulating those of daily living, such as reading a message on an electronic device, a newspaper article or a menu. Wills Eye: Wills Eye was a clinical study designed to measure the impact of the OrCam device on the quality of life of patients with End-stage Glaucoma. The conclusion was that OrCam, a novel artificial vision device using a mini-camera mounted on eyeglasses, allowed legally blind patients with end-stage glaucoma to read independently, subsequently improving their quality of life. == Employee testing == The New York Times described how a pre-release OrCam device was used by a Coloboma-impaired employee of the device's developer in 2013 for grocery shopping. It was the small size of the prototype rather than the functionality that gave her added mobility in an Israeli store's aisles. Added life-enhancement was described: "to both recognize and speak .. bus numbers .. traffic lights." == Social aspects == In contrast to an early version of Google Glass, which "failed ... because .. Glass wearers were ..mocked", early OrCam devices used designs that "clip unobtrusively on your shirt or perhaps your belt." In addition, it does not record sounds or images, what was called "the privacy puzzle that stumped Google. One 2018 technology reviewer wrote that he wished it had a headphone jack "so it would be less disruptive in places where others are working." An attempt was made to use bone conduction. == USA introduction == In 2018 a team headed by New York Assemblyman Dov Hikind introduced use of OrCam devices to ten individuals screened for what he termed "new Israeli technology that really makes a difference to the blind." Although not the first USA success, it was more focused than a publicly funded project that was authorized in 2016 by a California government agency. Also in 2016 the Chicago Lighthouse for the Blind demonstrated its use. == Technology == In the area of hardware, miniaturization has been quite important, but one major area, software, was mentioned by Assemblyman Hikind, and reported by The Times of Israel is the "AI-driven algorithms" that "reports .. how many people are in a room. In addition to reading printed text, it can also aid in "seeing" what is on a television or computer screen. Although OrCam can't help with handwritten information, it can reuse information, the basis of recognizing "US currency, and even faces." === Features === While early language support was for English, French, German, Hebrew and Spanish, others now available include Danish, Dutch, Finnish, Italian, Norwegian, Portuguese and Swedish. == History == OrCam Technologies Ltd was founded in 2010 by Professor Amnon Shashua and Ziv Aviram. Before co-founding OrCam, the two in 1999 co-founded Mobileye, an Israeli company that develops vision-based advanced driver-assistance systems (ADAS) providing warnings for collision prevention and mitigation, which was acquired by Intel for $15.3 billion in 2017. OrCam launched OrCam MyEye in 2013 after years of development and testing, and began selling it commercially in 2015. In its early years, the company raised $22 million, $6 million of which came from Intel Capital. By 2014, Intel, which was also investing in Google Glass, had invested $15 million in Orcam. In March 2017, OrCam had raised $41 million in capital, making it worth $600 million. === Marketing === One outcome of initial marketing in the USA was that they "reached a deal with the California Department of Rehabilitation, ...qualifying blind and visually impaired state residents." == OrCam Technologies Ltd == OrCam Technologies Ltd. is the Israeli-based company producing these OrCam devices, which are wearable artificial intelligence space. The company develops and manufactures assistive technology devices for individuals who are visually impaired, partially sighted, blind, print disabilities, or have other disabilities. OrCam headquarters is located in Jerusalem, operating under the company name OrCam Technologies Ltd. OrCam has over 150 employees, is headquartered in Jerusalem, and has offices in New York, Toronto, and London. == Awards == 2018 Last Gadget Standing Winner 2018 CES Innovation Awards Honoree in Accessible Tech 2017 NAIDEX Innovation Award 2016 Louise Braille Corporate Recognition Award 2016 Silmo-d-Or Award

    Read more →
  • Gradient vector flow

    Gradient vector flow

    Gradient vector flow (GVF), a computer vision framework introduced by Chenyang Xu and Jerry L. Prince, is the vector field that is produced by a process that smooths and diffuses an input vector field. It is usually used to create a vector field from images that points to object edges from a distance. It is widely used in image analysis and computer vision applications for object tracking, shape recognition, segmentation, and edge detection. In particular, it is commonly used in conjunction with active contour model. == Background == Finding objects or homogeneous regions in images is a process known as image segmentation. In many applications, the locations of object edges can be estimated using local operators that yield a new image called an edge map. The edge map can then be used to guide a deformable model, sometimes called an active contour or a snake, so that it passes through the edge map in a smooth way, therefore defining the object itself. A common way to encourage a deformable model to move toward the edge map is to take the spatial gradient of the edge map, yielding a vector field. Since the edge map has its highest intensities directly on the edge and drops to zero away from the edge, these gradient vectors provide directions for the active contour to move. When the gradient vectors are zero, the active contour will not move, and this is the correct behavior when the contour rests on the peak of the edge map itself. However, because the edge itself is defined by local operators, these gradient vectors will also be zero far away from the edge and therefore the active contour will not move toward the edge when initialized far away from the edge. Gradient vector flow (GVF) is the process that spatially extends the edge map gradient vectors, yielding a new vector field that contains information about the location of object edges throughout the entire image domain. GVF is defined as a diffusion process operating on the components of the input vector field. It is designed to balance the fidelity of the original vector field, so it is not changed too much, with a regularization that is intended to produce a smooth field on its output. Although GVF was designed originally for the purpose of segmenting objects using active contours attracted to edges, it has been since adapted and used for many alternative purposes. Some newer purposes including defining a continuous medial axis representation, regularizing image anisotropic diffusion algorithms, finding the centers of ribbon-like objects, constructing graphs for optimal surface segmentations, creating a shape prior, and much more. == Theory == The theory of GVF was originally described by Xu and Prince. Let f ( x , y ) {\displaystyle \textstyle f(x,y)} be an edge map defined on the image domain. For uniformity of results, it is important to restrict the edge map intensities to lie between 0 and 1, and by convention f ( x , y ) {\displaystyle \textstyle f(x,y)} takes on larger values (close to 1) on the object edges. The gradient vector flow (GVF) field is given by the vector field v ( x , y ) = [ u ( x , y ) , v ( x , y ) ] {\displaystyle \textstyle \mathbf {v} (x,y)=[u(x,y),v(x,y)]} that minimizes the energy functional In this equation, subscripts denote partial derivatives and the gradient of the edge map is given by the vector field ∇ f = ( f x , f y ) {\displaystyle \textstyle \nabla f=(f_{x},f_{y})} . Figure 1 shows an edge map, the gradient of the (slightly blurred) edge map, and the GVF field generated by minimizing E {\displaystyle \textstyle {\mathcal {E}}} . Equation 1 is a variational formulation that has both a data term and a regularization term. The first term in the integrand is the data term. It encourages the solution v {\displaystyle \textstyle \mathbf {v} } to closely agree with the gradients of the edge map since that will make v − ∇ f {\displaystyle \textstyle \mathbf {v} -\nabla f} small. However, this only needs to happen when the edge map gradients are large since v − ∇ f {\displaystyle \textstyle \mathbf {v} -\nabla f} is multiplied by the square of the length of these gradients. The second term in the integrand is a regularization term. It encourages the spatial variations in the components of the solution to be small by penalizing the sum of all the partial derivatives of v {\displaystyle \textstyle \mathbf {v} } . As is customary in these types of variational formulations, there is a regularization parameter μ > 0 {\displaystyle \textstyle \mu >0} that must be specified by the user in order to trade off the influence of each of the two terms. If μ {\displaystyle \textstyle \mu } is large, for example, then the resulting field will be very smooth and may not agree as well with the underlying edge gradients. Theoretical Solution. Finding v ( x , y ) {\displaystyle \textstyle \mathbf {v} (x,y)} to minimize Equation 1 requires the use of calculus of variations since v ( x , y ) {\displaystyle \textstyle \mathbf {v} (x,y)} is a function, not a variable. Accordingly, the Euler equations, which provide the necessary conditions for v {\displaystyle \textstyle \mathbf {v} } to be a solution can be found by calculus of variations, yielding where ∇ 2 {\displaystyle \textstyle \nabla ^{2}} is the Laplacian operator. It is instructive to examine the form of the equations in (2). Each is a partial differential equation that the components u {\displaystyle u} and v {\displaystyle v} of v {\displaystyle \mathbf {v} } must satisfy. If the magnitude of the edge gradient is small, then the solution of each equation is guided entirely by Laplace's equation, for example ∇ 2 u = 0 {\displaystyle \textstyle \nabla ^{2}u=0} , which will produce a smooth scalar field entirely dependent on its boundary conditions. The boundary conditions are effectively provided by the locations in the image where the magnitude of the edge gradient is large, where the solution is driven to agree more with the edge gradients. Computational Solutions. There are two fundamental ways to compute GVF. First, the energy function E {\displaystyle {\mathcal {E}}} itself (1) can be directly discretized and minimized, for example, by gradient descent. Second, the partial differential equations in (2) can be discretized and solved iteratively. The original GVF paper used an iterative approach, while later papers introduced considerably faster implementations such as an octree-based method, a multi-grid method, and an augmented Lagrangian method. In addition, very fast GPU implementations have been developed in Extensions and Advances. GVF is easily extended to higher dimensions. The energy function is readily written in a vector form as which can be solved by gradient descent or by finding and solving its Euler equation. Figure 2 shows an illustration of a three-dimensional GVF field on the edge map of a simple object (see ). The data and regularization terms in the integrand of the GVF functional can also be modified. A modification described in , called generalized gradient vector flow (GGVF) defines two scalar functions and reformulates the energy as While the choices g ( ∇ f | ) = μ {\displaystyle \textstyle g(\nabla f|)=\mu } and h ( | ∇ f | ) = | ∇ f | 2 {\displaystyle \textstyle h(|\nabla f|)=|\nabla f|^{2}} reduce GGVF to GVF, the alternative choices g ( | ∇ f | ) = exp ⁡ { − | ∇ f | / K } {\displaystyle \textstyle g(|\nabla f|)=\exp\{-|\nabla f|/K\}} and h ( ∇ f | ) = 1 − g ( | ∇ f | ) {\displaystyle \textstyle h(\nabla f|)=1-g(|\nabla f|)} , for K {\displaystyle K} a user-selected constant, can improve the tradeoff between the data term and its regularization in some applications. The GVF formulation has been further extended to vector-valued images in where a weighted structure tensor of a vector-valued image is used. A learning based probabilistic weighted GVF extension was proposed in to further improve the segmentation for images with severely cluttered textures or high levels of noise. The variational formulation of GVF has also been modified in motion GVF (MGVF) to incorporate object motion in an image sequence. Whereas the diffusion of GVF vectors from a conventional edge map acts in an isotropic manner, the formulation of MGVF incorporates the expected object motion between image frames. An alternative to GVF called vector field convolution (VFC) provides many of the advantages of GVF, has superior noise robustness, and can be computed very fast. The VFC field v V F C {\displaystyle \textstyle \mathbf {v} _{\mathrm {VFC} }} is defined as the convolution of the edge map f {\displaystyle f} with a vector field kernel k {\displaystyle \mathbf {k} } where The vector field kernel k {\displaystyle \textstyle \mathbf {k} } has vectors that always point toward the origin but their magnitudes, determined in detail by the function m {\displaystyle m} , decrease to zero with increasing distance from the origin. The beauty of VFC is that it can be computed very rapidly using a fast Fourier tra

    Read more →
  • Stevens Award

    Stevens Award

    The Stevens Award is a software engineering lecture award given by the Reengineering Forum, an industry association. The international Stevens Award was created to recognize outstanding contributions to the literature or practice of methods for software and systems development. The first award was given in 1995. The presentations focus on the current state of software methods and their direction for the future. This award lecture is named in memory of Wayne Stevens (1944-1993), a consultant, author, pioneer, and advocate of the practical application of software methods and tools. The Stevens Award and lecture is managed by the Reengineering Forum. The award was founded by International Workshop on Computer Aided Software Engineering (IWCASE), an international workshop association of users and developers of computer-aided software engineering (CASE) technology, which merged into The Reengineering Forum. Wayne Stevens was a charter member of the IWCASE executive board. == Recipients == 1995: Tony Wasserman 1996: David Harel 1997: Michael Jackson 1998: Thomas McCabe 1999: Tom DeMarco 2000: Gerald Weinberg 2001: Peter Chen 2002: Cordell Green 2003: Manny Lehman 2004: François Bodart 2005: Mary Shaw, Jim Highsmith 2006: Grady Booch 2007: Nicholas Zvegintzov 2008: Harry Sneed 2009: Larry Constantine 2010: Peter Aiken 2011: Jared Spool, Barry Boehm 2012: Philip Newcomb 2013: Jean-Luc Hainaut 2014: François Coallier 2015: Pierre Bourque

    Read more →
  • Microsoft Whiteboard

    Microsoft Whiteboard

    Microsoft Whiteboard is a free multi-platform application, as well as an online service and a feature in Microsoft Teams, which simulates a virtual whiteboard and enables real-time collaboration between users. == Overview and features == Microsoft Whiteboard allows users to draw on a virtual whiteboard using input methods such as a stylus pen or a mouse and keyboard, and write down notes, draw connections between shareable ideas, and interact in real time. Microsoft Whiteboard is available to download on the following platforms and devices: Microsoft Windows (on Windows 10 or above) Android Apple iOS Surface Hub devices It is also available on the web and as a feature in Microsoft Teams. Microsoft Whiteboard allows users with Microsoft accounts to view, edit, and share whiteboards using the provided tools and options. The feature set includes tools for drawing, shapes, and media. Drawing in Microsoft Whiteboard is called inking. It works both on mobile devices and computers. The inking toolbar has customizable pencils, a ruler, a highlighter, an eraser, and an object selector. Whiteboard can recognize shapes drawn by hand and straighten them. Holding the Shift key on a computer while inking draws straight lines. Microsoft Whiteboard has keyboard shortcuts for some functions. Additional features include inserting sticky notes, text boxes, stickers, as well as images. Grid lines and colors are adjustable. Different templates can be inserted into the whiteboard. Users can also share their reactions. A feature limited to boards created in Microsoft Teams, is the ability to make them read-only; other participants from the meeting cannot edit them. == Reviews == PC Magazine gave Microsoft Whiteboard a score of 3.5 out of 5, praising the app's free availability and plentiful templates. It compared it to other, paid whiteboarding solutions, and concluded that Microsoft offers the best free one. Some of the cons, described by PCMag, include the inability to view boards without a Microsoft account and the inability to create custom templates.

    Read more →
  • Transfer learning

    Transfer learning

    Transfer learning (TL) is a technique in machine learning (ML) in which knowledge learned from a task is re-used in order to boost performance on a related task. For example, for image classification, knowledge gained while learning to recognize cars could be applied when trying to recognize trucks. This topic is related to the psychological literature on transfer of learning, although practical ties between the two fields are limited. Reusing or transferring information from previously learned tasks to new tasks has the potential to significantly improve learning efficiency. Since transfer learning makes use of training with multiple objective functions it is related to cost-sensitive machine learning and multi-objective optimization. == History == In 1976, Bozinovski and Fulgosi published a paper addressing transfer learning in neural network training. The paper gives a mathematical and geometrical model of the topic. In 1981, a report considered the application of transfer learning to a dataset of images representing letters of computer terminals, experimentally demonstrating positive and negative transfer learning. In 1992, Lorien Pratt formulated the discriminability-based transfer (DBT) algorithm. By 1998, the field had advanced to include multi-task learning, along with more formal theoretical foundations. Influential publications on transfer learning include the book Learning to Learn in 1998, a 2009 survey and a 2019 survey. Ng said in his NIPS 2016 tutorial that TL would become the next driver of machine learning commercial success after supervised learning. In the 2020 paper, "Rethinking Pre-Training and self-training", Zoph et al. reported that pre-training can hurt accuracy, and advocate self-training instead. == Definition == The definition of transfer learning is given in terms of domains and tasks. A domain D {\displaystyle {\mathcal {D}}} consists of: a feature space X {\displaystyle {\mathcal {X}}} and a marginal probability distribution P ( X ) {\displaystyle P(X)} , where X = { x 1 , . . . , x n } ∈ X {\displaystyle X=\{x_{1},...,x_{n}\}\in {\mathcal {X}}} . Given a specific domain, D = { X , P ( X ) } {\displaystyle {\mathcal {D}}=\{{\mathcal {X}},P(X)\}} , a task consists of two components: a label space Y {\displaystyle {\mathcal {Y}}} and an objective predictive function f : X → Y {\displaystyle f:{\mathcal {X}}\rightarrow {\mathcal {Y}}} . The function f {\displaystyle f} is used to predict the corresponding label f ( x ) {\displaystyle f(x)} of a new instance x {\displaystyle x} . This task, denoted by T = { Y , f ( x ) } {\displaystyle {\mathcal {T}}=\{{\mathcal {Y}},f(x)\}} , is learned from the training data consisting of pairs { x i , y i } {\displaystyle \{x_{i},y_{i}\}} , where x i ∈ X {\displaystyle x_{i}\in {\mathcal {X}}} and y i ∈ Y {\displaystyle y_{i}\in {\mathcal {Y}}} . Given a source domain D S {\displaystyle {\mathcal {D}}_{S}} and learning task T S {\displaystyle {\mathcal {T}}_{S}} , a target domain D T {\displaystyle {\mathcal {D}}_{T}} and learning task T T {\displaystyle {\mathcal {T}}_{T}} , where D S ≠ D T {\displaystyle {\mathcal {D}}_{S}\neq {\mathcal {D}}_{T}} , or T S ≠ T T {\displaystyle {\mathcal {T}}_{S}\neq {\mathcal {T}}_{T}} , transfer learning aims to help improve the learning of the target predictive function f T ( ⋅ ) {\displaystyle f_{T}(\cdot )} in D T {\displaystyle {\mathcal {D}}_{T}} using the knowledge in D S {\displaystyle {\mathcal {D}}_{S}} and T S {\displaystyle {\mathcal {T}}_{S}} . == Applications == Algorithms for transfer learning are available in Markov logic networks and Bayesian networks. Transfer learning has been applied to cancer subtype discovery, building utilization, general game playing, text classification, digit recognition, medical imaging and spam filtering. In 2020, it was discovered that, due to their similar physical natures, transfer learning is possible between electromyographic (EMG) signals from the muscles and classifying the behaviors of electroencephalographic (EEG) brainwaves, from the gesture recognition domain to the mental state recognition domain. It was noted that this relationship worked in both directions, showing that electroencephalographic can likewise be used to classify EMG. The experiments noted that the accuracy of neural networks and convolutional neural networks were improved through transfer learning both prior to any learning (compared to standard random weight distribution) and at the end of the learning process (asymptote). That is, results are improved by exposure to another domain. Moreover, the end-user of a pre-trained model can change the structure of fully-connected layers to improve performance.

    Read more →