AI Writing Helper

AI Writing Helper — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Model-based clustering

    Model-based clustering

    In statistics, cluster analysis is the algorithmic grouping of objects into homogeneous groups based on numerical measurements. Model-based clustering based on a statistical model for the data, usually a mixture model. This has several advantages, including a principled statistical basis for clustering, and ways to choose the number of clusters, to choose the best clustering model, to assess the uncertainty of the clustering, and to identify outliers that do not belong to any group. == Model-based clustering == Suppose that for each of n {\displaystyle n} observations we have data on d {\displaystyle d} variables, denoted by y i = ( y i , 1 , … , y i , d ) {\displaystyle y_{i}=(y_{i,1},\ldots ,y_{i,d})} for observation i {\displaystyle i} . Then model-based clustering expresses the probability density function of y i {\displaystyle y_{i}} as a finite mixture, or weighted average of G {\displaystyle G} component probability density functions: p ( y i ) = ∑ g = 1 G τ g f g ( y i ∣ θ g ) , {\displaystyle p(y_{i})=\sum _{g=1}^{G}\tau _{g}f_{g}(y_{i}\mid \theta _{g}),} where f g {\displaystyle f_{g}} is a probability density function with parameter θ g {\displaystyle \theta _{g}} , τ g {\displaystyle \tau _{g}} is the corresponding mixture probability where ∑ g = 1 G τ g = 1 {\displaystyle \sum _{g=1}^{G}\tau _{g}=1} . Then in its simplest form, model-based clustering views each component of the mixture model as a cluster, estimates the model parameters, and assigns each observation to cluster corresponding to its most likely mixture component. === Gaussian mixture model === The most common model for continuous data is that f g {\displaystyle f_{g}} is a multivariate normal distribution with mean vector μ g {\displaystyle \mu _{g}} and covariance matrix Σ g {\displaystyle \Sigma _{g}} , so that θ g = ( μ g , Σ g ) {\displaystyle \theta _{g}=(\mu _{g},\Sigma _{g})} . This defines a Gaussian mixture model. The parameters of the model, τ g {\displaystyle \tau _{g}} and θ g {\displaystyle \theta _{g}} for g = 1 , … , G {\displaystyle g=1,\ldots ,G} , are typically estimated by maximum likelihood estimation using the expectation-maximization algorithm (EM); see also EM algorithm and GMM model. Bayesian inference is also often used for inference about finite mixture models. The Bayesian approach also allows for the case where the number of components, G {\displaystyle G} , is infinite, using a Dirichlet process prior, yielding a Dirichlet process mixture model for clustering. === Choosing the number of clusters === An advantage of model-based clustering is that it provides statistically principled ways to choose the number of clusters. Each different choice of the number of groups G {\displaystyle G} corresponds to a different mixture model. Then standard statistical model selection criteria such as the Bayesian information criterion (BIC) can be used to choose G {\displaystyle G} . The integrated completed likelihood (ICL) is a different criterion designed to choose the number of clusters rather than the number of mixture components in the model; these will often be different if highly non-Gaussian clusters are present. === Parsimonious Gaussian mixture model === For data with high dimension, d {\displaystyle d} , using a full covariance matrix for each mixture component requires estimation of many parameters, which can result in a loss of precision, generalizabity and interpretability. Thus it is common to use more parsimonious component covariance matrices exploiting their geometric interpretation. Gaussian clusters are ellipsoidal, with their volume, shape and orientation determined by the covariance matrix. Consider the eigendecomposition of a matrix Σ g = λ g D g A g D g T , {\displaystyle \Sigma _{g}=\lambda _{g}D_{g}A_{g}D_{g}^{T},} where D g {\displaystyle D_{g}} is the matrix of eigenvectors of Σ g {\displaystyle \Sigma _{g}} , A g = diag { A 1 , g , … , A d , g } {\displaystyle A_{g}={\mbox{diag}}\{A_{1,g},\ldots ,A_{d,g}\}} is a diagonal matrix whose elements are proportional to the eigenvalues of Σ g {\displaystyle \Sigma _{g}} in descending order, and λ g {\displaystyle \lambda _{g}} is the associated constant of proportionality. Then λ g {\displaystyle \lambda _{g}} controls the volume of the ellipsoid, A g {\displaystyle A_{g}} its shape, and D g {\displaystyle D_{g}} its orientation. Each of the volume, shape and orientation of the clusters can be constrained to be equal (E) or allowed to vary (V); the orientation can also be spherical, with identical eigenvalues (I). This yields 14 possible clustering models, shown in this table: It can be seen that many of these models are more parsimonious, with far fewer parameters than the unconstrained model that has 90 parameters when G = 4 {\displaystyle G=4} and d = 9 {\displaystyle d=9} . Several of these models correspond to well-known heuristic clustering methods. For example, k-means clustering is equivalent to estimation of the EII clustering model using the classification EM algorithm. The Bayesian information criterion (BIC) can be used to choose the best clustering model as well as the number of clusters. It can also be used as the basis for a method to choose the variables in the clustering model, eliminating variables that are not useful for clustering. Different Gaussian model-based clustering methods have been developed with an eye to handling high-dimensional data. These include the pgmm method, which is based on the mixture of factor analyzers model, and the HDclassif method, based on the idea of subspace clustering. The mixture-of-experts framework extends model-based clustering to include covariates. == Example == We illustrate the method with a dateset consisting of three measurements (glucose, insulin, sspg) on 145 subjects for the purpose of diagnosing diabetes and the type of diabetes present. The subjects were clinically classified into three groups: normal, chemical diabetes and overt diabetes, but we use this information only for evaluating clustering methods, not for classifying subjects. The BIC plot shows the BIC values for each combination of the number of clusters, G {\displaystyle G} , and the clustering model from the Table. Each curve corresponds to a different clustering model. The BIC favors 3 groups, which corresponds to the clinical assessment. It also favors the unconstrained covariance model, VVV. This fits the data well, because the normal patients have low values of both sspg and insulin, while the distributions of the chemical and overt diabetes groups are elongated, but in different directions. Thus the volumes, shapes and orientations of the three groups are clearly different, and so the unconstrained model is appropriate, as selected by the model-based clustering method. The classification plot shows the classification of the subjects by model-based clustering. The classification was quite accurate, with a 12% error rate as defined by the clinical classification. Other well-known clustering methods performed worse with higher error rates, such as single-linkage clustering with 46%, average link clustering with 30%, complete-linkage clustering also with 30%, and k-means clustering with 28%. == Outliers in clustering == An outlier in clustering is a data point that does not belong to any of the clusters. One way of modeling outliers in model-based clustering is to include an additional mixture component that is very dispersed, with for example a uniform distribution. Another approach is to replace the multivariate normal densities by t {\displaystyle t} -distributions, with the idea that the long tails of the t {\displaystyle t} -distribution would ensure robustness to outliers. However, this is not breakdown-robust. A third approach is the "tclust" or data trimming approach which excludes observations identified as outliers when estimating the model parameters. == Non-Gaussian clusters and merging == Sometimes one or more clusters deviate strongly from the Gaussian assumption. If a Gaussian mixture is fitted to such data, a strongly non-Gaussian cluster will often be represented by several mixture components rather than a single one. In that case, cluster merging can be used to find a better clustering. A different approach is to use mixtures of complex component densities to represent non-Gaussian clusters. == Non-continuous data == === Categorical data === Clustering multivariate categorical data is most often done using the latent class model. This assumes that the data arise from a finite mixture model, where within each cluster the variables are independent. === Mixed data === These arise when variables are of different types, such as continuous, categorical or ordinal data. A latent class model for mixed data assumes local independence between the variable. The location model relaxes the local independence assumption. The clustMD approach assumes that the observed variables are manifestations of underlying continuous Gaussian latent

    Read more →
  • Couch to 5K

    Couch to 5K

    Couch to 5K, abbreviated C25K, is an exercise plan that gradually progresses from beginner running toward a 5 kilometre (3.1 mile) run over nine weeks. == Operations == The Couch to 5K running plan, also known as C25K, created by Josh Clark in 1996, was developed with the expectation of creating a plan for new runners to start running. The plan is aimed to have users work out for 20 to 30 minutes, three days a week. Within the program, users can be expected to perform different tasks such as intervals of running with period of short walks in between to help build endurance in the weeks up to the final goal of a 5K run. During the nine weeks leading up to the race, the runner will learn to set their own pace and where their strengths and weaknesses are within running. Often, the daily workouts start with a five-minute warm-up walk and works up to running five kilometres without a walking break within nine weeks. Users are not expected to have any experience in running and can be some of the first running that they ever do. The main goal is to turn that unexperienced runner into someone who can run a 5K. Clark started the website Kick and featured C25K on the site. In 2001, Kick merged with Cool Running, a New England–based running site. Clark later sold his stake in Cool Running and the Couch to 5K program. Cool Running was absorbed into Active.com, operated by Active Network, LLC. Active Network provides mobile apps for Couch to 5K, as well as 5K to 10K, a follow-up program. The NHS in the UK provides downloadable podcasts and a smartphone app (Android and iOS) for the plan. A mobile app, created by Zen Labs, has training plans that are based on the Couch to 5K running plan from CoolRunning.com. It is one of the highest-rated health and fitness apps available on Android and iOS. As of 2016, the C25K app has been used by over 5 million people.

    Read more →
  • Automate This

    Automate This

    Automate This: How Algorithms Came to Rule Our World is a book written by Christopher Steiner and published by Penguin Group. == Book == Steiner begins his study of algorithms on Wall Street in the 1980s but also provides examples from other industries. For example, he explains the history of Pandora Radio and the use of algorithms in music identification. He expresses concern that such use of algorithms may lead to the homogenization of music over time. Steiner also discusses the algorithms that eLoyalty (now owned by Mattersight Corporation following divestiture of the technology) was created by dissecting 2 million speech patterns and can now identify a caller's personality style and direct the caller with a compatible customer support representative. Steiner's book shares both the warning and the opportunity that algorithms bring to just about every industry in the world, and the pros and cons of the societal impact of automation (e.g. impact on employment).

    Read more →
  • Normalization (image processing)

    Normalization (image processing)

    In image processing, normalization is a process that changes the range of pixel intensity values, a kind of intensity mapping. Applications include photographs with poor contrast due to glare, for example. A typical case is contrast stretching. In more general fields of data processing, such as digital signal processing, it is referred to as dynamic range expansion. The purpose of dynamic range expansion in the various applications is usually to bring the image, or other type of signal, into a range that is more familiar or normal to the senses, hence the term normalization. Often, the motivation is to achieve consistency in dynamic range for a set of data, signals, or images to avoid mental distraction or fatigue. For example, a newspaper will strive to make all of the images in an issue share a similar range of grayscale. Auto-normalization in image processing software typically normalizes to the full dynamic range of the number system specified in the image file format. == Definition == Normalization transforms an n-dimensional grayscale image I : { X ⊆ R n } → { Min , . . , Max } {\displaystyle I:\{\mathbb {X} \subseteq \mathbb {R} ^{n}\}\rightarrow \{{\text{Min}},..,{\text{Max}}\}} with intensity values in the range ( Min , Max ) {\displaystyle ({\text{Min}},{\text{Max}})} , into a new image I N : { X ⊆ R n } → { newMin , . . , newMax } {\displaystyle I_{N}:\{\mathbb {X} \subseteq \mathbb {R} ^{n}\}\rightarrow \{{\text{newMin}},..,{\text{newMax}}\}} with intensity values in the range ( newMin , newMax ) {\displaystyle ({\text{newMin}},{\text{newMax}})} . The linear normalization of a grayscale digital image is performed according to the formula I N = ( I − Min ) newMax − newMin Max − Min + newMin {\displaystyle I_{N}=(I-{\text{Min}}){\frac {{\text{newMax}}-{\text{newMin}}}{{\text{Max}}-{\text{Min}}}}+{\text{newMin}}} For example, if the intensity range of the image is 50 to 180 and the desired range is 0 to 255 the process entails subtracting 50 from each of pixel intensity, making the range 0 to 130. Then each pixel intensity is multiplied by 255/130, making the range 0 to 255. Normalization might also be non-linear, as the relationship between I {\displaystyle I} and I N {\displaystyle I_{N}} may not be linear. An example of non-linear normalization is when the normalization follows a sigmoid function, in which case the normalized image is computed according to the formula I N = ( newMax − newMin ) 1 1 + e − I − β α + newMin {\displaystyle I_{N}=({\text{newMax}}-{\text{newMin}}){\frac {1}{1+e^{-{\frac {I-\beta }{\alpha }}}}}+{\text{newMin}}} Where α {\displaystyle \alpha } defines the width of the input intensity range, and β {\displaystyle \beta } defines the intensity around which the range is centered. Gamma correction (log/inverse log) is also a common transformation function. === Colorspace === Intensity operations generally operate on a colorspace that maps to the human perception of lightness without intentionally changing the other properties. This can be done, for example, by operating on the L component of the CIELAB color space, or approximately by operating on the Y component of YCbCr. It is also possible to operate on each of the RGB color channels, though the result will not always make sense. == Contrast stretching == This is the most significant and essential technique of spatial-based image enhancement. The basic intent of this contrast enhancement technique is to adjust the local contrast in the image so as to bring out the clear regions or objects in the image. Low-contrast images often result from poor or non-uniform lighting conditions, a limited dynamic range of the imaging sensor, or improper settings of the lens aperture. This operation tries to change the intensity of the pixel in the image, particularly in the input image, to obtain an enhanced image. It is based on the number of techniques, namely local, global, dark and bright levels of contrast. The contrast enhancement is considered as the amount of color or gray differentiation that lies among the different features in an image. The contrast enhancement improves the quality of image by increasing the luminance difference between the foreground and background. A contrast stretching transformation can be achieved by: Stretching the dark range of input values into a wider range of output values: This involves increasing the brightness of the darker areas in the image to enhance details and improve visibility. Shifting the mid-range of input values: This involves adjusting the brightness levels of the mid-tones in the image to improve overall contrast and clarity. Compressing the bright range of input values: This process involves reducing the brightness of the brighter areas in the image to prevent overexposure resulting in a more balanced and visually appealing image. It can be described as the following piecewise funciton: I N = { s 1 r 1 I if I < r 1 s 2 − s 1 r 1 − r 2 ( I − r 1 ) if r 1 ≤ I ≤ r 2 1 − s 2 1 − r 2 ( I − r 2 ) if I > r 2 {\displaystyle I_{N}={\begin{cases}{\frac {s_{1}}{r_{1}}}I&{\text{if }}Ir_{2}\end{cases}}} Where: ( r 1 , s 1 ) {\displaystyle (r_{1},s_{1})} defines the transition point between the "dark" range to the "main" range. ( r 2 , s 2 ) {\displaystyle (r_{2},s_{2})} defines the transition point between the "main" range to the "bright" range. A typical linear stretch is obtained when ( r 1 , s 1 ) = ( r min , 0 ) {\displaystyle (r_{1},s_{1})=(r_{\text{min}},0)} and ( r 2 , s 2 ) = ( r max , 1 ) {\displaystyle (r_{2},s_{2})=(r_{\text{max}},1)} , where r min {\displaystyle r_{\text{min}}} and r max {\displaystyle r_{\text{max}}} denote the minimum and maximum levels in the source image. === Global contrast stretching === Global Contrast Stretching considers all color palate ranges at once to determine the maximum and minimum values for the entire RGB color image. This approach utilizes the combination of RGB colors to derive a single maximum and minimum value for contrast stretching across the entire image. === Local contrast stretching === Local contrast stretching (LCS) is an image enhancement method that focuses on locally adjusting each pixel's value to improve the visualization of structures within an image, particularly in both the darkest and lightest portions. It operates by utilizing sliding windows, known as kernels, which traverse the image. The central pixel within each kernel is adjusted using the following formula: I p ( x , y ) = 255 × [ I 0 ( x , y ) − m i n ] ( m a x − m i n ) {\displaystyle I_{p}(x,y)=255\times {\frac {[I_{0}(x,y)-min]}{(max-min)}}} Where: Ip(x,y) is the color level for the output pixel (x,y) after the contrast stretching process. I0(x,y) is the color level input for data pixel (x, y). max is the maximum value for color level in the input image within the selected kernel. min is the minimum value for color level in the input image within the selected kernel. A piecewise form (see above) may also be used. LCS can be applied to the three color channels of an image separately.

    Read more →
  • Parasolid

    Parasolid

    Parasolid is a geometric modeling kernel originally developed by Shape Data Limited, now owned and developed by Siemens Digital Industries Software. It can be licensed by other companies for use in their 3D computer graphics software products. Parasolid's abilities include model creation and editing utilities such as Boolean modeling operators, feature modeling support, advanced surfacing, thickening and hollowing, blending and filleting, and sheet modeling. It also incorporates modeling with mesh surfaces and lattices. Parasolid also includes tools for direct model editing, including tapering, offsetting, geometry replacement and removing feature details with automated regeneration of surrounding data. Parasolid also provides wide-ranging graphical and rendering support, including hidden-line, wireframe and drafting, tessellation, and model data inquiries. To use Parasolid effectively, software developers need knowledge of CAD in general, computational geometry, and topology. Parasolid is available for Windows (32-bit, 64-bit and AArch64), Linux (64-bit and AArch64), macOS (Apple silicon and Intel), iOS, and Android. == Parasolid XT format == Parasolid parts are normally saved in XT format, which usually has the file extension .X_T. The format is documented and open. There is also a binary version of the format, usually with an .X_B extension, which is somewhat more compact. Both .X_T and .X_B are used for parts files. == Applications == It is used in many computer-aided design (CAD), computer-aided manufacturing (CAM), computer-aided engineering (CAE), product visualization, and CAD data exchange packages. Notable uses include:

    Read more →
  • Steerable filter

    Steerable filter

    In image processing, a steerable filter is an orientation-selective filter that can be computationally rotated to any direction. Rather than designing a new filter for each orientation, a steerable filter is synthesized from a linear combination of a small, fixed set of "basis filters". This approach is efficient and is widely used for tasks that involve directionality, such as edge detection, texture analysis, and shape-from-shading. The principle of steerability has been generalized in deep learning to create equivariant neural networks, which can recognize features in data regardless of their orientation or position. == Example == A common example of a steerable filter is the first derivative of a two-dimensional Gaussian function. This filter responds strongly to oriented image features like edges. It is constructed from two basis filters: the partial derivative of the Gaussian with respect to the horizontal direction ( x {\displaystyle x} ) and the vertical direction ( y {\displaystyle y} ). If G ( x , y ) {\displaystyle G(x,y)} is the Gaussian function, and G x {\displaystyle G_{x}} and G y {\displaystyle G_{y}} are its partial derivatives (which measure the rate of change in the x {\displaystyle x} and y {\displaystyle y} directions, respectively), a new filter G θ {\displaystyle G_{\theta }} oriented at an angle θ {\displaystyle \theta } can be synthesized with the formula: G θ = cos ⁡ ( θ ) G x + sin ⁡ ( θ ) G y {\displaystyle G_{\theta }=\cos(\theta )G_{x}+\sin(\theta )G_{y}} Here, the basis filters G x {\displaystyle G_{x}} and G y {\displaystyle G_{y}} are weighted by cos ⁡ ( θ ) {\displaystyle \cos(\theta )} and sin ⁡ ( θ ) {\displaystyle \sin(\theta )} to "steer" the filter's sensitivity to the desired orientation. This is equivalent to taking the dot product of the direction vector ( cos ⁡ θ , sin ⁡ θ ) {\displaystyle (\cos \theta ,\sin \theta )} with the filter's gradient, ( G x , G y ) {\displaystyle (G_{x},G_{y})} . == Generalization in deep learning: Equivariant neural networks == The concept of steerability is foundational to equivariant neural networks, a class of models in deep learning designed to understand symmetries in data. A network is considered equivariant to a transformation (like a rotation) if transforming the input and then passing it through the network produces the same result as passing the input through the network first and then transforming the output. Formally, for a transformation T {\displaystyle T} and a network f {\displaystyle f} , this property is defined as f ( T ( input ) ) = T ( f ( input ) ) {\displaystyle f(T({\text{input}}))=T(f({\text{input}}))} . This built-in understanding of geometry makes models more data-efficient. For example, a network equivariant to rotation does not need to be shown an object in multiple orientations to learn to recognize it; it inherently understands that a rotated object is still the same object. This leads to better generalization and performance, particularly in scientific applications. === Mathematical foundation === Equivariant neural networks use principles from group theory to create operations that respect geometric symmetries, such as the SO(3) group for 3D rotations or the E(3) group for rotations and translations. Instead of learning standard filter kernels, these networks learn how to combine a fixed set of basis kernels. These basis functions are chosen so that they have well-defined behaviors under transformation groups. Spherical harmonics are frequently used as basis functions because they form a complete set of functions that behave predictably under rotation, making them ideal for creating steerable 3D kernels. Features within the network are treated as geometric tensors, which are mathematical objects (like scalars or vectors) that are "typed" by their behavior under transformations. These types correspond to the irreducible representations (irreps) of the group. The tensor product is the fundamental operation used to combine these typed features in a way that preserves equivariance, guaranteeing that the network as a whole respects the desired symmetry. Frameworks like e3nn simplify the construction of these networks by automating the complex mathematics of irreducible representations and tensor products. === Applications === Steerable and equivariant models are highly effective for problems with inherent geometric symmetries. Examples include: Protein structure analysis: SE(3)-equivariant networks can process 3D molecular structures while respecting their rotational and translational symmetries. 3D Point cloud processing: Rotation-equivariant filters built from steerable spherical functions can perform tasks like 3D shape classification. Computational chemistry: E(3)-equivariant graph neural networks are used to model interatomic potentials for molecular dynamics simulations, creating highly accurate and data-efficient models of physical systems.

    Read more →
  • Reverse correlation technique

    Reverse correlation technique

    The reverse correlation technique is a data driven study method used primarily in psychological and neurophysiological research. This method earned its name from its origins in neurophysiology, where cross-correlations between white noise stimuli and sparsely occurring neuronal spikes could be computed quicker when only computing it for segments preceding the spikes. The term has since been adopted in psychological experiments that usually do not analyze the temporal dimension, but also present noise to human participants. In contrast to the original meaning, the term is here thought to reflect that the standard psychological practice of presenting stimuli of defined categories to the participants is "reversed": Instead, the participant's mental representations of categories are estimated from interactions of the presented noise and the behavioral responses. It is used to create composite pictures of individual and/or group mental representations of various items (e.g. faces, bodies, and the self) that depict characteristics of said items (e.g. trustworthiness and self-body image). This technique is helpful when evaluating the mental representations of those with and without mental illnesses. == Terms == This technique utilizes spike-triggered average to explain what areas of signal and noise in an image are valuable for the given research question. Signal is information used to produce objects of value that help explain and connect the world around us. Noise is commonly referred to as unwanted signal that obscures the information that the signal is trying to present. Most importantly for reverse correlation studies, noise is randomly varying information. To determine the areas of importance using reverse correlation, noise is applied to a base image and then evaluated by observers. A base image is any image void of noise that relates to the research question. A base image that has noise superimposed on top is the stimuli that is presented to and evaluated by participants. Each time a new set of stimuli is presented to a participant, this is known as a trial. After a participant has responded to hundreds to thousands of trials, a researcher is ready to create a classification image. A classification image (abbreviated as "CI" in some studies) is a single image that represents the average noise patterns in the images selected by participants. A classification image can also be computed for groups by averaging the individuals’ classification images. These classification images are what researchers use to interpret the data and draw conclusions. As a whole, the reverse correlation method is a process that results in a composite image (from an individual or group) that can be used to estimate and interpret mental representations. == Basic study layout == The reverse correlation method is typically executed as an in-lab computer experiment. This method follows four broad steps. Each of the following steps are described in greater detail below. After creating a research question and determining that the reverse correlation method is the most suitable technique to answer the question, a researcher must (1) design randomly varying stimuli. After the stimuli have been prepared, a researcher should (2) collect data from participants who will see and respond to approximately 300 -1,000 trials. Each trial will either consist of one or two images (side by side) derived from the same base image with noise superimposed on top. Participant responses will depend on the chosen study design; if a researcher presents only one image at a time, participants rate the image on a 4pt scale, but when two images are shown, the participant is asked to choose which best aligns with the given category (e.g. choose the image that looks the most aggressive). Once all of the data is collected, the researcher will (3) compute classification images for each participant and using those images compute group classification images. Finally, with the classification images available, the researcher will (4) evaluate the images and draw conclusions about their results. === Step 1: making stimuli === When designing the stimuli for a reverse correlation study, the two primary factors that one should consider are (1) the base image and (2) the noise that will be used. While not all bases are images per se, the majority are and for this reason the base is typically referred to as a base image. The base image should represent whatever the research question is addressing. For example, if you are interested in peoples’ mental representations of Chinese people, it would not make sense to use a base image of a Spanish or Caucasian person. Again, if you are interested in the mental representations of male vocal patterns, it would make the most sense to use a base vocal pattern that has been produced by a male. Having a base is important because it provides a kind of anchor for participants to work from. When there is no base image, the number of trials that are required increases dramatically, thus making it harder to collect data. While there are studies that have excluded a base image, (e.g. the S study), for more elaborate and nuanced research questions, it is important to have a base image that is a fair representation of what participants are being asked to categorize. Photographs of faces are generally the most popular base image. Although the reverse correlation method is capable of investigating a wide variety of research questions, the most common application of the method is for evaluating faces on a single trait. Reverse correlation studies that address evaluations of the face are sometimes referred to as being a face space reverse correlation model (FSRCM). Thankfully, there are existing databases for face images of varying demographics and emotion that work well as base images. The reverse correlation method can also be used to help researchers identify what areas of an image (e.g. the areas on the face) have diagnostic value. In order to identify these areas of value, researchers start by minimizing the space a participant can pull information from. By imposing a “mask” on an image (e.g. blur an image while leaving random areas un-blurred), this reduces the information individuals might see, and forces them to focus on certain areas. Then, if/when participants are able to correctly identify an image with a trait repeatedly, we can draw conclusions about what areas have diagnostic value. While faces and visual stimuli are the most popular, this is not the only stimuli that can be used in a reverse correlation study. This method was originally designed for auditory stimuli which allows researchers to investigate how perceivers interpret auditory information and create trait based attributions to different sound patterns. For example, by segmenting a vocal recording of a single word (total sound time 426 ms) into six segments (71 ms each), and varying each segment's pitch using Gaussian distributions, researchers were able to uncover what vocal patterns people associated with certain traits. Specifically, this study investigated how listeners rated sound clips of the word “really” as sounding more interrogative (i.e. like the more common reverse correlation studies this study had participants listen to two sound clips per trial, choose which fit the category the best, and then created an average of the pitch contours). Beyond face and auditory perception, research utilizing the reverse correlation method has expanded to investigate how individuals see three-dimensional objects in images with noise (but no signal). After selecting your base image, regardless of what the image is, it is helpful to apply a Gaussian blur to smooth noise in the image. While noise will be applied later, it is helpful to reduce existing noise in the photo before applying your chosen noise. There are three primary choices when it comes to noise: white noise, sine-wave noise, and Gabor noise. The latter two of these constrain the configurations that the noise can have, and because of this white noise is usually the most commonly used. Regardless of the type of noise that is chosen, it is crucial that the noise randomly varies. === Step 2: data collection === Once the stimuli for the study has been developed, the researcher must make a few decisions before actually collecting the data. The researcher must come to a conclusion on how many stimuli will be presented at a time and how many trials the participants will see. In terms of stimuli presentation, a researcher can choose from either a 2-Image Forced Choice (2IFC) or a 4-Alternative Forced Choice (4AFC). The 2IFC presents two images at once (side by side) and requires participants to choose between the two on a specified category (e.g. which image looks the most like a male). Typically the noise from the left image is the mathematical inverse of the noise from the right image. This method was developed to better answer questions that could n

    Read more →
  • Showbox.com

    Showbox.com

    Showbox is an online video streaming platform that enables users to stream and download many videos, commonly movies and TV shows, for free. == History == The company opened the platforms to users who registered from its beta in late 2015. The platform was officially launched in February 2016, enabling any visitor to sign up and create videos online. In April 2016, Showbox was featured on the Product Hunt website, coming to the top of the website's lists for that day and week with over 1400 upvotes from the Product Hunt community. Also in April 2016, Showbox partnered with YouTube's leading multi-channel networks, including Fullscreen, BroadbandTV, StyleHaul, AwesomenessTV, and BuzzMyVideos, to enable their communities of creators to access the platform. In June 2016, the company launched Showbox For Brands, a business-oriented video creation platform, enabling companies to create video content in-house and with their communities and influencers. In March 2017, the company launched Showbox Engage, a use case of its B2B product launched in 2016, enabling companies to launch user-generated content campaigns with their communities. In April 2017, Showbox and the United Nations announced a partnership around the 70th anniversary of the declaration of human rights, with an annual, ongoing global campaign in 135 languages, inviting people worldwide to create their part of the declaration in a video from anywhere around the world. In November 2017, Showbox partnered with the Ad:tech and Digital Marketing World Forum conferences (DMWF) in New York to provide their users and communities with a User Generated Content video solution. == Technology == Showbox's video creation technology includes an online green screen feature, proprietary computer vision algorithms, deep learning technology to support the automatic creation of videos in the cloud, and advanced video composition, including special effects. == Coverage and awards == In March 2015, Showbox was nominated as one of the 10 Israeli startups to take over our TV screens this year. In July 2016, Showbox won the Publicis90 award as part of Publicis' "global initiative to foster digital entrepreneurship". In March 2017, Showbox was chosen as one of The Culture Trip's 10 startups to watch for in 2017.

    Read more →
  • Wargame (hacking)

    Wargame (hacking)

    In hacking, a wargame (or war game) is a cyber-security challenge and mind sport in which the competitors must exploit or defend a vulnerability in a system or application, and/or gain or prevent access to a computer system. A wargame usually involves a capture the flag logic, based on pentesting, semantic URL attacks, knowledge-based authentication, password cracking, reverse engineering of software (often JavaScript, C and assembly language), code injection, SQL injections, cross-site scripting, exploits, IP address spoofing, forensics, and other hacking techniques. == Wargames for preparedness == Wargames are also used as a method of cyberwarfare preparedness. The NATO Cooperative Cyber Defence Centre of Excellence (CCDCOE) organizes an annual event, Locked Shields, which is an international live-fire cyber exercise. The exercise challenges cyber security experts through real-time attacks in fictional scenarios and is used to develop skills in national IT defense strategies. == Additional applications == Wargames can be used to teach the basics of web attacks and web security, giving participants a better understanding of how attackers exploit security vulnerabilities. Wargames are also used as a way to "stress test" an organization's response plan and serve as a drill to identify gaps in cyber disaster preparedness.

    Read more →
  • Polynomial texture mapping

    Polynomial texture mapping

    Polynomial texture mapping (PTM), also known as Reflectance Transformation Imaging (RTI), is a technique of imaging and interactively displaying objects under varying lighting conditions to reveal surface phenomena. The data acquisition method is single camera multi light (SCML). == Origins == The method was originally developed by Tom Malzbender of HP Labs in order to generate enhanced 3D computer graphics and it has since been adopted for cultural heritage applications. == Methodology == A series of images is captured in a darkened environment with the camera in a fixed position and the object lit from different angles (Single Camera Multi Light). Interactive software processes and combines the set of images to enable the user inspecting the object to control a virtual light source. The virtual light source may be manipulated to simulate light from different angles and of different intensity or wavelengths to illuminate the surface of artefacts and reveal details. Open-source tools for processing the captured images and publishing the resulting relightable images on the web are freely available. == Applications == Polynomial texture mapping may be used for detailed recording and documentation, 3D modeling, edge detection, and to aid the study of inscriptions, rock art and other artefacts. It has been applied to hundreds of the Vindolanda tablets by the Centre for the Study of Ancient Documents at the University of Oxford in conjunction with the British Museum. It has also been deployed, by Ben Altshuler of the Institute for Digital Archaeology, to scan the Philae obelisk at Kingston Lacy and the Parian Chronicle at the Ashmolean Museum; in both cases scans revealed significant, previously illegible text. Method was also used for identifying microscopic worked antler from Star Carr and recording ancient rock art in Armenia. A 'dome' supporting twenty-four lights has been used to image paintings in the National Gallery and produce polynomial texture maps, providing information on condition phenomena for conservation purposes. Studies of the technique at the National Gallery and Tate concluded that it is an effective tool for documenting changes in the condition of paintings, more easily repeatable than raking light photography, and therefore could be used to assess paintings during structural treatment and before and after loan. Twelve dome-based systems built by the University of Southampton have been used to capture thousands of cuneiform tablets at various museums. The technique is now also finding uses in the field of forensic science, for example in imaging footprints, tyre marks, and indented writing.

    Read more →
  • Video renderer

    Video renderer

    A video renderer is software that processes a video file and sends it sequentially to the video display controller card for display on a computer screen. An example of a video renderer, is the VMR-7 that was used by Microsoft's DirectShow. An example of a UNIX video renderer is the one container within GStreamer. Commonly used video renderers are: Enhanced Video Renderer VMR9 Renderless Haali's Video Renderer Madvr Video Renderer JRVR, a part of JRiver Media Center

    Read more →
  • Prism Video Converter

    Prism Video Converter

    Prism is a multi-format video converter developed by NCH Software for Windows and Mac OS. It offers converting tools for instant media conversions. Prism Video Converter can handle large and high-quality resolution media files. It provides built-in compressor and adjuster settings, allowing users to customize and optimize their videos according to their needs. The software also includes features such as previewing videos and adding effects. Prism offers a free version for non-commercial use as well as a premium version. == Features == Prism Video File Converter supports a wide range of file formats. It enables users to convert videos into formats like AVI, ASF, WMV, MP4, 3GP, etc. It offers the ability to convert DVDs into various formats. It provides tools for adjusting colour and filter options. Prism Video File Converter provides several customizable options for tweaking the output files during the conversion process. Users can adjust compression/encoder rates, set the resolution and frame rate, and specify the desired output file size. The software also offers various effects like video rotation, captions, watermarks, and text overlay. It also includes a built-in preview feature, that enables users to view their videos before and after the conversion process. It supports batch conversion and running conversion in background. == Controversy == Previously, Prism and certain other NCH Software products were bundled with optional browser plugins, including the Google Chrome toolbar and the Conduit toolbar. This resulted in user complaints and raised concerns from antivirus software companies like Norton and McAfee, which flagged them as potential malware. NCH Software has since removed all toolbars, browsers, and third-party app offerings in all Prism versions.

    Read more →
  • JSGF

    JSGF

    JSGF stands for Java Speech Grammar Format or the JSpeech Grammar Format (in a W3C Note). Developed by Sun Microsystems, it is a textual representation of grammars for use in speech recognition for technologies like XHTML+Voice. JSGF adopts the style and conventions of the Java programming language in addition to use of traditional grammar notations. The Speech Recognition Grammar Specification was derived from this specification. == Example == The following JSGF grammar will recognize the words coffee, tea, and milk.

    Read more →
  • Pixel aspect ratio

    Pixel aspect ratio

    A pixel aspect ratio (PAR) is a mathematical ratio that describes how the width of a pixel in a digital image compares to the height of that pixel. Most digital imaging systems display an image as a grid of tiny, square pixels. However, some imaging systems, especially those that must be compatible with standard-definition television motion pictures, display an image as a grid of rectangular pixels, in which the pixel width and height are different. Pixel aspect ratio describes this difference. Use of pixel aspect ratio mostly involves pictures pertaining to standard-definition television and some other exceptional cases. Most other imaging systems, including those that comply with SMPTE standards and practices, use square pixels. PAR is also known as sample aspect ratio and abbreviated SAR, though it can be confused with storage aspect ratio. == Introduction == The ratio of the width to the height of an image is known as the aspect ratio, or more precisely the display aspect ratio (DAR) – the aspect ratio of the image as displayed; for TV, DAR was traditionally 4:3 (a.k.a. fullscreen), with 16:9 (a.k.a. widescreen) now the standard for HDTV. In digital images, there is a distinction with the storage aspect ratio (SAR), which is the ratio of pixel dimensions. If an image is displayed with square pixels, then these ratios agree; if not, then non-square, "rectangular" pixels are used, and these ratios disagree. The aspect ratio of the pixels themselves is known as the pixel aspect ratio (PAR) – for square pixels this is 1:1 – and these are related by the identity: Rearranging (solving for PAR) yields: For example: A 640 × 480 VGA image has a SAR of 640/480 = 4:3, and if displayed on a 4:3 display (DAR = 4:3) has square pixels, hence a PAR of 1:1. By contrast, a 720 × 576 D-1 PAL image has a SAR of 720/576 = 5:4, but if displayed on a 4:3 display (DAR = 4:3) the PAR is 4/3 : 5/4 = 16:15 ≈ 1.066. This means that the pixels of the PAL picture must be "stretched" by this amount to fit in the 4:3 display. In analog images such as film there is no notion of pixel, nor notion of SAR or PAR, but in the digitization of analog images the resulting digital image has pixels, hence SAR (and accordingly PAR, if displayed at the same aspect ratio as the original). Non-square pixels arise often in early digital TV standards, related to digitalization of analog TV signals – whose vertical and "effective" horizontal resolutions differ and are thus best described by non-square pixels – and also in some digital video cameras and computer display modes, such as Color Graphics Adapter (CGA). Today they arise also in transcoding between resolutions with different SARs. Actual displays do not generally have non-square pixels, though digital sensors might; they are rather a mathematical abstraction used in resampling images to convert between resolutions. There are several complicating factors in understanding PAR, particularly as it pertains to digitization of analog video: First, analog video does not have pixels, but rather a raster scan, and thus has a well-defined vertical resolution (the lines of the raster), but not a well-defined horizontal resolution, since each line is an analog signal. However, by a standardized sampling rate, the effective horizontal resolution can be determined by the sampling theorem, as is done below. Second, due to overscan, some of the lines at the top and bottom of the raster are not visible, as are some of the possible image on the left and right – see Overscan: Analog to digital resolution issues. Also, the resolution may be rounded (DV NTSC uses 480 lines, rather than the 486 that are possible). Third, analog video signals are interlaced – each image (frame) is sent as two "fields", each with half the lines. Thus either the pixels are twice as tall as they would be without interlacing, or the image is deinterlaced. == Background == Video is presented as a sequential series of images called video frames. Historically, video frames were created and recorded in analog form. As digital display technology, digital broadcast technology, and digital video compression evolved separately, it resulted in video frame differences that must be addressed using pixel aspect ratio. Digital video frames are generally defined as a grid of pixels used to present each sequential image. The horizontal component is defined by pixels (or samples), and is known as a video line. The vertical component is defined by the number of lines, as in 480 lines. Standard-definition television standards and practices were developed as broadcast technologies and intended for terrestrial broadcasting, and were therefore not designed for digital video presentation. Such standards define an image as an array of well-defined horizontal "Lines", well-defined vertical "Line Duration" and a well-defined picture center. However, there is not a standard-definition television standard that properly defines image edges or explicitly demands a certain number of picture elements per line. Furthermore, analog video systems such as NTSC 480i and PAL 576i, instead of employing progressively displayed frames, employ fields or interlaced half-frames displayed in an interwoven manner to reduce flicker and double the image rate for smoother motion. === Analog-to-digital conversion === As a result of computers becoming powerful enough to serve as video editing tools, video digital-to-analog converters and analog-to-digital converters were made to overcome this incompatibility. To convert analog video lines into a series of square pixels, the industry adopted a default sampling rate at which luma values were extracted into pixels. The luma sampling rate for 480i pictures was 12+3⁄11 MHz and for 576i pictures was 14+3⁄4 MHz. The term pixel aspect ratio was first coined when ITU-R BT.601 (commonly known as Rec. 601) specified that standard-definition television pictures are made of lines of exactly 720 non-square pixels. ITU-R BT.601 did not define the exact pixel aspect ratio but did provide enough information to calculate the exact pixel aspect ratio based on industry practices: The standard luma sampling rate of precisely 13+1⁄2 MHz. Based on this information: The pixel aspect ratio for 480i would be 10:11 as: 12 3 11 ÷ 13 1 2 = 10 11 {\displaystyle 12{\tfrac {3}{11}}\div 13{\tfrac {1}{2}}={\tfrac {10}{11}}} The pixel aspect ratio for 576i would be 59:54 as: 14 3 4 ÷ 13 1 2 = 59 54 {\displaystyle 14{\tfrac {3}{4}}\div 13{\tfrac {1}{2}}={\tfrac {59}{54}}} SMPTE RP 187 further attempted to standardize the pixel aspect ratio values for 480i and 576i. It designated 177:160 for 480i or 1035:1132 for 576i. However, due to significant difference with practices in effect by industry and the computational load that they imposed upon the involved hardware, SMPTE RP 187 was simply ignored. SMPTE RP 187 information annex A.4 further suggested the use of 10:11 for 480i. As of this writing, ITU-R BT.601-6, which is the latest edition of ITU-R BT.601, still implies that the pixel aspect ratios mentioned above are correct. === Digital video processing === As stated above, ITU-R BT.601 specified that standard-definition television pictures are made of lines of 720 non-square pixels, sampled with a precisely specified sampling rate. A simple mathematical calculation reveals that a 704 pixel width would be enough to contain a 480i or 576i standard 4:3 picture: A 4:3 480-line picture, digitized with the Rec. 601-recommended sampling rate, would be 704 non-square pixels wide. x 480 × 10 11 = 4 3 ⇒ x = 480 × 11 × 4 10 × 3 = 704 {\displaystyle {\frac {x}{480}}\times {\frac {10}{11}}={\frac {4}{3}}\Rightarrow x={\frac {480\times 11\times 4}{10\times 3}}=704} A 4:3 576-line picture, digitized with the Rec. 601-recommended sampling rate, would be 702+54⁄59 non-square pixels wide. x 576 × 59 54 = 4 3 ⇒ x = 576 × 54 × 4 59 × 3 = 702 54 59 {\displaystyle {\frac {x}{576}}\times {\frac {59}{54}}={\frac {4}{3}}\Rightarrow x={\frac {576\times 54\times 4}{59\times 3}}=702{\tfrac {54}{59}}} Unfortunately, not all standard TV pictures are exactly 4:3: As mentioned earlier, in analog video, the center of a picture is well-defined but the edges of the picture are not standardized. As a result, some analog devices (mostly PAL devices but also some NTSC devices) generated motion pictures that were horizontally (slightly) wider. This also proportionately applies to anamorphic widescreen (16:9) pictures. Therefore, to maintain a safe margin of error, ITU-R BT.601 required sampling 16 more non-square pixels per line (8 more at each edge) to ensure saving all video data near the margins. This requirement, however, had implications for PAL motion pictures. PAL pixel aspect ratios for standard (4:3) and anamorphic wide screen (16:9), respectively 59:54 and 118:81, were awkward for digital image processing, especially for mixing PAL and NTSC video clips. Therefore, video editing products chose the almost equivalent value

    Read more →
  • Bitstrips

    Bitstrips

    Bitstrips, Inc. was a Canadian media and technology company based in Toronto, founded in 2007 by Jacob Blackstock, David Kennedy, Shahan Panth, Dorian Baldwin, and Jesse Brown. The company created and offered a web application, Bitstrips.com, which allowed users to create comic strips using personalized avatars, and preset templates and poses. Brown and Blackstock explained that the service was meant to enable self-expression without the need to have artistic skills. Bitstrips was first presented in 2008 at South by Southwest in Austin, Texas, and the service later piloted and launched a version designed for use as educational software. The service achieved increasing prominence following the launch of versions for Facebook and mobile platforms. In 2014, Bitstrips launched a spin-off app known as Bitmoji, which allows users to create personalized stickers for use in instant messaging. In July 2016, Snapchat Inc. announced that it had acquired the company; the Bitstrips comic service was shut down, but Bitmoji remains operational, and has subsequently been given greater prominence within Snapchat's overall platform. == History == Bitstrips was co-developed by Toronto-based comic artist Jacob Blackstock and his high school friend, journalist Jesse Brown. The service was originally envisioned as a means to allow anyone to create their own comic strip without needing artistic skills. Brown explained that "it's so difficult and time-consuming to tell a story in comic book form, drawing the same characters again and again in these tiny little panels, and just the amount of craftsmanship required. And even if you can do it well, which I never could, it takes years to make a story." Brown stated that the service would be "groundwork for a whole new way to communicate", and went as far as describing the service as being a "YouTube for comics". Blackstock explained that the concept of Bitstrips was influenced by his own use of comics as a form of socialization; a student, Blackstock and his friends drew comics featuring each other and shared them during classes. He felt that Bitstrips was a "medium for self-expression", stating that "It's not just about you making the comics, but since you and your friends star in these comics, it's like you're the medium. The visual nature of comics just speaks so much louder than text." The service was publicly unveiled at South by Southwest in 2008. In 2009, the service introduced a version oriented towards the educational market, Bitstrips for Schools, which was initially piloted at a number of schools in Ontario. The service was praised by educators for being engaging to students, especially within language classes. Brown noted that students were using the service to create comics outside of class as well, stating that it was "so gratifying and shocking what people do with your tool to make their own stories in ways that you never would have anticipated. Some of them are just brilliant." In December 2012, Bitstrips launched a version for Facebook; by July 2013, Bitstrips had 10 million unique users on Facebook, having created over 50 million comics. In October 2013, Bitstrips launched a mobile app; in two months, Bitstrips became a top-downloaded app in 40 countries, and over 30 million avatars had been created with it. In November 2013, Bitstrips secured a round of funding from Horizons Ventures and Li Ka-shing. In October 2014, Bitstrips launched Bitmoji, a spin-off app that allows users to create stickers featuring Bitstrips characters in various templates. In July 2016, following unconfirmed reports earlier in the year, Snapchat Inc. announced that it had acquired Bitstrips. The company's staff continue to operate out of Toronto, but the original Bitstrips comic service was shut down in favour of focusing exclusively on Bitmoji, leaving many Bitstrips users to call for a reboot of the comic service.

    Read more →