AI Detector Humanize

AI Detector Humanize — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Ulead DVD MovieFactory

    Ulead DVD MovieFactory

    Corel DVD MovieFactory is a video editing and DVD authoring software product for Microsoft Windows, initially made by Ulead Systems and subsequently by Corel. It creates and authors multimedia discs in HD DVD, Blu-ray, DVD Video and DVD Audio. It also creates and rips Audio CDs and MP3 CDs. DVD MovieFactory is commonly bundled with many of the modern Toshiba Satellite laptops. Official Japanese version is also known as MovieWriter.

    Read more →
  • Customer support

    Customer support

    Customer support is a range of services to assist customers in making cost effective and correct use of a product. It includes assistance in planning, installation, training, troubleshooting, maintenance, upgrading, and disposal of a product. Regarding technology products such as mobile phones, televisions, computers, software products or other electronic or mechanical goods, it is termed technical support. It aims to ensure users can effectively operate the product and resolve any issues that may arise throughout its lifecycle. Support is delivered through various channels, including telephone, email, live chat, self-service knowledge bases, and social media. Research indicates that most customers attempt to resolve issues through self-service before contacting a representative. For products sold across multiple regions, support may be provided in several languages, as consumers tend to prefer assistance in their native language. Requirements for customer contact centres are defined in international standards such as ISO 18295.

    Read more →
  • Tribute (website)

    Tribute (website)

    Tribute is an American video-sharing website headquartered in Brooklyn. Created in 2014 by Andrew Horn and Rory Petty, the platform lets customers create video montages (called "tributes") for occasions including weddings, birthdays, anniversaries, get well soon, and memorials. Tribute.co allows users to record video messages, request submissions from friends and family, insert photos, add music, and send the resulting video tribute montage to a recipient. == Overview == Tribute's collaborative technology starts with inviting people to contribute via email, SMS or social media. Participants receive a prompt to record a short video via their phone, computer or tablet. The site's video editing software allows users to drag and drop the clips in their desired order without prior video editing experience. == History == When Andrew Horn turned twenty-seven, his girlfriend, Miki Agrawal surprised him with a video montage containing clips of his family and closest friends explaining why they loved him. This resulted in Andrew's idea to create Tribute–a "living eulogy" video-compilation service that he co-founded with software engineer Rory Petty. Founded in 2014, Tribute's activity accelerated in 2020 due to the COVID-19 pandemic, and it had sent over 5 million videos as of December 2021. While social distance restrictions were in effect, the site provided a way for people to connect while in-person celebrations were put on hold. For each video sold, Tribute makes one available to hospitals for free and has partnered with Cleveland Clinic Cancer Center in Ohio, Lurie Children's Hospital in Illinois and CarePoint Health in New Jersey.

    Read more →
  • Distribution management system

    Distribution management system

    A distribution management system (DMS) is a collection of applications designed to monitor and control the electric power distribution networks efficiently and reliably. It acts as a decision support system to assist the control room and field operating personnel with the monitoring and control of the electric distribution system. Improving the reliability and quality of service in terms of reducing power outages, minimizing outage time, maintaining acceptable frequency and voltage levels are the key deliverables of a DMS. Given the complexity of distribution grids, such systems may involve communication and coordination across multiple components. For example, the control of active loads may require a complex chain of communication through different components as described in US patent 11747849B2 In recent years, utilization of electrical energy increased exponentially and customer requirement and quality definitions of power were changed enormously. As electric energy became an essential part of daily life, its optimal usage and reliability became important. Real-time network view and dynamic decisions have become instrumental for optimizing resources and managing demands, leading to the need for distribution management systems in large-scale electrical networks. == Overview == Most distribution utilities have been comprehensively using IT solutions through their Outage Management System (OMS) that makes use of other systems like Customer Information System (CIS), Geographical Information System (GIS) and Interactive Voice Response System (IVRS). An outage management system has a network component/connectivity model of the distribution system. By combining the locations of outage calls from customers with knowledge of the locations of the protection devices (such as circuit breakers) on the network, a rule engine is used to predict the locations of outages. Based on this, restoration activities are charted out and the crew is dispatched for the same. In parallel with this, distribution utilities began to roll out Supervisory Control and Data Acquisition (SCADA) systems, initially only at their higher voltage substations. Over time, use of SCADA has progressively extended downwards to sites at lower voltage levels. DMSs access real-time data and provide all information on a single console at the control centre in an integrated manner. Their development varied across different geographic territories. In the US, for example, DMSs typically grew by taking Outage Management Systems to the next level, automating the complete sequences and providing an end to end, integrated view of the entire distribution spectrum. In the UK, by contrast, the much denser and more meshed network topologies, combined with stronger Health & Safety regulation, had led to early centralisation of high-voltage switching operations, initially using paper records and schematic diagrams printed onto large wallboards which were 'dressed' with magnetic symbols to show the current running states. There, DMSs grew initially from SCADA systems as these were expanded to allow these centralised control and safety management procedures to be managed electronically. These DMSs required even more detailed component/connectivity models and schematics than those needed by early OMSs as every possible isolation and earthing point on the networks had to be included. In territories such as the UK, therefore, the network component/connectivity models were usually developed in the DMS first, whereas in the USA these were generally built in the GIS. The typical data flow in a DMS has the SCADA system, the Information Storage & Retrieval (ISR) system, Communication (COM) Servers, Front-End Processors (FEPs) & Field Remote Terminal Units (FRTUs). == Why DMS? == Reduce the duration of outages Improve the speed and accuracy of outage predictions. Reduce crew patrol and drive times through improved outage locating. Improve the operational efficiency Determine the crew resources necessary to achieve restoration objectives. Effectively utilize resources between operating regions. Determine when best to schedule mutual aid crews. Increased customer satisfaction A DMS incorporates IVR and other mobile technologies, through which there is an improved outage communications for customer calls. Provide customers with more accurate estimated restoration times. Improve service reliability by tracking all customers affected by an outage, determining electrical configurations of every device on every feeder, and compiling details about each restoration process. == DMS Functions == In order to support proper decision making and O&M activities, DMS solutions should support the following functions: Network visualization & support tools Applications for Analytical & Remedial Action Utility Planning Tools System Protection Schemes The various sub functions of the same, carried out by the DMS are listed below:- === Network Connectivity Analysis (NCA) === Distribution network usually covers over a large area and catering power to different customers at different voltage levels. So locating required sources and loads on a larger GIS/Operator interface is often very difficult. Panning & zooming provided with normal SCADA system GUI does not cover the exact operational requirement. Network connectivity analysis is an operator specific functionality which helps the operator to identify or locate the preferred network or component very easily. NCA does the required analyses and provides display of the feed point of various network loads. Based on the status of all the switching devices such as circuit breaker (CB), Ring Main Unit (RMU) and/or isolators that affect the topology of the network modeled, the prevailing network topology is determined. The NCA further assists the operator to know operating state of the distribution network indicating radial mode, loops and parallels in the network. === Switching Schedule & Safety Management === In territories such as the UK a core function of a DMS has always been to support safe switching and work on the networks. Control engineers prepare switching schedules to isolate and make safe a section of network before work is carried out, and the DMS validates these schedules using its network model. Switching schedules can combine telecontrolled and manual (on-site) switching operations. When the required section has been made safe, the DMS allows a Permit To Work (PTW) document to be issued. After its cancellation when the work has been finished, the switching schedule then facilitates restoration of the normal running arrangements. Switching components can also be tagged to reflect any Operational Restrictions that are in force. The network component/connectivity model, and associated diagrams, must always be kept absolutely up to date. The switching schedule facility therefore also allows 'patches' to the network model to be applied to the live version at the appropriate stage(s) of the jobs. The term 'patch' is derived from the method previously used to maintain the wallboard diagrams. === State Estimation (SE) === The state estimator is an integral part of the overall monitoring and control systems for transmission networks. It is mainly aimed at providing a reliable estimate of the system voltages. This information from the state estimator flows to control centers and database servers across the network. The variables of interest are indicative of parameters like margins to operating limits, health of equipment and required operator action. State estimators allow the calculation of these variables of interest with high confidence despite the facts that the measurements may be corrupted by noise, or could be missing or inaccurate. Even though we may not be able to directly observe the state, it can be inferred from a scan of measurements which are assumed to be synchronized. The algorithms need to allow for the fact that presence of noise might skew the measurements. In a typical power system, the State is quasi-static. The time constants are sufficiently fast so that system dynamics decay away quickly (with respect to measurement frequency). The system appears to be progressing through a sequence of static states that are driven by various parameters like changes in load profile. The inputs of the state estimator can be given to various applications like Load Flow Analysis, Contingency Analysis, and other applications. === Load Flow Applications (LFA) === Load flow study is an important tool involving numerical analysis applied to a power system. The load flow study usually uses simplified notations like a single-line diagram and focuses on various forms of AC power rather than voltage and current. It analyzes the power systems in normal steady-state operation. The goal of a power flow study is to obtain complete voltage angle and magnitude information for each bus in a power system for specified load and generator real power and voltage conditions. Once this

    Read more →
  • Texture filtering

    Texture filtering

    In computer graphics, texture filtering or texture smoothing is the method used to determine the texture color for a texture mapped pixel, using the colors of nearby texels (ie. pixels of the texture). Filtering describes how a texture is applied at many different shapes, size, angles and scales. Depending on the chosen filter algorithm, the result will show varying degrees of blurriness, detail, spatial aliasing, temporal aliasing and blocking. Depending on the circumstances, filtering can be performed in software (such as a software rendering package) or in hardware, eg. with either real time or GPU accelerated rendering circuits, or in a mixture of both. For most common interactive graphical applications, modern texture filtering is performed by dedicated hardware which optimizes memory access through memory cacheing and pre-fetch, and implements a selection of algorithms available to the user and developer. There are two main categories of texture filtering: magnification filtering and minification filtering. Depending on the situation, texture filtering is either a type of reconstruction filter where sparse data is interpolated to fill gaps (magnification), or a type of anti-aliasing (AA) where texture samples exist at a higher frequency than required for the sample frequency needed for texture fill (minification). There are many methods of texture filtering, which make different trade-offs between computational complexity, memory bandwidth and image quality. == The need for filtering == During the texture mapping process for any arbitrary 3D surface, a texture lookup takes place to find out where on the texture each pixel center falls. For texture-mapped polygonal surfaces composed of triangles typical of most surfaces in 3D games and movies, every pixel (or subordinate pixel sample) of that surface will be associated with some triangle(s) and a set of barycentric coordinates, which are used to provide a position within a texture. Such a position may not lie perfectly on the "pixel grid," necessitating some function to account for these cases. In other words, since the textured surface may be at an arbitrary distance and orientation relative to the viewer, one pixel does not usually correspond directly to one texel. Some form of filtering has to be applied to determine the best color for the pixel. Insufficient or incorrect filtering will show up in the image as artifacts (errors in the image), such as 'blockiness', jaggies, or shimmering. There can be different types of correspondence between a pixel and the texel/texels it represents on the screen. These depend on the position of the textured surface relative to the viewer, and different forms of filtering are needed in each case. Given a square texture mapped on to a square surface in the world, at some viewing distance the size of one screen pixel is exactly the same as one texel. Closer than that, the texels are larger than screen pixels, and need to be scaled up appropriately — a process known as texture magnification. Farther away, each texel is smaller than a pixel, and so one pixel covers multiple texels. In this case an appropriate color has to be picked based on the covered texels, via texture minification. Graphics APIs such as OpenGL allow the programmer to set different choices for minification and magnification filters. Note that even in the case where the pixels and texels are exactly the same size, one pixel will not necessarily match up exactly to one texel. It may be misaligned or rotated, and cover parts of up to four neighboring texels. Hence some form of filtering is still required. == Mipmapping == Mipmapping is a standard technique used to save some of the filtering work needed during texture minification. It is also highly beneficial for cache coherency - without it the memory access pattern during sampling from distant textures will exhibit extremely poor locality, adversely affecting performance even if no filtering is performed. During texture magnification, the number of texels that need to be looked up for any pixel is always four or fewer; during minification, however, as the textured polygon moves farther away potentially the entire texture might fall into a single pixel. This would necessitate reading all of its texels and combining their values to correctly determine the pixel color, a prohibitively expensive operation. Mipmapping avoids this by prefiltering the texture and storing it in smaller sizes down to a single pixel. As the textured surface moves farther away, the texture being applied switches to the prefiltered smaller size. Different sizes of the mipmap are referred to as 'levels', with Level 0 being the largest size (used closest to the viewer), and increasing levels used at increasing distances. == Filtering methods == This section lists the most common texture filtering methods, in increasing order of computational cost and image quality. === Nearest-neighbor interpolation === Nearest-neighbor interpolation is the simplest and crudest filtering method — it simply uses the color of the texel closest to the pixel center for the pixel color. While simple, this results in a large number of artifacts - texture 'blockiness' during magnification, and aliasing and shimmering during minification. This method is fast during magnification but during minification the stride through memory becomes arbitrarily large and it can often be less efficient than MIP-mapping due to the lack of spatially coherent texture access and cache-line reuse. === Nearest-neighbor with mipmapping === This method still uses nearest neighbor interpolation, but adds mipmapping — first the nearest mipmap level is chosen according to distance, then the nearest texel center is sampled to get the pixel color. This reduces the aliasing and shimmering significantly during minification but does not eliminate it entirely. In doing so it improves texture memory access and cache-line reuse through avoiding arbitrarily large access strides through texture memory during rasterization. This does not help with blockiness during magnification as each magnified texel will still appear as a large rectangle. === Linear mipmap filtering === Less commonly used, OpenGL and other APIs support nearest-neighbor sampling from individual mipmaps whilst linearly interpolating the two nearest mipmaps relevant to the sample. === Bilinear filtering === In Bilinear filtering, the four nearest texels to the pixel center are sampled (at the closest mipmap level), and their colors are combined by weighted average according to distance. This removes the 'blockiness' seen during magnification, as there is now a smooth gradient of color change from one texel to the next, instead of an abrupt jump as the pixel center crosses the texel boundary. Bilinear filtering for magnification filtering is common. When used for minification it is often used with mipmapping; though it can be used without, it would suffer the same aliasing and shimmering problems as nearest-neighbor filtering when minified too much. For modest minification ratios, however, it can be used as an inexpensive hardware accelerated weighted texture supersample. The Nintendo 64 used an unusual version of bilinear filtering where only three pixels are used known as 3-point texture filtering, instead of four due to hardware optimization concerns. This introduces a noticeable "triangulation bias" in some textures. === Trilinear filtering === Trilinear filtering is a remedy to a common artifact seen in mipmapped bilinearly filtered images: an abrupt and very noticeable change in quality at boundaries where the renderer switches from one mipmap level to the next. Trilinear filtering solves this by doing a texture lookup and bilinear filtering on the two closest mipmap levels (one higher and one lower quality), and then linearly interpolating the results. This results in a smooth degradation of texture quality as distance from the viewer increases, rather than a series of sudden drops. Of course, closer than Level 0 there is only one mipmap level available, and the algorithm reverts to bilinear filtering. === Anisotropic filtering === Anisotropic filtering is the highest quality filtering available in current consumer 3D graphics cards. Simpler, "isotropic" techniques use only square mipmaps which are then interpolated using bi– or trilinear filtering. (Isotropic means same in all directions, and hence is used to describe a system in which all the maps are squares rather than rectangles or other quadrilaterals.) When a surface is at a high angle relative to the camera, the fill area for a texture will not be approximately square. Consider the common case of a floor in a game: the fill area is far wider than it is tall. In this case, none of the square maps are a good fit. The result is blurriness and/or shimmering, depending on how the fit is chosen. Anisotropic filtering corrects this by sampling the texture as a non-square shape. The goal is

    Read more →
  • Automated Lip Reading

    Automated Lip Reading

    Automated Lip Reading (ALR) is a software technology developed by speech recognition expert Frank Hubner. A video image of a person talking can be analysed by the software. The shapes made by the lips can be examined and then turned into sounds. The sounds are compared to a dictionary to create matches to the words being spoken. The technology was used successfully to analyse silent home movie footage of Adolf Hitler taken by Eva Braun at their Bavarian retreat Berghof. The video, with words, was included in a documentary titled "Hitler's Private World", Revealed Studios, 2006 Source: New Technology catches Hitler off guard

    Read more →
  • Pulse-coupled networks

    Pulse-coupled networks

    Pulse-coupled networks or pulse-coupled neural networks (PCNNs) are neural models proposed by modeling a cat's visual cortex, and developed for high-performance biomimetic image processing. In 1989, Eckhorn introduced a neural model to emulate the mechanism of cat's visual cortex. The Eckhorn model provided a simple and effective tool for studying small mammal’s visual cortex, and was soon recognized as having significant application potential in image processing. In 1994, Johnson adapted the Eckhorn model to an image processing algorithm, calling this algorithm a pulse-coupled neural network. The basic property of the Eckhorn's linking-field model (LFM) is the coupling term. LFM is a modulation of the primary input by a biased offset factor driven by the linking input. These drive a threshold variable that decays from an initial high value. When the threshold drops below zero it is reset to a high value and the process starts over. This is different than the standard integrate-and-fire neural model, which accumulates the input until it passes an upper limit and effectively "shorts out" to cause the pulse. LFM uses this difference to sustain pulse bursts, something the standard model does not do on a single neuron level. It is valuable to understand, however, that a detailed analysis of the standard model must include a shunting term, due to the floating voltages level in the dendritic compartment(s), and in turn this causes an elegant multiple modulation effect that enables a true higher-order network (HON). A PCNN is a two-dimensional neural network. Each neuron in the network corresponds to one pixel in an input image, receiving its corresponding pixel's color information (e.g. intensity) as an external stimulus. Each neuron also connects with its neighboring neurons, receiving local stimuli from them. The external and local stimuli are combined in an internal activation system, which accumulates the stimuli until it exceeds a dynamic threshold, resulting in a pulse output. Through iterative computation, PCNN neurons produce temporal series of pulse outputs. The temporal series of pulse outputs contain information of input images and can be used for various image processing applications, such as image segmentation and feature generation. Compared with conventional image processing means, PCNNs have several significant merits, including robustness against noise, independence of geometric variations in input patterns, capability of bridging minor intensity variations in input patterns, etc. A simplified PCNN called a spiking cortical model was developed in 2009. == Applications == PCNNs are useful for image processing, as discussed in a book by Thomas Lindblad and Jason M. Kinser. PCNNs have been used in a variety of image processing applications, including: image segmentation, pattern recognition, feature generation, face extraction, motion detection, region growing, image denoising and image enhancement Multidimensional pulse image processing of chemical structure data using PCNN has been discussed by Kinser, et al. They have also been applied to an all pairs shortest path problem.

    Read more →
  • Mathematical morphology

    Mathematical morphology

    Mathematical morphology (MM) is a theory and technique for analyzing and processing geometrical structures. It's based on set theory, lattice theory, topology, and random functions. MM is most commonly applied to digital images, but it can be employed as well on graphs, surface meshes, solids, and many other spatial structures. Topological and geometrical continuous-space concepts such as size, shape, convexity, connectivity, and geodesic distance, were introduced by MM on both continuous and discrete spaces. MM is also the foundation of morphological image processing, which consists of a set of operators that transform images according to the above characterizations. The basic morphological operators are erosion, dilation, opening and closing. MM was originally developed for binary images, and was later extended to grayscale functions and images. The subsequent generalization to complete lattices is widely accepted today as MM's theoretical foundation. == History == Mathematical Morphology was developed in 1964 by the collaborative work of Georges Matheron and Jean Serra, at the École des Mines de Paris, France. Matheron supervised the PhD thesis of Serra, devoted to the quantification of mineral characteristics from thin cross sections, and this work resulted in a novel practical approach, as well as theoretical advancements in integral geometry and topology. In 1968, the Centre de Morphologie Mathématique was founded by the École des Mines de Paris in Fontainebleau, France, led by Matheron and Serra. During the rest of the 1960s and most of the 1970s, MM dealt essentially with binary images, treated as sets, and generated a large number of binary operators and techniques: Hit-or-miss transform, dilation, erosion, opening, closing, granulometry, thinning, skeletonization, ultimate erosion, conditional bisector, and others. A random approach was also developed, based on novel image models. Most of the work in that period was developed in Fontainebleau. From the mid-1970s to mid-1980s, MM was generalized to grayscale functions and images as well. Besides extending the main concepts (such as dilation, erosion, etc.) to functions, this generalization yielded new operators, such as morphological gradients, top-hat transform and the Watershed (MM's main segmentation approach). In the 1980s and 1990s, MM gained a wider recognition, as research centers in several countries began to adopt and investigate the method. MM started to be applied to a large number of imaging problems and applications, especially in the field of non-linear filtering of noisy images. In 1986, Serra further generalized MM, this time to a theoretical framework based on complete lattices. This generalization brought flexibility to the theory, enabling its application to a much larger number of structures, including color images, video, graphs, meshes, etc. At the same time, Matheron and Serra also formulated a theory for morphological filtering, based on the new lattice framework. The 1990s and 2000s also saw further theoretical advancements, including the concepts of connections and levelings. In 1993, the first International Symposium on Mathematical Morphology (ISMM) took place in Barcelona, Spain. Since then, ISMMs are organized every 2–3 years: Fontainebleau, France (1994); Atlanta, USA (1996); Amsterdam, Netherlands (1998); Palo Alto, CA, USA (2000); Sydney, Australia (2002); Paris, France (2005); Rio de Janeiro, Brazil (2007); Groningen, Netherlands (2009); Intra (Verbania), Italy (2011); Uppsala, Sweden (2013); Reykjavík, Iceland (2015); Fontainebleau, France (2017); and Saarbrücken, Germany (2019). =

    Read more →
  • Cyber attribution

    Cyber attribution

    In the area of computer security, cyber attribution is an attribution of cybercrime, i.e., finding who perpetrated a cyberattack. Uncovering a perpetrator may give insights into various security issues, such as infiltration methods, communication channels, etc., and may help in enacting specific countermeasures. Cyber attribution is a costly endeavor requiring considerable resources and expertise in cyber forensic analysis. For governments and other major players dealing with cybercrime would require not only technical solutions, but legal and political ones as well, and for the latter ones cyber attribution is crucial. Attributing a cyberattack is difficult, and of limited interest to companies that are targeted by cyberattacks. In contrast, secret services often have a compelling interest in finding out whether a state is behind the attack. A further challenge in attribution of cyberattacks is the possibility of a false flag attack, where the actual perpetrator makes it appear that someone else caused the attack. Every stage of the attack may leave artifacts, such as entries in log files, that can be used to help determine the attacker's goals and identity. In the aftermath of an attack, investigators often begin by saving as many artifacts as they can find, and then try to determine the attacker.

    Read more →
  • NER model

    NER model

    NER is one of several formulas for accessing live subtitles in television broadcasts and events that are produced using speech recognition. The three letters stand for number, edit error and recognition error. It has been promoted as an alternative to Word error rate (Word Error Rate) which is a more objective measure. The overall score is calculated as follows: Firstly, the number of edit and recognition errors is deducted from the total number of words in the live subtitles. This number is then divided by the total number of words in the live subtitles and finally multiplied by one hundred. N E R v a l u e = N − E − R N ∗ 100 {\displaystyle NERvalue={\frac {N-E-R}{N}}100} . The acronyms stand for the following: N (number) = total number of words in the live subtitles E (Edit error) = edit error R (Recognition error) = recognition error This measurement process has been used for public television broadcasts in European countries like Italy and Switzerland. One major drawback with NER is that it requires a human assessor to rate errors as either: 1 Minor edition or recognition errors 2 Normal edition or recognition errors 3 Serious errors which are then weighted in the assessment process. This is both subjective, time consuming and costly. Also, NER fails to account for words left out subtitles which is something that does not take account of the D/deaf audience who want verbatim subtitles. As a result, NER cannot accurately reflect the audience's experience of subtitles. Another problem is the inconsistency of human evaluation of subtitles, particularly with live subtitles, where there are differing opinions of the importance of subtitle errors. By way of contrast, Word error rate is an objective measure of subtitle errors, since it measures the textual discrepancy between the subtitles and the speech.

    Read more →
  • Customer support

    Customer support

    Customer support is a range of services to assist customers in making cost effective and correct use of a product. It includes assistance in planning, installation, training, troubleshooting, maintenance, upgrading, and disposal of a product. Regarding technology products such as mobile phones, televisions, computers, software products or other electronic or mechanical goods, it is termed technical support. It aims to ensure users can effectively operate the product and resolve any issues that may arise throughout its lifecycle. Support is delivered through various channels, including telephone, email, live chat, self-service knowledge bases, and social media. Research indicates that most customers attempt to resolve issues through self-service before contacting a representative. For products sold across multiple regions, support may be provided in several languages, as consumers tend to prefer assistance in their native language. Requirements for customer contact centres are defined in international standards such as ISO 18295.

    Read more →
  • Voice activity detection

    Voice activity detection

    Voice activity detection (VAD), also known as speech activity detection or speech detection, is the detection of the presence or absence of human speech, used in speech processing. The main uses of VAD are in speaker diarization, speech coding and speech recognition. It can facilitate speech processing, and can also be used to deactivate some processes during non-speech section of an audio session: it can avoid unnecessary coding/transmission of silence packets in Voice over Internet Protocol (VoIP) applications, saving on computation and on network bandwidth. VAD is an important enabling technology for a variety of speech-based applications. Therefore, various VAD algorithms have been developed that provide varying features and compromises between latency, sensitivity, accuracy and computational cost. Some VAD algorithms also provide further analysis, for example whether the speech is voiced, unvoiced or sustained. Voice activity detection is usually independent of language. It was first investigated for use on time-assignment speech interpolation (TASI) systems. == Algorithm overview == The typical design of a VAD algorithm is as follows: There may first be a noise reduction stage, e.g. via spectral subtraction. Then some features or quantities are calculated from a section of the input signal. A classification rule is applied to classify the section as speech or non-speech – often this classification rule finds when a value exceeds a certain threshold. There may be some feedback in this sequence, in which the VAD decision is used to improve the noise estimate in the noise reduction stage, or to adaptively vary the threshold(s). These feedback operations improve the VAD performance in non-stationary noise (i.e. when the noise varies a lot). A representative set of recently published VAD methods formulates the decision rule on a frame by frame basis using instantaneous measures of the divergence distance between speech and noise. The different measures which are used in VAD methods include spectral slope, correlation coefficients, log likelihood ratio, cepstral, weighted cepstral, and modified distance measures. Independently from the choice of VAD algorithm, a compromise must be made between having voice detected as noise, or noise detected as voice (between false positive and false negative). A VAD operating in a mobile phone must be able to detect speech in the presence of a range of very diverse types of acoustic background noise. In these difficult detection conditions it is often preferable that a VAD should fail-safe, indicating speech detected when the decision is in doubt, to lower the chance of losing speech segments. The biggest difficulty in the detection of speech in this environment is the very low signal-to-noise ratios (SNRs) that are encountered. It may be impossible to distinguish between speech and noise using simple level detection techniques when parts of the speech utterance are buried below the noise. == Applications == VAD is an integral part of different speech communication systems such as audio conferencing, echo cancellation, speech recognition, speech encoding, speaker recognition and hands-free telephony. In the field of multimedia applications, VAD allows simultaneous voice and data applications. Similarly, in Universal Mobile Telecommunications Systems (UMTS), it controls and reduces the average bit rate and enhances overall coding quality of speech. In cellular radio systems (for instance GSM and CDMA systems) based on Discontinuous Transmission (DTX) mode, VAD is essential for enhancing system capacity by reducing co-channel interference and power consumption in portable digital devices. In speech processing applications, voice activity detection plays an important role since non-speech frames are often discarded. For a wide range of applications such as digital mobile radio, Digital Simultaneous Voice and Data (DSVD) or speech storage, it is desirable to provide a discontinuous transmission of speech-coding parameters. Advantages can include lower average power consumption in mobile handsets, higher average bit rate for simultaneous services like data transmission, or a higher capacity on storage chips. However, the improvement depends mainly on the percentage of pauses during speech and the reliability of the VAD used to detect these intervals. On the one hand, it is advantageous to have a low percentage of speech activity. On the other hand, clipping, that is the loss of milliseconds of active speech, should be minimized to preserve quality. This is the crucial problem for a VAD algorithm under heavy noise conditions. === Use in telemarketing === One controversial application of VAD is in conjunction with predictive dialers used by telemarketing firms. In order to maximize agent productivity, telemarketing firms set up predictive dialers to call more numbers than they have agents available, knowing most calls will end up in either "Ring – No Answer" or answering machines. When a person answers, they typically speak briefly ("Hello", "Good evening", etc.) and then there is a brief period of silence. Answering machine messages are usually 3–15 seconds of continuous speech. By setting VAD parameters correctly, dialers can determine whether a person or a machine answered the call and, if it's a person, transfer the call to an available agent. If it detects an answering machine message, the dialer hangs up. Often, even when the system correctly detects a person answering the call, no agent may be available, resulting in a "silent call". Call screening with a multi-second message like "please say who you are, and I may pick up the phone" will frustrate such automated calls. == Performance evaluation == To evaluate a VAD, its output using test recordings is compared with those of an "ideal" VAD – created by hand-annotating the presence or absence of voice in the recordings. The performance of a VAD is commonly evaluated on the basis of the following four parameters: FEC (Front End Clipping): clipping introduced in passing from noise to speech activity; MSC (Mid Speech Clipping): clipping due to speech misclassified as noise; OVER: noise interpreted as speech due to the VAD flag remaining active in passing from speech activity to noise; NDS (Noise Detected as Speech): noise interpreted as speech within a silence period. Although the method described above provides useful objective information concerning the performance of a VAD, it is only an approximate measure of the subjective effect. For example, the effects of speech signal clipping can at times be hidden by the presence of background noise, depending on the model chosen for the comfort noise synthesis, so some of the clipping measured with objective tests is in reality not audible. It is therefore important to carry out subjective tests on VADs, the main aim of which is to ensure that the clipping perceived is acceptable. In VoIP applications, front-end clipping can be reduced by rewinding to shortly before the detection and sending very slightly delayed data. This kind of test requires a certain number of listeners to judge recordings containing the processing results of the VADs being tested, giving marks to several speech sequences on the following features: Quality; Comprehension difficulty; Audibility of clipping. These marks are then used to calculate average results for each of the features listed above, thus providing a global estimate of the behavior of the VAD being tested. To conclude, whereas objective methods are very useful in an initial stage to evaluate the quality of a VAD, subjective methods are more significant. As they require the participation of several people for a few days, increasing cost, they are generally only used when a proposal is about to be standardized. == Implementations == One early standard VAD is that developed by British Telecom for use in the Pan-European digital cellular mobile telephone service in 1991. It uses inverse filtering trained on non-speech segments to filter out background noise, so that it can then more reliably use a simple power-threshold to decide if a voice is present. The G.729 standard calculates the following features for its VAD: line spectral frequencies, full-band energy, low-band energy (<1 kHz), and zero-crossing rate. It applies a simple classification using a fixed decision boundary in the space defined by these features, and then applies smoothing and adaptive correction to improve the estimate. The GSM standard includes two VAD options developed by ETSI. Option 1 computes the SNR in nine bands and applies a threshold to these values. Option 2 calculates different parameters: channel power, voice metrics, and noise power. It then thresholds the voice metrics using a threshold that varies according to the estimated SNR. The Speex audio compression library uses a procedure named Improved Minima Controlled Recursive Averaging, which uses a smoothed representation of spectral pow

    Read more →
  • List of data science software

    List of data science software

    This is a list of data science software and platforms used in data science, which includes programming languages, programming environments, machine learning frameworks, data engineering tools, statistical software, data analysis, plotting, MLOps systems, and more. == Programming languages == == Development environments == These interactive notebooks, IDEs, and platforms provide specialised development environments. Apache Zeppelin Architect — Eclipse (software) CoCalc Dataiku Data Science Studio FreeMat GNU Octave Google Colab DataSpell Jupyter Notebook / JupyterLab Kaggle Notebooks MATLAB O-Matrix PyCharm RStudio SAS (software) and SAS Studio Spyder Visual Studio Code == Machine and deep learning software == The Machine learning / deep learning tools support development in those fields. == Data engineering == Examples of Data engineering tools. Apache Airflow Apache Flink Apache Hadoop Apache Kafka Apache NiFi Apache Spark Dask Data build tool (dbt) == Data mining == Examples of Data mining tools. === Free and open-source === === Proprietary === == Database management == === List of RDBMS === ==== Proprietary ==== == Data warehouses == Data warehouse environments include: Amazon Redshift Snowflake Google BigQuery Microsoft Azure Synapse Teradata Vertica == Data lakes == Data lake environments include: Apache Hadoop Cloudera Databricks Delta Lake Amazon S3 Google Cloud Storage Azure Data Lake == Algorithms == Apriori algorithm – frequent itemset mining and association rule learning in market basket analysis Backpropagation – algorithm for training artificial neural networks using gradient descent Decision Trees – tree-based algorithm for classification and regression Expectation–maximization algorithm – iterative procedure for maximum likelihood estimation with latent variables Gradient descent – iterative optimization algorithm for minimizing a loss function ID3 algorithm – used to generate a decision tree from a dataset K-Means – clustering algorithm based on minimizing within-cluster distances K-Nearest Neighbors (KNN) – instance-based learning and classification method Linear regression – estimation method for predicting a dependent variable based on independent variables Logistic regression – classification algorithm for predicting a binary outcome Naive Bayes – probabilistic classifier based on Bayes' theorem Ordinary least squares – estimation method for parameters in linear regression PageRank – graph-based algorithm for link analysis and search ranking Principal component analysis – technique to reduce high-dimensional data while preserving variance Q-learning – reinforcement learning algorithm for learning optimal actions Random forest – ensemble of decision trees for improved classification or regression Sequential minimal optimization – solver for training support vector machines Stochastic gradient descent – randomized variant of gradient descent for large-scale machine learning Support Vector Machines (SVM) – algorithm for finding a hyperplane to separate classes == Statistical software == === Open-source === === Public domain === CSPro Dataplot Epi Map X-13ARIMA-SEATS === Freeware === BV4.1 MINUIT WinBUGS Winpepi === Proprietary === == Data processing == Tools for Data processing and analysis: == Data and information visualization == Software for Data visualization: == Plotting software == Software for plotting data to support processing and visualise results. == Maps and geospatial visualization == ArcGIS Carto Epi Map GeoDA Google Earth Engine Leaflet Mapbox MountainsMap QGIS == Machine learning == MLOps and model deployment: BentoML Data Version Control (DVC) Kubeflow MLflow Seldon Core Streamlit TensorFlow Serving Weights & Biases == Data repositories == Kaggle – platform for data science competitions, datasets, and notebooks. OpenML – collaborative platform for sharing datasets, algorithms, and experiments. University of California, Irvine Machine Learning Repository Zenodo – open-access repository supported by CERN and the EU. == Educational data science software == Kaggle – online platform for data science education, competitions, datasets, and collaborative learning. KNIME – open-source data analytics platform used for teaching data science, machine learning, and workflow-based analysis. RapidMiner – used in academic research and education for data mining and machine learning. Statistics Online Computational Resource (SOCR) – online tools and instructional resources for statistics education. Tanagra (machine learning) – data mining software developed for research and teaching purposes. TinkerPlots – explore and analyze data through visual modeling.

    Read more →
  • Lenna

    Lenna

    Lenna (or Lena) is a standard test image used in the field of digital image processing, starting in 1973. It is a picture of the Swedish model Lena Forsén, shot by photographer Dwight Hooker and cropped from the centerfold of the November 1972 issue of Playboy magazine. Lenna has attracted controversy because of its subject matter. Starting in the mid-2010s, many journals have deemed it inappropriate and discouraged its use, while others have banned it from publication outright. Forsén herself has called for it to be retired, saying "It's time I retired from tech." The spelling "Lenna" came from the model's desire to encourage the proper pronunciation of her name. "I didn't want to be called Leena [English: ]," she explained. == History == Before Lenna, the first use of a Playboy magazine image to illustrate image processing algorithms was in 1961. Lawrence G. Roberts used two cropped six-bit grayscale facsimile scanned images from Playboy's July 1960 issue featuring Playmate Teddi Smith, in his master's thesis on image dithering at Massachusetts Institute of Technology. Lenna was originally intended for high resolution color image processing study. Its history was described in the May 2001 newsletter of the IEEE Professional Communication Society, in an article by Jamie Hutchinson: Alexander Sawchuk estimates that it was in June or July of 1973 when he, then an assistant professor of electrical engineering at the University of Southern California Signal and Image Processing Institute (SIPI), along with a graduate student and the SIPI lab manager, was hurriedly searching the lab for a good image to scan for a colleague's conference paper. They got tired of their stock of usual test images, dull stuff dating back to television standards work in the early 1960s. They wanted something glossy to ensure good output dynamic range, and they wanted a human face. Just then, somebody happened to walk in with a recent issue of Playboy. The engineers tore away the top third of the centerfold so they could wrap it around the drum of their Muirhead wirephoto scanner, which they had outfitted with analog-to-digital converters (one each for the red, green, and blue channels) and a Hewlett Packard 2100 minicomputer. The Muirhead had a fixed resolution of 100 lines per inch and the engineers wanted a 512×512 image, so they limited the scan to the top 5.12 inches of the picture, effectively cropping it at the subject's shoulders. The image's reach was limited in the 1970s and 80s, which is reflected in it initially only appearing in .org domains, but in July 1991, the image featured on the cover of Optical Engineering alongside Peppers, another popular test image. This drew the attention of Playboy to the potential copyright infringement. The peak of image hits on the internet was in 1995. The scan became one of the most used images in computer history. The use of the photo in electronic imaging has been described as "clearly one of the most important events in [its] history". The image spread to over 100 different domains, particularly .com and .edu. In a 1999 issue of IEEE Transactions on Image Processing "Lena" was used in three separate articles, and the picture continued to appear in scientific journals throughout the beginning of the 21st century. Lenna is so widely accepted in the image processing community that Forsén was a guest at the 50th annual Conference of the Society for Imaging Science and Technology (IS&T) in 1997. In 2015, Lena Forsén was also guest of honor at the banquet of IEEE ICIP 2015. After delivering a speech, she chaired the best paper award ceremony. To explain why the image became a standard in the field, David C. Munson, editor-in-chief of IEEE Transactions on Image Processing, stated that it was a good test image because of its detail, flat regions, shading, and texture. He also noted that "the Lena image is a picture of an attractive woman. It is not surprising that the (mostly male) image processing research community gravitated toward an image that they found attractive." While Playboy often cracks down on illegal uses of its material and did initially send a notice to the publisher of Optical Engineering about its unauthorized use in that publication, over time it has decided to overlook the wide use of Lena. Eileen Kent, VP of new media at Playboy, said, "We decided we should exploit this, because it is a phenomenon." == Criticism == The use of the image has produced controversy because Playboy is "seen (by some) as being degrading to women". In a 1999 essay on reasons for the male predominance in computer science, applied mathematician Dianne P. O'Leary wrote: Suggestive pictures used in lectures on image processing ... convey the message that the lecturer caters to the males only. For example, it is amazing that the "Lena" pin-up image is still used as an example in courses and published as a test image in journals today. A 2012 paper on compressed sensing used a photo of the model Fabio Lanzoni as a test image to draw attention to this issue. The use of the test image at the magnet school Thomas Jefferson High School for Science and Technology in Fairfax County, Virginia, provoked a guest editorial by a senior in The Washington Post in 2015 about its detrimental impact on aspiring female students in computer science. In 2017, the Journal of Modern Optics published an editorial titled "On alternatives to Lenna" suggesting three images (Pirate, Cameraman, and Peppers) that "are reasonably close to Lenna in feature space". In 2018, the Nature Nanotechnology journal announced that they would no longer consider articles using Lenna. In the same year SPIE, the publishers of Optical Engineering, also announced that they "strongly discourage" the use of Lenna, and would no longer consider new submissions containing the image "without convincing scientific justification for its use". They noted that aside from the copyright and ethical issues, that it was also no longer useful as a standard image: "In today's age of high-resolution digital image technology, it seems difficult to argue that a 512 × 512 image produced with a 1970s-era analog scanner is the best we have to offer as an image quality test standard". Forsén stated in the 2019 documentary film Losing Lena, "I retired from modeling a long time ago. It's time I retired from tech, too... Let's commit to losing me." The Institute of Electrical and Electronics Engineers (IEEE) announced that, starting April 1, 2024, it will no longer allow use of Lenna in its publications.

    Read more →
  • Deluxe Paint Animation

    Deluxe Paint Animation

    DeluxePaint Animation is a 1990 graphics editor and animation creation package for MS-DOS, based on Deluxe Paint for the Amiga. It was adapted by Brent Iverson with additional animation features by Steve Shaw and released by Electronic Arts. The program requires VGA graphics, MS-DOS 2.1 or higher, and a mouse. == Features == Listed from the back of the box. Complete selection of painting tools — Draw any shape you want, any way you want. Turn any image into a brush. You can rotate, flip, shear, resize, smear, and shade it. 7 levels of magnification — Paint in magnified mode if you want. Use variable zoom for detailed editing at the pixel level. 3-D perspective — Move and rotate images in full 3-D, automatically. Use color cycling and gradient fills to create great special effects. Stencils — Protect your designs from the slip of the hand or a bad idea. A stencil masks your image so you can paint "behind" and "in front of" it. Use the handy Move Dialog to animate brushes in full 3-D — automatically! Ideal for creating spinning titles for low-cost videos. 37 multi-sized fonts

    Read more →