Graph cut optimization

Graph cut optimization is a combinatorial optimization method applicable to a family of functions of discrete variables, named after the concept of cut in the theory of flow networks. Thanks to the max-flow min-cut theorem, determining the minimum cut over a graph representing a flow network is equivalent to computing the maximum flow over the network. Given a pseudo-Boolean function $f$, if it is possible to construct a flow network with positive weights such that each cut $C$ of the network can be mapped to an assignment of variables $\mathbf{x}$ to $f$ (and vice versa), and the cost of $C$ equals $f(\mathbf{x})$ (up to an additive constant), then it is possible to find the global optimum of $f$ in polynomial time by computing a minimum cut of the graph. The mapping between cuts and variable assignments is done by representing each variable with one node in the graph: given a cut, each variable has the value 0 if the corresponding node belongs to the component connected to the source, or 1 if it belongs to the component connected to the sink. Not all pseudo-Boolean functions can be represented by a flow network, and in the general case the global optimization problem is NP-hard. There exist sufficient conditions to characterise families of functions that can be optimised through graph cuts, such as submodular quadratic functions. Graph cut optimization can be extended to functions of discrete variables with a finite number of values, which can be approached with iterative algorithms with strong optimality properties, computing one graph cut at each iteration. Graph cut optimization is an important tool for inference over graphical models such as Markov random fields or conditional random fields, and it has applications in computer vision problems such as image segmentation, denoising, registration and stereo matching.
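The mapping between cuts and assignments described above can be illustrated with a minimal sketch for a submodular quadratic function. Everything in it is an assumption made for illustration, not part of the article: the function names `minimizeByGraphCut` and `evaluate`, the input layout (`unary[i][x_i]` and pairwise tables `w[x_i][x_j]`), the particular way each pairwise term is split into linear terms plus one edge, and the use of the Edmonds-Karp algorithm on a dense capacity matrix to compute the maximum flow. Nodes left on the source side of the minimum cut receive the value 0, all others the value 1.

```javascript
// Evaluate f(x) = sum_i unary[i][x_i] + sum_(i,j) w[x_i][x_j] for one assignment.
function evaluate(unary, pairs, x) {
  let total = 0;
  x.forEach((xi, i) => { total += unary[i][xi]; });
  for (const { i, j, w } of pairs) total += w[x[i]][x[j]];
  return total;
}

// Minimise f over {0,1}^n with a single minimum cut (illustrative sketch).
function minimizeByGraphCut(n, unary, pairs) {
  const s = n, t = n + 1, size = n + 2;
  const cap = Array.from({ length: size }, () => new Array(size).fill(0));
  let constant = 0;
  const linear = new Array(n).fill(0); // cost paid when x_i = 1

  for (let i = 0; i < n; i++) {
    constant += unary[i][0];
    linear[i] += unary[i][1] - unary[i][0];
  }
  for (const { i, j, w } of pairs) {
    const a = w[0][0], b = w[0][1], c = w[1][0], d = w[1][1];
    const e = b + c - a - d; // non-negative exactly when the term is submodular
    if (e < 0) throw new Error(`term (${i},${j}) is not submodular`);
    // w(x_i,x_j) = a + (c-a) x_i + (d-c) x_j + e (1-x_i) x_j
    constant += a;
    linear[i] += c - a;
    linear[j] += d - c;
    cap[i][j] += e; // cut when node i is on the source side and node j on the sink side
  }
  for (let i = 0; i < n; i++) {
    if (linear[i] >= 0) {
      cap[s][i] += linear[i]; // cut (paid) when x_i = 1
    } else {
      constant += linear[i]; // rewrite c x_i as c + (-c)(1 - x_i)
      cap[i][t] += -linear[i]; // cut (paid) when x_i = 0
    }
  }

  // Edmonds-Karp: repeatedly augment along a shortest residual path.
  let flow = 0;
  for (;;) {
    const parent = new Array(size).fill(-1);
    parent[s] = s;
    const queue = [s];
    while (queue.length > 0 && parent[t] === -1) {
      const u = queue.shift();
      for (let v = 0; v < size; v++) {
        if (parent[v] === -1 && cap[u][v] > 0) { parent[v] = u; queue.push(v); }
      }
    }
    if (parent[t] === -1) {
      // No augmenting path left: nodes reached from the source form S (value 0).
      const x = [];
      for (let i = 0; i < n; i++) x.push(parent[i] === -1 ? 1 : 0);
      return { value: constant + flow, x };
    }
    let bottleneck = Infinity;
    for (let v = t; v !== s; v = parent[v]) bottleneck = Math.min(bottleneck, cap[parent[v]][v]);
    for (let v = t; v !== s; v = parent[v]) {
      cap[parent[v]][v] -= bottleneck;
      cap[v][parent[v]] += bottleneck;
    }
    flow += bottleneck;
  }
}

// Three variables in a chain with a smoothness penalty of 2 between neighbours.
const unary = [[0, 4], [3, 0], [2, 2]];
const smooth = [[0, 2], [2, 0]];
const pairs = [{ i: 0, j: 1, w: smooth }, { i: 1, j: 2, w: smooth }];
const result = minimizeByGraphCut(3, unary, pairs);
console.log(result); // { value: 4, x: [ 0, 1, 1 ] }
```

The dense capacity matrix keeps the sketch short; for graphs of realistic size (for example one node per image pixel) an adjacency-list representation and a specialised max-flow algorithm would be the more usual choice.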
== Representability == A pseudo-Boolean function $f:\{0,1\}^{n}\to \mathbb{R}$ is said to be representable if there exists a graph $G=(V,E)$ with non-negative weights and with source and sink nodes $s$ and $t$ respectively, and there exists a set of nodes $V_{0}=\{v_{1},\dots ,v_{n}\}\subset V-\{s,t\}$ such that, for each tuple of values $(x_{1},\dots ,x_{n})\in \{0,1\}^{n}$ assigned to the variables, $f(x_{1},\dots ,x_{n})$ equals (up to a constant) the value of the flow determined by a minimum cut $C=(S,T)$ of the graph $G$ such that $v_{i}\in S$ if $x_{i}=0$ and $v_{i}\in T$ if $x_{i}=1$. It is possible to classify pseudo-Boolean functions according to their order, determined by the maximum number of variables contributing to each single term. All first-order functions, where each term depends upon at most one variable, are representable. Quadratic functions

$$f(\mathbf{x})=w_{0}+\sum_{i}w_{i}(x_{i})+\sum_{i<j}w_{ij}(x_{i},x_{j})$$

[…] $p>0$ then

$$w_{ijk}(x_{i},x_{j},x_{k})=w_{ijk}(0,0,0)+p_{1}(x_{i}-1)+p_{2}(x_{j}-1)+p_{3}(x_{k}-1)+p_{23}(x_{j}-1)x_{k}+p_{31}x_{i}(x_{k}-1)+p_{12}(x_{i}-1)x_{j}-p\,x_{i}x_{j}x_{k}$$

with

$$\begin{aligned}
p_{1}&=w_{ijk}(1,0,1)-w_{ijk}(0,0,1)\\
p_{2}&=w_{ijk}(1,1,0)-w_{ijk}(1,0,1)\\
p_{3}&=w_{ijk}(0,1,1)-w_{ijk}(0,1,0)\\
p_{23}&=w_{ijk}(0,0,1)+w_{ijk}(0,1,0)-w_{ijk}(0,0,0)-w_{ijk}(0,1,1)\\
p_{31}&=w_{ijk}(0,0,1)+w_{ijk}(1,0,0)-w_{ijk}(0,0,0)-w_{ijk}(1,0,1)\\
p_{12}&=w_{ijk}(0,1,0)+w_{ijk}(1,0,0)-w_{ijk}(0,0,0)-w_{ijk}(1,1,0).
\end{aligned}$$

Scene statistics

Scene statistics is a discipline within the field of perception. It is concerned with the statistical regularities related to scenes. It is based on the premise that a perceptual system is designed to interpret scenes. Biological perceptual systems have evolved in response to physical properties of natural environments; natural scenes therefore receive a great deal of attention. Natural scene statistics are useful for defining the behavior of an ideal observer in a natural task, typically by incorporating signal detection theory, information theory or estimation theory. == Within-domain versus across-domain == Geisler (2008) distinguishes between four kinds of domains: (1) physical environments, (2) images/scenes, (3) neural responses and (4) behavior. Within the domain of images/scenes one can study the characteristics of information related to redundancy and efficient coding. Across-domain statistics determine how an autonomous system should make inferences about its environment, process information and control its behavior. To study these statistics it is necessary to sample or register information in multiple domains simultaneously. == Applications == === Prediction of picture and video quality === One of the most successful applications of natural scene statistics models has been perceptual picture and video quality prediction. For example, the Visual Information Fidelity (VIF) algorithm, which measures the degree of distortion of pictures and videos, is used extensively by the image and video processing communities to assess perceptual quality, often after processing such as compression, which can degrade the appearance of a visual signal. The premise is that the scene statistics are changed by distortion and that the visual system is sensitive to the changes in the scene statistics. VIF is heavily used in the streaming television industry.
Other popular picture quality models that use natural scene statistics include BRISQUE and NIQE, both of which are no-reference since they do not require any reference picture to measure quality against.

Glossary of operating systems terms

This page is a glossary of operating systems terminology.

== A ==
access token: In Microsoft Windows operating systems, an access token contains the security credentials for a login session and identifies the user, the user's groups, the user's privileges, and, in some cases, a particular application.

== B ==
binary semaphore: See semaphore.
booting: In computing, booting (also known as booting up) is the initial set of operations that a computer performs after electrical power is switched on or when the computer is reset. This can take tens of seconds and typically involves performing a power-on self-test, locating and initializing peripheral devices, and then finding, loading and starting the operating system.

== C ==
cache: In computer science, a cache is a component that transparently stores data so that future requests for that data can be served faster. The data stored within a cache might be values that have been computed earlier or duplicates of original values that are stored elsewhere.
cloud: Cloud computing operating systems are recent, and were not mentioned in Gagne's 8th Edition (2009). In contrast, by Gagne's 9th (2012), cloud o/s received 3 pages of coverage (41, 42, 716). Doeppner (2011) mentions them (p. 3), but only to show that operating systems "are not a solved problem" and that even if the day of the dedicated PC is waning, cloud computing has created an entirely new opportunity for o/s development à la sharing, networks, memory, parallelism, etc. Gagne (2012) adds that in addition to numerous traditional o/s's at cloud warehouses, virtual machine o/s (VMMs), Eucalyptus, VMware, vCloud Director and others are being developed specifically for cloud management with numerous traditional o/s features (security, threads, file and memory management, GUIs, etc.) (p. 42). Microsoft's investment in cloud aspects of o/s tends to support that argument.
concurrency

== D ==
daemon: Operating systems often start daemons at boot time; they serve the function of responding to network requests, hardware activity, or other programs by performing some task. Daemons can also configure hardware (like udevd on some Linux systems), run scheduled tasks (like cron), and perform a variety of other tasks.

== E ==
== F ==
== G ==
== H ==
== I ==
== J ==

== K ==
kernel: In computing, the kernel is a computer program that manages input/output requests from software and translates them into data processing instructions for the central processing unit and other electronic components of a computer. The kernel is a fundamental part of a modern computer's operating system.

== L ==
lock: In computer science, a lock or mutex (from mutual exclusion) is a synchronization mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. A lock is designed to enforce a mutual exclusion concurrency control policy.

== M ==
mutual exclusion: Mutual exclusion allows only one process at a time to access the same critical section (a part of the code that accesses a critical resource). This helps prevent race conditions.
mutex: See lock.

== N ==
== O ==

== P ==
paging daemon: See daemon.
process

== Q ==
== R ==

== S ==
semaphore: In computer science, particularly in operating systems, a semaphore is a variable or abstract data type that is used for controlling access, by multiple processes, to a common resource in a parallel programming or multi-user environment.

== T ==
thread: In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by an operating system scheduler. The scheduler itself is a light-weight process. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process.
templating: In an o/s context, templating refers to creating a single virtual machine image as a guest operating system, then saving it as a tool for multiple running virtual machines (Gagne, 2012, p. 716). The technique is used both in virtualization and cloud computing management, and is common in large server warehouses.

== U ==
== V ==
== W ==
== Z ==

Feature detection (web development)

Feature detection (also feature testing) is a technique used in web development for handling differences between runtime environments (typically web browsers or user agents), by programmatically testing for clues that the environment may or may not offer certain functionality. This information is then used to make the application adapt in some way to suit the environment: to make use of certain APIs, or tailor for a better user experience. Its proponents claim it is more reliable and future-proof than other techniques like user agent sniffing and browser-specific CSS hacks. == Techniques == A feature test can take many forms. It is essentially any snippet of code which gives some level of confidence that a required feature is indeed supported. However, in contrast to other techniques, feature detection usually focuses on performing actions which directly relate to the feature to be detected, rather than heuristics. === JavaScript === JavaScript feature detection can inspect the DOM and the local JavaScript environment to test whether browser features or APIs are supported. The simplest technique is to check for the existence of a relevant object or property. For example, the Geolocation API (used for accessing the device's knowledge of its geographical location, possibly obtained from a GPS navigation device) exposes a geolocation property on the navigator object in the DOM, the presence of which implies the Geolocation API is supported:

    if ('geolocation' in navigator) {
      // Geolocation API is supported
    }

For a higher level of confidence, some feature tests will attempt to invoke the feature and then look for clues that it behaved properly. For example, a test for support for cookies might attempt to set a value as a cookie and then verify it can be read back. === CSS === In CSS, the at-rule @supports, introduced in 2015, allows testing whether a given feature is supported. For instance, the following code activates the declarations only if the user agent supports display: flex:

    @supports (display: flex) {
      /* declarations that rely on flexbox */
    }

== Undetectables == Some browser features are considered undetectable, because no clues are known to give sufficient confidence that a feature is supported. This is often because of the limited information available to the JavaScript environment in the browser; generally features must be exposed via the DOM in some way in order to be detectable using JavaScript. When undetectables are encountered, it is common to turn to user agent sniffing as an alternative mechanism, or to employ defensive coding to minimise the impact if the feature turns out not to be supported. The Modernizr project maintains a record of known undetectables on its wiki.
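The existence check and the behavioural cookie test described in the JavaScript section can be sketched as small helper functions. This is only an illustrative sketch: the helper names (`hasGeolocation`, `cookiesWork`, `supportsCss`) and the cookie name `featuretest` are hypothetical, and each helper takes the host object as a parameter (an assumption made so the logic can be exercised outside a browser). The last helper relies on `CSS.supports(property, value)`, the script-side counterpart of the `@supports` at-rule.

```javascript
// Existence check: assume the API is available if the property is present.
function hasGeolocation(nav) {
  return typeof nav === 'object' && nav !== null && 'geolocation' in nav;
}

// Behavioural check: write a cookie, then verify that it can be read back.
function cookiesWork(doc) {
  try {
    doc.cookie = 'featuretest=1';
    const readBack = String(doc.cookie).includes('featuretest=1');
    // Clean up by expiring the test cookie.
    doc.cookie = 'featuretest=1; expires=Thu, 01 Jan 1970 00:00:00 GMT';
    return readBack;
  } catch (err) {
    return false; // access to cookies was refused outright
  }
}

// CSS check from script; `cssApi` stands for the browser's global CSS object.
function supportsCss(cssApi, property, value) {
  if (cssApi == null || typeof cssApi.supports !== 'function') return false;
  return cssApi.supports(property, value) === true;
}

// In a browser the real host objects would be passed in, for example:
//   hasGeolocation(navigator)
//   cookiesWork(document)
//   supportsCss(CSS, 'display', 'flex')
```

Passing the host objects explicitly also makes it straightforward to substitute stand-ins when testing the detection logic itself.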

Digital cassettes

Digital audio cassette formats introduced to the professional audio and consumer markets: Digital Audio Tape (or DAT) is the best known, and had some success as an audio storage format among professionals and "prosumers" before the prices of hard drive and solid-state flash memory-based digital recording devices dropped in the late 1990s. Hard-drive recording has mostly made DAT obsolete, as hard disk recorders offer more editing versatility than tape, and easier importation into digital audio workstations (DAWs) and non-linear video editing (NLE) systems. Digital Compact Cassette was intended as a digital replacement for the mass-market analog cassette tape, but received very little attention or adoption. Its failure is generally attributed to higher production costs than audio CDs, durability and indifferent reception by consumers. Digital video cassettes include: Betacam IMX (Sony); D-VHS (JVC); D1 (Sony); D2 (Sony); D3; D5 HD; Digital-S D9 (JVC); Digital Betacam (Sony); Digital8 (Sony); DV; HDV; ProHD (JVC); MiniDV; MicroMV. == Analog cassettes used as digital data storage == Historically, the compact audio cassette, which was originally designed for analog storage of music, was used as an alternative to disk drives in the late 1970s and early 1980s to provide data storage for home computers. There are a number of unique and incompatible cassette tape data storage formats that all use the same analog compact audio cassette tape media. The ADAT system uses Super VHS tapes to record 8 synchronized digital audio tracks at once. There have also been several audio recording systems that used VHS video recorders as storage devices and video tape transports, generally by encoding the digital data to be recorded into an analog composite video signal (which resembles static) and then recording this to magnetic tape.
These systems were often used as "mixdown" recorders, to record the finished mix from a multi-track recorder in preparation for the manufacture of a vinyl record, cassette tape, or CD. An example was the Dbx Model 700. Another example is the Sony PCM adaptor series. Several companies sold VHS backup solutions in the 1980s and 1990s in which data was converted to a video image that was then saved onto a VHS tape. Examples include the Corvus "Mirror" (U.S. patent 4380047A), the Metrum Model 64 on S-VHS tape, the Danmere Backer tape backup system, the Alpha Microsystems Videotrax, the Legacy Storage Systems International VAST (Variable Array Storage), the ArVid and the Video Backup System Amiga. The S2 VLBI system at three NASA Deep Space Network complexes and over 20 other radio telescopes stores digital data on S-VHS tapes.

Adobe Presenter

Adobe Presenter is eLearning software released by Adobe Systems, available on the Microsoft Windows platform as a Microsoft PowerPoint plug-in, and on both Windows and OS X as the screencasting and video editing tool Adobe Presenter Video Express. It is mainly targeted towards learning professionals and trainers. In addition to recording one's computer desktop and speech, it also provides the option to add quizzes and track performance by integrating with learning management systems. Adobe Presenter was designed to replace the discontinued Adobe Ovation software, which had similar functions. == Predecessor == Adobe Ovation was originally released by Serious Magic. It converted PowerPoint slides into visual presentations with additional effects. Ovation included themes called PowerLooks that could add motion and polish to presentations. They were available in a variety of color variations complete with animated backgrounds and dynamic text effects. Ovation could make text with jagged edges more readable. TimeKeeper could be used to set the period of the presentation, and the PointPrompter scrolled down the notes. Ovation's development has been discontinued, and it does not support PowerPoint 2007. == Features == The main purpose of Adobe Presenter is to capture on-screen presentations and convert them into more interactive and engaging videos. It supports converting Microsoft PowerPoint 2010 and 2013 presentations into videos. It also allows for content authoring on PowerPoint and ActionScript 3, and offers integration with Adobe Captivate. Slide branching enables users to control slide navigation and titles and to create complex branching that guides viewers through the content of the presentation. Video editing tools are also provided, and offer the ability to upload to video-sharing platforms such as YouTube, Vimeo and other sites.
Multimedia features such as annotations, eLearning templates, actors, audio narration and drag-and-drop elements enrich users' presentations. Quizzes and surveys are another highlighted feature, which includes generating question pools, importing questions from existing quizzes, and in-course collaboration, which lets presenters receive feedback by allowing viewers to comment on specific content within a course or ask questions for more clarity. Presenters can opt to receive feedback from viewers through video analytics and create Experience API, SCORM and AICC-compliant content. Options to publish to Adobe Connect are provided. Other features include universal standards support, file size control and navigational restrictions.

T.38

T.38 is an ITU recommendation for allowing transmission of fax over IP networks (FoIP) in real time. == History == The T.38 fax relay standard was devised in 1998 as a way to transport faxes across IP networks between existing Group 3 (G3) fax terminals. T.4 and related fax standards were published by the ITU in 1980, before the rise of the Internet. In the late 1990s, VoIP, or voice over IP, began to gain ground as an alternative to the conventional public switched telephone network (PSTN). However, because most VoIP systems are optimized (through their use of aggressive lossy bandwidth-saving compression) for voice rather than data calls, conventional fax machines worked poorly or not at all on them due to network impairments such as delay, jitter and packet loss. Thus, some way of transmitting fax over IP was needed. == Overview == In practical scenarios, a T.38 fax call has at least part of the call carried over the PSTN, although this is not required by the T.38 definition, and two T.38 devices can send faxes to each other directly. Such a device is called an Internet-Aware Fax device, or IAF, and it is capable of initiating or completing a fax call towards the IP network. The typical scenario in which T.38 is used is T.38 fax relay, where a T.30 fax device sends a fax over the PSTN to a T.38 fax gateway, which converts or encapsulates the T.30 protocol into a T.38 data stream. This is then sent either to a T.38-enabled end point such as a fax machine or fax server, or to another T.38 gateway that converts it back to a PSTN PCM or analog signal and terminates the fax on a T.30 device. The T.38 recommendation defines the use of both TCP and UDP to transport T.38 packets. Implementations tend to use UDP, because TCP's requirement for acknowledgement packets and the resulting retransmission during packet loss introduce delays. When using UDP, T.38 copes with packet loss by using redundant data packets.
T.38 is not a call setup protocol, thus the T.38 devices need to use standard call setup protocols to negotiate the T.38 call, e.g. H.323, SIP & MGCP. == Operation == There are two primary ways that fax transactions are conveyed across packet networks. The T.37 standard specifies how a fax image is encapsulated in e-mail and transported, ultimately, to the recipient using a store-and-forward process through intermediary entities. T.38, however, defines a protocol that supports the use of the T.30 protocol in both the sender and recipient terminals. (See diagram above.) T.38 lets one transmit a fax across an IP network in real time, just as the original G3 fax standards did for the traditional (time-division multiplexed (TDM)) network, also called the public switched telephone network or PSTN. A special protocol is needed for real-time fax over IP (Internet Protocol) since existing fax terminals only supported PSTN connections, where the information flow was generally smooth and uninterrupted, as opposed to the jittery arrival of IP packets. The trick was to come up with a protocol that makes the IP network “invisible” to the endpoint fax terminals, which would mean the user of a legacy fax terminal need not know that the fax call was traversing an IP network. The network interconnections supported by T.38 are shown above. The two fax terminals on either side of the figure communicate using the T.30 fax protocol published by the ITU in 1980. Interconnection of the PSTN with the IP packet network requires a “gateway” between the PSTN and IP networks. PSTN-IP Gateways support TDM voice on the PSTN side and VoIP and FoIP on the packet side. For voice sessions, the gateway will take in voice packets on the IP side, accumulate a few packets to ensure a smooth flow of TDM data upon their release, and then meter them out over TDM where they eventually are heard by a human or stored on a computer for later playback. 
The gateway employs packet-management techniques to enhance the quality of the speech in the presence of network errors by taking advantage of the natural ability of a listener to not really hear the occasional missing or repeated packet. But facsimile data are transmitted by modems, which aren't as forgiving as the human ear is for speech. Missing packets will often cause a fax session to fail at worst or create one or more image lines in error at best. So the job of T.38 is to “fool” the terminal into “thinking” that it's communicating directly with another T.30 terminal. It will also correct for network delays with so-called spoofing techniques, and missing or delayed packets with fax-aware buffer-management techniques. Spoofing refers to the logic implemented in the protocol engine of a T.38 relay that modifies the protocol commands and responses on the TDM side to keep network delays on the IP side from causing the transaction to fail. This is done, for example, by padding image lines or deliberately causing a message to be re-transmitted to render network delays transparent to the sending/receiving fax terminals. Networks that do not have packet loss or excessive delay can exhibit acceptable fax performance without T.38, provided the PCM clocks in all gateways are of very high accuracy (explained below). T.38 not only removes the effect of PCM clocks not being synchronized, but also reduces the required network bandwidth by a factor of 10, while it corrects for packet loss and delay. === Bandwidth reduction === As shown in the diagram below, a T.38 gateway is composed of two primary elements: the fax modems and the T.38 subsystem. The fax modems modulate and demodulate the PCM samples of the analog data, turning the sampled-data representation of the fax terminal's analog signal to its binary translation, and vice versa. 
The PSTN network samples the analog signal of a voice or modem call (it doesn't know the difference) 8,000 times per second, and encodes each sample as an 8-bit data byte. This means 8,000 samples per second times 8 bits per sample, or 64,000 bits per second (bit/s), to represent the modem (or voice) data in one direction. For both directions the modem transaction consumes 128,000 bit/s of network bandwidth. However, the typical modem in a fax terminal transmits the image data at 33,600 bit/s, so if the analog data are first converted to the digital content they represent, only 33,600 bit/s (plus network overhead of a few bytes) are needed. And since T.30 fax is a half-duplex protocol, the network is only needed for one direction at a time. Refer to RFC 3261. === PCM clock synchronization === In the diagram above, there is a sample-rate clock in the fax terminal and one in the gateway's modems that is used to trigger the sampling of the analog line 8,000 times per second. These clocks are usually quite accurate, but in some low-cost terminal adapters (a one or two-line gateway) the PCM clock can be surprisingly inaccurate. If the terminal is sending data to the gateway, and the gateway's clock is too slow, the buffers (jitter buffers) in the gateway will eventually overflow, causing the transaction to fail. Since the difference is often quite small, this problem tends to occur on long, detailed fax images, which give the clocks more time to drift until the jitter buffer in the gateway either underflows or overflows; the effect is the same as that of missing or duplicated packets. === Packet loss === T.38 provides facilities to eliminate the effects of packet loss through data redundancy. Each packet that is sent may also repeat zero, one, two, three, or even more of the previously sent packets. (The specification does not impose a limit.)
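The redundancy idea just described can be illustrated with a small model. This is a sketch under stated assumptions and not the T.38 packet encoding: the function names `buildDatagrams` and `reconstruct`, the object layout, and the string payloads are all hypothetical. Each datagram carries its own payload plus copies of up to `redundancy` earlier payloads, and the receiver fills in the sequence from whichever datagrams arrive.

```javascript
// Sender side: datagram number `seq` carries payload `seq` and up to
// `redundancy` of the payloads that were sent immediately before it.
function buildDatagrams(payloads, redundancy) {
  return payloads.map((payload, seq) => {
    const copies = [];
    for (let k = 0; k <= redundancy && seq - k >= 0; k++) {
      copies.push({ seq: seq - k, payload: payloads[seq - k] });
    }
    return { seq, copies };
  });
}

// Receiver side: rebuild the payload sequence from the datagrams that arrived.
// A null entry marks a payload that no surviving datagram carried.
function reconstruct(received, total) {
  const out = new Array(total).fill(null);
  for (const datagram of received) {
    for (const { seq, payload } of datagram.copies) {
      if (out[seq] === null) out[seq] = payload;
    }
  }
  return out;
}

// Five payloads, one redundant copy each; datagrams 1 and 3 are lost in transit.
const payloads = ['p0', 'p1', 'p2', 'p3', 'p4'];
const sent = buildDatagrams(payloads, 1);
const arrived = sent.filter((d) => d.seq !== 1 && d.seq !== 3);
console.log(reconstruct(arrived, payloads.length)); // [ 'p0', 'p1', 'p2', 'p3', 'p4' ]
```

In this model, a redundancy level of r recovers from any run of up to r consecutive lost datagrams, provided a later datagram arrives; a longer run leaves a gap.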
This redundancy increases the network bandwidth required (it's still much less than not using T.38), but it allows the receiving gateway to reconstruct the complete packet sequence, even with a fairly high level of packet loss. == Related standards == T.4 is the umbrella specification for fax. It specifies the standard image sizes, two forms of image-data compression (encoding) and the image-data format, and references T.30 and the various modem standards. T.6 specifies a compression scheme that reduces the time required to transmit an image by roughly 50 percent. T.30 specifies the procedures that a sending and receiving terminal use to set up a fax call, determine the image size, encoding, and transfer speed, the demarcation between pages, and the termination of the call. T.30 also references the various modem standards. V.21, V.27ter, V.29, V.17, V.34: ITU modem standards used in facsimile. The first three were ratified prior to 1980, and were specified in the original T.4 and T.30 standards. V.34 was published for fax in 1994. T.37: the ITU standard for sending a fax-image file via e-mail to the intended recipient of a fax. G.711 pass-through: the T.30 fax call is carried in a VoIP call encoded as audio. This is sensitive to network packet loss, jitter and clock synchronization. When using voice high-compression encoding techniques such as, but not limited to, G.729, some fax tonal signa