Image tracing

Image tracing

In computer graphics, image tracing, raster-to-vector conversion or raster vectorization is the conversion of raster graphics into vector graphics. == Background == An image does not have any structure: it is just a collection of marks on paper, grains in film, or pixels in a bitmap. While such an image is useful, it has some limits. If the image is magnified enough, its artifacts appear. The halftone dots, film grains, and pixels become apparent. Images of sharp edges become fuzzy or jagged. See, for example, pixelation. Ideally, a vector image does not have the same problem. Edges and filled areas are represented as mathematical curves or gradients, and they can be magnified arbitrarily (though of course the final image must also be rasterized in to be rendered, and its quality depends on the quality of the rasterization algorithm for the given inputs). The task in vectorization is to convert a two-dimensional image into a two-dimensional vector representation of the image. It is not examining the image and attempting to recognize or extract a three-dimensional model that may be depicted; i.e. it is not a vision system. For most applications, vectorization also does not involve optical character recognition; characters are treated as lines, curves, or filled objects without attaching any significance to them. In vectorization, the shape of the character is preserved, so artistic embellishments remain. Vectorization is the inverse operation corresponding to rasterization, as integration is to differentiation. And, just as with these other operations, while rasterization is fairly straightforward and algorithmic, vectorization involves the reconstruction of lost information and therefore requires heuristic methods. Synthetic images such as maps, cartoons, logos, clip art, and technical drawings are suitable for vectorization. Those images could have been originally made as vector images because they are based on geometric shapes or drawn with simple curves. Continuous tone photographs (such as live portraits) are not good candidates for vectorization. The input to vectorization is an image, but an image may come in many forms such as a photograph, a drawing on paper, or one of several raster file formats. Programs that do raster-to-vector conversion may accept bitmap formats such as TIFF, BMP and PNG. The output is a vector file format. Common vector formats are SVG, DXF, EPS, EMF and AI. Vectorization can be used to update images or recover work. Personal computers often come with a simple paint program that produces a bitmap output file. These programs allow users to make simple illustrations by adding text, drawing outlines, and filling outlines with a specific color. Only the results of these operations (the pixels) are saved in the resulting bitmap; the drawing and filling operations are discarded. Vectorization can be used to recapture some of the information that was lost. Vectorization is also used to recover information that was originally in a vector format but has been lost or has become unavailable. A company may have commissioned a logo from a graphic arts firm. Although the graphics firm used a vector format, the client company may not have received a copy of that format. The company may then acquire a vector format by scanning and vectorizing a paper copy of the logo. == Process == Vectorization starts with an image. === Manual === The image can be vectorized manually. A person could look at the image, make some measurements, and then write the output file by hand. That was the case for the vectorization of a technical illustration about neutrinos. The illustration has a few geometric shapes and a lot of text; it was relatively easy to convert the shapes, and the SVG vector format allows the text (even subscripts and superscripts) to be entered easily. The original image did not have any curves (except for the text), so the conversion is straightforward. Curves make the conversion more complicated. Manual vectorization of complicated shapes can be facilitated by the tracing function built into some vector graphics editing programs. If the image is not yet in machine readable form, then it has to be scanned into a usable file format. Once there is a machine-readable bitmap, the image can be imported into a graphics editing program (such as Adobe Illustrator, CorelDRAW, or Inkscape). Then a person can manually trace the elements of the image using the program's editing features. Curves in the original image can be approximated with lines, arcs, and Bézier curves. An illustration program allows spline knots to be adjusted for a close fit. Manual vectorization is possible, but it can be tedious. Although graphics drawing programs have been around for a long time, artists may find the freehand drawing facilities awkward even when a drawing tablet is used. Instead of using a program, Pepper recommends making an initial sketch on paper. Instead of scanning the sketch and tracing it freehand in the computer, Pepper states: "Those proficient with a graphic tablet and stylus could make the following changes directly in CorelDRAW by using a scan of the sketch as an underlay and drawing over it. I prefer to use pen and ink, and a light table"; most of the final image was traced by hand in ink. Later the line-drawing image was scanned at 600 dpi, cleaned up in a paint program, and then automatically traced with a program. Once the black and white image was in the graphics program, some other elements were added and the figure was colored. Similarly, Ploch recreated a design from a digital photograph. The JPEG was imported and some "basic shapes" were traced by hand and colored in the graphics drawing program; more complex shapes were handled differently. Ploch used a bitmap editor to remove the background and crop the more complex image components. He then printed the image and traced it by hand onto tracing paper to get a clean black and white line drawing. That drawing was scanned and then vectorized with a program. === Automatic === Some programs automate the vectorization process. Example programs are Adobe Illustrator, Inkscape, Corel's PowerTRACE, and Potrace. Some of these programs have a command line interface while others are interactive that allow the user to adjust the conversion settings and view the result. Adobe Streamline is not only an interactive program, but it also allows a user to manually edit the input bitmap and the output curves. Corel's PowerTRACE is accessed through CorelDRAW; CorelDRAW can be used to modify the input bitmap and edit the output curves. Adobe Illustrator has a facility to trace individual curves. Automated programs can have mixed results. A program (PowerTRACE) was used to convert a PNG map to SVG. The program did a good job on the map boundaries (the most tedious task in the tracing) and the settings dropped out all the text (small objects). The text was manually re-inserted. Other conversions may not go as well. The results depend on having high-quality scans, reasonable settings, and good algorithms. Scanned images often have a lot of noise, which can require additional work to clean up. == Options == There are many different image styles and possibilities, and no single vectorization method works well on all images. Consequently, vectorization programs have many options that influence the result. One issue is what the predominant shapes are. If the image is of a fill-in form, then it will probably have just vertical and horizontal lines of a constant width. The program's vectorization should take that into account. On the other hand, a CAD drawing may have lines at any angle, there may be curved lines, and there may be several line weights (thick for objects and thin for dimension lines). Instead of (or in addition to) curves, the image may contain outlines filled with the same color. Adobe Streamline allows users to select a combination of line recognition (horizontal and vertical lines), centerline recognition, or outline recognition. Streamline also allows small outline shapes to be thrown out; the notion is such small shapes are noise. The user may set the noise level between 0 and 1000; an outline that has fewer pixels than that setting is discarded. Another issue is the number of colors in the image. Even images that were created as black on white drawings may end up with many shades of gray. Some line-drawing routines employ anti-aliasing; a pixel completely covered by the line will be black, but a pixel that is only partially covered will be gray. If the original image is on paper and is scanned, there is a similar result: edge pixels will be gray. Sometimes images are compressed (e.g., JPEG images), and the compression will introduce gray levels. Many of the vectorization programs will group same-color pixels into lines, curves, or outlined shapes. If each possible color is grouped into its object, there can be an enormous number of objects. Instead, the user is asked to s

Shape context

Shape context is a feature descriptor used in object recognition. Serge Belongie and Jitendra Malik proposed the term in their paper "Matching with Shape Contexts" in 2000. == Theory == The shape context is intended to be a way of describing shapes that allows for measuring shape similarity and the recovering of point correspondences. The basic idea is to pick n points on the contours of a shape. For each point pi on the shape, consider the n − 1 vectors obtained by connecting pi to all other points. The set of all these vectors is a rich description of the shape localized at that point but is far too detailed. The key idea is that the distribution over relative positions is a robust, compact, and highly discriminative descriptor. So, for the point pi, the coarse histogram of the relative coordinates of the remaining n − 1 points, h i ( k ) = # { q ≠ p i : ( q − p i ) ∈ bin ( k ) } {\displaystyle h_{i}(k)=\#\{q\neq p_{i}:(q-p_{i})\in {\mbox{bin}}(k)\}} is defined to be the shape context of p i {\displaystyle p_{i}} . The bins are normally taken to be uniform in log-polar space. The fact that the shape context is a rich and discriminative descriptor can be seen in the figure below, in which the shape contexts of two different versions of the letter "A" are shown. (a) and (b) are the sampled edge points of the two shapes. (c) is the diagram of the log-polar bins used to compute the shape context. (d) is the shape context for the point marked with a circle in (a), (e) is that for the point marked as a diamond in (b), and (f) is that for the triangle. As can be seen, since (d) and (e) are the shape contexts for two closely related points, they are quite similar, while the shape context in (f) is very different. For a feature descriptor to be useful, it needs to have certain invariances. In particular it needs to be invariant to translation, scaling, small perturbations, and, depending on the application, rotation. Translational invariance comes naturally to shape context. Scale invariance is obtained by normalizing all radial distances by the mean distance α {\displaystyle \alpha } between all the point pairs in the shape although the median distance can also be used. Shape contexts are empirically demonstrated to be robust to deformations, noise, and outliers using synthetic point set matching experiments. One can provide complete rotational invariance in shape contexts. One way is to measure angles at each point relative to the direction of the tangent at that point (since the points are chosen on edges). This results in a completely rotationally invariant descriptor. But of course this is not always desired since some local features lose their discriminative power if not measured relative to the same frame. Many applications in fact forbid rotational invariance e.g. distinguishing a "6" from a "9". == Use in shape matching == A complete system that uses shape contexts for shape matching consists of the following steps (which will be covered in more detail in the Details of Implementation section): Randomly select a set of points that lie on the edges of a known shape and another set of points on an unknown shape. Compute the shape context of each point found in step 1. Match each point from the known shape to a point on an unknown shape. To minimize the cost of matching, first choose a transformation (e.g. affine, thin plate spline, etc.) that warps the edges of the known shape to the unknown (essentially aligning the two shapes). Then select the point on the unknown shape that most closely corresponds to each warped point on the known shape. Calculate the "shape distance" between each pair of points on the two shapes. Use a weighted sum of the shape context distance, the image appearance distance, and the bending energy (a measure of how much transformation is required to bring the two shapes into alignment). To identify the unknown shape, use a nearest-neighbor classifier to compare its shape distance to shape distances of known objects. == Details of implementation == === Step 1: Finding a list of points on shape edges === The approach assumes that the shape of an object is essentially captured by a finite subset of the points on the internal or external contours on the object. These can be simply obtained using the Canny edge detector and picking a random set of points from the edges. Note that these points need not and in general do not correspond to key-points such as maxima of curvature or inflection points. It is preferable to sample the shape with roughly uniform spacing, though it is not critical. === Step 2: Computing the shape context === This step is described in detail in the Theory section. === Step 3: Computing the cost matrix === Consider two points p and q that have normalized K-bin histograms (i.e. shape contexts) g(k) and h(k). As shape contexts are distributions represented as histograms, it is natural to use the χ2 test statistic as the "shape context cost" of matching the two points: C S = 1 2 ∑ k = 1 K [ g ( k ) − h ( k ) ] 2 g ( k ) + h ( k ) {\displaystyle C_{S}={\frac {1}{2}}\sum _{k=1}^{K}{\frac {[g(k)-h(k)]^{2}}{g(k)+h(k)}}} The values of this range from 0 to 1. In addition to the shape context cost, an extra cost based on the appearance can be added. For instance, it could be a measure of tangent angle dissimilarity (particularly useful in digit recognition): C A = 1 2 ‖ ( cos ⁡ ( θ 1 ) sin ⁡ ( θ 1 ) ) − ( cos ⁡ ( θ 2 ) sin ⁡ ( θ 2 ) ) ‖ {\displaystyle C_{A}={\frac {1}{2}}{\begin{Vmatrix}{\dbinom {\cos(\theta _{1})}{\sin(\theta _{1})}}-{\dbinom {\cos(\theta _{2})}{\sin(\theta _{2})}}\end{Vmatrix}}} This is half the length of the chord in unit circle between the unit vectors with angles θ 1 {\displaystyle \theta _{1}} and θ 2 {\displaystyle \theta _{2}} . Its values also range from 0 to 1. Now the total cost of matching the two points could be a weighted-sum of the two costs: C = ( 1 − β ) C S + β C A {\displaystyle C=(1-\beta )C_{S}+\beta C_{A}\!\,} Now for each point pi on the first shape and a point qj on the second shape, calculate the cost as described and call it Ci,j. This is the cost matrix. === Step 4: Finding the matching that minimizes total cost === Now, a one-to-one matching π ( i ) {\displaystyle \pi (i)} that matches each point pi on shape 1 and qj on shape 2 that minimizes the total cost of matching, H ( π ) = ∑ i C ( p i , q π ( i ) ) {\displaystyle H(\pi )=\sum _{i}C\left(p_{i},q_{\pi (i)}\right)} is needed. This can be done in O ( N 3 ) {\displaystyle O(N^{3})} time using the Hungarian method, although there are more efficient algorithms. To have robust handling of outliers, one can add "dummy" nodes that have a constant but reasonably large cost of matching to the cost matrix. This would cause the matching algorithm to match outliers to a "dummy" if there is no real match. === Step 5: Modeling transformation === Given the set of correspondences between a finite set of points on the two shapes, a transformation T : R 2 → R 2 {\displaystyle T:\mathbb {R} ^{2}\to \mathbb {R} ^{2}} can be estimated to map any point from one shape to the other. There are several choices for this transformation, described below. ==== Affine ==== The affine model is a standard choice: T ( p ) = A p + o {\displaystyle T(p)=Ap+o\!} . The least squares solution for the matrix A {\displaystyle A} and the translational offset vector o is obtained by: o = 1 n ∑ i = 1 n ( p i − q π ( i ) ) , A = ( Q + P ) t {\displaystyle o={\frac {1}{n}}\sum _{i=1}^{n}\left(p_{i}-q_{\pi (i)}\right),A=(Q^{+}P)^{t}} Where P = ( 1 p 11 p 12 ⋮ ⋮ ⋮ 1 p n 1 p n 2 ) {\displaystyle P={\begin{pmatrix}1&p_{11}&p_{12}\\\vdots &\vdots &\vdots \\1&p_{n1}&p_{n2}\end{pmatrix}}} with a similar expression for Q {\displaystyle Q\!} . Q + {\displaystyle Q^{+}\!} is the pseudoinverse of Q {\displaystyle Q\!} . ==== Thin plate spline ==== The thin plate spline (TPS) model is the most widely used model for transformations when working with shape contexts. A 2D transformation can be separated into two TPS function to model a coordinate transform: T ( x , y ) = ( f x ( x , y ) , f y ( x , y ) ) {\displaystyle T(x,y)=\left(f_{x}(x,y),f_{y}(x,y)\right)} where each of the ƒx and ƒy have the form: f ( x , y ) = a 1 + a x x + a y y + ∑ i = 1 n ω i U ( ‖ ( x i , y i ) − ( x , y ) ‖ ) , {\displaystyle f(x,y)=a_{1}+a_{x}x+a_{y}y+\sum _{i=1}^{n}\omega _{i}U\left({\begin{Vmatrix}(x_{i},y_{i})-(x,y)\end{Vmatrix}}\right),} and the kernel function U ( r ) {\displaystyle U(r)\!} is defined by U ( r ) = r 2 log ⁡ r 2 {\displaystyle U(r)=r^{2}\log r^{2}\!} . The exact details of how to solve for the parameters can be found elsewhere but it essentially involves solving a linear system of equations. The bending energy (a measure of how much transformation is needed to align the points) will also be easily obtained. ==== Regularized TPS ==== The TPS formulation above has exact matching requirement for the pairs of points on the two shapes. For noisy data, it is best to

Automation integrator

An automation integrator is a systems integrator company or individual who makes different versions of automation hardware and software work together, generally combining several subsystems to work together as one large system. The title may refer to those who only integrate hardware, although these will often work with software integrators. Software created by automation integrators allows devices to communicate with each other, as well as collecting and reporting data. The magazine Control Engineering publishes an annual “Automation Integrator Guide” which lists over 2,000 automation integrators. They also give an annual system integrator of the year award to three automation integration firms. The Control System Integrators Association (CSIA) maintains a buyers' guide of over 1200 member and nonmember systems integrators known as the Industrial Automation Exchange, or CSIA Exchange for short. == Certification == The Control System Integrators Association (CSIA) certifies automation integrators, through an audit based on 79 critical criteria from the best practices manual. Companies must be associate members of the CSIA to be eligible for certification. Integrators can also receive certification through a program launched in 2012 by the Robotics Industries Association. == Industries == Automation Integrators work in a wide variety of industries which use robotics and automation. Some of the most common include:

Comparison of raster graphics editors

Raster graphics editors can be compared by many variables, including availability. == List == == General information == Basic general information about the editor: creator, company, license, etc. == Operating system support == The operating systems on which the editors can run natively, that is, without emulation, virtual machines or compatibility layers. In other words, the software must be specifically coded for the operation system; for example, Adobe Photoshop for Windows running on Linux with Wine does not fit. == Features == == Color spaces == == File support ==

PNGOUT

PNGOUT is a freeware command line optimizer for PNG images written by Ken Silverman. The transformation is lossless, meaning that the resulting image is visually identical to the source image. According to its author, this program can often get higher compression than other optimizers by 5–10%. It is possible to compress some inflated PNGs to a size below 1% of the original file. PNGOUT was also available as a plug-in for the freeware image viewer IrfanView and can be enabled as an option when saving files. It allows editing of various PNGOUT settings via a dialog box. PNGOUT integration was removed in IrfanView version 4.58 in favour of OptiPNG. In 2006, a commercial version of PNGOUT with a graphical user interface, known as PNGOUTWin, was released by Ardfry Imaging, a small company Silverman co-founded in 2005. There is also a freeware GUI frontend to PNGOUT available, known as PNGGauntlet. == Main operation == The main function of PNGOUT is to reduce the size of image data contained in the IDAT chunk. This chunk is compressed using the deflate algorithm. Deflate algorithms can vary in speed and compression ratio, with higher compression ratios generally implying lower speed. Ken Silverman wrote a deflate compressor for PNGOUT that is slower than the ones used in most graphics software, but produces smaller files. PNGOUT also performs automatic bit depth, color, and palette reduction where appropriate.

Multi-scale approaches

The scale space representation of a signal obtained by Gaussian smoothing satisfies a number of special properties, scale-space axioms, which make it into a special form of multi-scale representation. There are, however, also other types of "multi-scale approaches" in the areas of computer vision, image processing and signal processing, in particular the notion of wavelets. The purpose of this article is to describe a few of these approaches: == Scale-space theory for one-dimensional signals == For one-dimensional signals, there exists quite a well-developed theory for continuous and discrete kernels that guarantee that new local extrema or zero-crossings cannot be created by a convolution operation. For continuous signals, it holds that all scale-space kernels can be decomposed into the following sets of primitive smoothing kernels: the Gaussian kernel : g ( x , t ) = 1 2 π t exp ⁡ ( − x 2 / 2 t ) {\displaystyle g(x,t)={\frac {1}{\sqrt {2\pi t}}}\exp({-x^{2}/2t})} where t > 0 {\displaystyle t>0} , truncated exponential kernels (filters with one real pole in the s-plane): h ( x ) = exp ⁡ ( − a x ) {\displaystyle h(x)=\exp({-ax})} if x ≥ 0 {\displaystyle x\geq 0} and 0 otherwise where a > 0 {\displaystyle a>0} h ( x ) = exp ⁡ ( b x ) {\displaystyle h(x)=\exp({bx})} if x ≤ 0 {\displaystyle x\leq 0} and 0 otherwise where b > 0 {\displaystyle b>0} , translations, rescalings. For discrete signals, we can, up to trivial translations and rescalings, decompose any discrete scale-space kernel into the following primitive operations: the discrete Gaussian kernel T ( n , t ) = I n ( α t ) {\displaystyle T(n,t)=I_{n}(\alpha t)} where α , t > 0 {\displaystyle \alpha ,t>0} where I n {\displaystyle I_{n}} are the modified Bessel functions of integer order, generalized binomial kernels corresponding to linear smoothing of the form f o u t ( x ) = p f i n ( x ) + q f i n ( x − 1 ) {\displaystyle f_{out}(x)=pf_{in}(x)+qf_{in}(x-1)} where p , q > 0 {\displaystyle p,q>0} f o u t ( x ) = p f i n ( x ) + q f i n ( x + 1 ) {\displaystyle f_{out}(x)=pf_{in}(x)+qf_{in}(x+1)} where p , q > 0 {\displaystyle p,q>0} , first-order recursive filters corresponding to linear smoothing of the form f o u t ( x ) = f i n ( x ) + α f o u t ( x − 1 ) {\displaystyle f_{out}(x)=f_{in}(x)+\alpha f_{out}(x-1)} where α > 0 {\displaystyle \alpha >0} f o u t ( x ) = f i n ( x ) + β f o u t ( x + 1 ) {\displaystyle f_{out}(x)=f_{in}(x)+\beta f_{out}(x+1)} where β > 0 {\displaystyle \beta >0} , the one-sided Poisson kernel p ( n , t ) = e − t t n n ! {\displaystyle p(n,t)=e^{-t}{\frac {t^{n}}{n!}}} for n ≥ 0 {\displaystyle n\geq 0} where t ≥ 0 {\displaystyle t\geq 0} p ( n , t ) = e − t t − n ( − n ) ! {\displaystyle p(n,t)=e^{-t}{\frac {t^{-n}}{(-n)!}}} for n ≤ 0 {\displaystyle n\leq 0} where t ≥ 0 {\displaystyle t\geq 0} . From this classification, it is apparent that we require a continuous semi-group structure, there are only three classes of scale-space kernels with a continuous scale parameter; the Gaussian kernel which forms the scale-space of continuous signals, the discrete Gaussian kernel which forms the scale-space of discrete signals and the time-causal Poisson kernel that forms a temporal scale-space over discrete time. If we on the other hand sacrifice the continuous semi-group structure, there are more options: For discrete signals, the use of generalized binomial kernels provides a formal basis for defining the smoothing operation in a pyramid. For temporal data, the one-sided truncated exponential kernels and the first-order recursive filters provide a way to define time-causal scale-spaces that allow for efficient numerical implementation and respect causality over time without access to the future. The first-order recursive filters also provide a framework for defining recursive approximations to the Gaussian kernel that in a weaker sense preserve some of the scale-space properties.

Normalization (image processing)

In image processing, normalization is a process that changes the range of pixel intensity values, a kind of intensity mapping. Applications include photographs with poor contrast due to glare, for example. A typical case is contrast stretching. In more general fields of data processing, such as digital signal processing, it is referred to as dynamic range expansion. The purpose of dynamic range expansion in the various applications is usually to bring the image, or other type of signal, into a range that is more familiar or normal to the senses, hence the term normalization. Often, the motivation is to achieve consistency in dynamic range for a set of data, signals, or images to avoid mental distraction or fatigue. For example, a newspaper will strive to make all of the images in an issue share a similar range of grayscale. Auto-normalization in image processing software typically normalizes to the full dynamic range of the number system specified in the image file format. == Definition == Normalization transforms an n-dimensional grayscale image I : { X ⊆ R n } → { Min , . . , Max } {\displaystyle I:\{\mathbb {X} \subseteq \mathbb {R} ^{n}\}\rightarrow \{{\text{Min}},..,{\text{Max}}\}} with intensity values in the range ( Min , Max ) {\displaystyle ({\text{Min}},{\text{Max}})} , into a new image I N : { X ⊆ R n } → { newMin , . . , newMax } {\displaystyle I_{N}:\{\mathbb {X} \subseteq \mathbb {R} ^{n}\}\rightarrow \{{\text{newMin}},..,{\text{newMax}}\}} with intensity values in the range ( newMin , newMax ) {\displaystyle ({\text{newMin}},{\text{newMax}})} . The linear normalization of a grayscale digital image is performed according to the formula I N = ( I − Min ) newMax − newMin Max − Min + newMin {\displaystyle I_{N}=(I-{\text{Min}}){\frac {{\text{newMax}}-{\text{newMin}}}{{\text{Max}}-{\text{Min}}}}+{\text{newMin}}} For example, if the intensity range of the image is 50 to 180 and the desired range is 0 to 255 the process entails subtracting 50 from each of pixel intensity, making the range 0 to 130. Then each pixel intensity is multiplied by 255/130, making the range 0 to 255. Normalization might also be non-linear, as the relationship between I {\displaystyle I} and I N {\displaystyle I_{N}} may not be linear. An example of non-linear normalization is when the normalization follows a sigmoid function, in which case the normalized image is computed according to the formula I N = ( newMax − newMin ) 1 1 + e − I − β α + newMin {\displaystyle I_{N}=({\text{newMax}}-{\text{newMin}}){\frac {1}{1+e^{-{\frac {I-\beta }{\alpha }}}}}+{\text{newMin}}} Where α {\displaystyle \alpha } defines the width of the input intensity range, and β {\displaystyle \beta } defines the intensity around which the range is centered. Gamma correction (log/inverse log) is also a common transformation function. === Colorspace === Intensity operations generally operate on a colorspace that maps to the human perception of lightness without intentionally changing the other properties. This can be done, for example, by operating on the L component of the CIELAB color space, or approximately by operating on the Y component of YCbCr. It is also possible to operate on each of the RGB color channels, though the result will not always make sense. == Contrast stretching == This is the most significant and essential technique of spatial-based image enhancement. The basic intent of this contrast enhancement technique is to adjust the local contrast in the image so as to bring out the clear regions or objects in the image. Low-contrast images often result from poor or non-uniform lighting conditions, a limited dynamic range of the imaging sensor, or improper settings of the lens aperture. This operation tries to change the intensity of the pixel in the image, particularly in the input image, to obtain an enhanced image. It is based on the number of techniques, namely local, global, dark and bright levels of contrast. The contrast enhancement is considered as the amount of color or gray differentiation that lies among the different features in an image. The contrast enhancement improves the quality of image by increasing the luminance difference between the foreground and background. A contrast stretching transformation can be achieved by: Stretching the dark range of input values into a wider range of output values: This involves increasing the brightness of the darker areas in the image to enhance details and improve visibility. Shifting the mid-range of input values: This involves adjusting the brightness levels of the mid-tones in the image to improve overall contrast and clarity. Compressing the bright range of input values: This process involves reducing the brightness of the brighter areas in the image to prevent overexposure resulting in a more balanced and visually appealing image. It can be described as the following piecewise funciton: I N = { s 1 r 1 I if I < r 1 s 2 − s 1 r 1 − r 2 ( I − r 1 ) if r 1 ≤ I ≤ r 2 1 − s 2 1 − r 2 ( I − r 2 ) if I > r 2 {\displaystyle I_{N}={\begin{cases}{\frac {s_{1}}{r_{1}}}I&{\text{if }}Ir_{2}\end{cases}}} Where: ( r 1 , s 1 ) {\displaystyle (r_{1},s_{1})} defines the transition point between the "dark" range to the "main" range. ( r 2 , s 2 ) {\displaystyle (r_{2},s_{2})} defines the transition point between the "main" range to the "bright" range. A typical linear stretch is obtained when ( r 1 , s 1 ) = ( r min , 0 ) {\displaystyle (r_{1},s_{1})=(r_{\text{min}},0)} and ( r 2 , s 2 ) = ( r max , 1 ) {\displaystyle (r_{2},s_{2})=(r_{\text{max}},1)} , where r min {\displaystyle r_{\text{min}}} and r max {\displaystyle r_{\text{max}}} denote the minimum and maximum levels in the source image. === Global contrast stretching === Global Contrast Stretching considers all color palate ranges at once to determine the maximum and minimum values for the entire RGB color image. This approach utilizes the combination of RGB colors to derive a single maximum and minimum value for contrast stretching across the entire image. === Local contrast stretching === Local contrast stretching (LCS) is an image enhancement method that focuses on locally adjusting each pixel's value to improve the visualization of structures within an image, particularly in both the darkest and lightest portions. It operates by utilizing sliding windows, known as kernels, which traverse the image. The central pixel within each kernel is adjusted using the following formula: I p ( x , y ) = 255 × [ I 0 ( x , y ) − m i n ] ( m a x − m i n ) {\displaystyle I_{p}(x,y)=255\times {\frac {[I_{0}(x,y)-min]}{(max-min)}}} Where: Ip(x,y) is the color level for the output pixel (x,y) after the contrast stretching process. I0(x,y) is the color level input for data pixel (x, y). max is the maximum value for color level in the input image within the selected kernel. min is the minimum value for color level in the input image within the selected kernel. A piecewise form (see above) may also be used. LCS can be applied to the three color channels of an image separately.