Content Disarm and Reconstruction

Content Disarm and Reconstruction

Content Disarm and Reconstruction (CDR) is a computer security technology for removing potentially malicious code from files. Unlike malware analysis, CDR technology does not determine or detect malware's functionality but removes all file components that are not approved within the system's definitions and policies. It is used to prevent cyber security threats from entering a corporate network perimeter. Channels that CDR can be used to protect include email and website traffic. Advanced solutions can also provide similar protection on computer endpoints, or cloud email and file sharing services. There are three levels of CDR; 1) flattening and converting the original file to a PDF, 2) stripping active content while keeping the original file type, and 3) eliminating all file-borne risk while maintaining file type, integrity and active content. Beyond these three levels, there are also more advanced forms of CDR that is able to perform "soft conversion" and "hard conversion", based on the user's preference in balancing usability and security. == Applications == CDR works by processing all incoming files of an enterprise network, deconstructing them, and removing the elements that do not match the file type's standards or set policies. CDR technology then rebuilds the files into clean versions that can be sent on to end users as intended. Because CDR removes all potentially malicious code, it can be effective against zero-day vulnerabilities that rely on being an unknown threat that other security technologies would need to patch against to maintain protection. CDR can be used to prevent cyber threats from variety of sources: Email Data Diodes Web Browsers Endpoints File Servers FTP Cloud email or webmail programs SMB/CIFS Removable media scanning (CDR Kiosk) CDR can be applied to a variety of file formats including: Images Office documents PDF Audio/video file formats Archives HTML == Open source implementations == DocBleach ExeFilter

Deluxe Paint

Deluxe Paint, often referred to as DPaint, is a bitmap graphics editor created by Dan Silva for Electronic Arts and published for the then-new Amiga 1000 in November 1985. A series of updated versions followed, some of which were ported to other platforms. An MS-DOS release with support for the 256 color VGA standard became popular for creating pixel graphics in video games in the 1990s. Author Dan Silva previously worked on the Cut & Paste word processor (1984), also from Electronic Arts. == History == Deluxe Paint began as an in-house art development tool called Prism. As author Dan Silva added features to Prism, it was developed as a showcase product to coincide with the Amiga's debut in 1985. Upon release, it was quickly embraced by the Amiga community and became the de facto graphics (and later animation) editor for the platform. Amiga manufacturer Commodore International later commissioned EA to create version 4.5 AGA to bundle with the new Advanced Graphics Architecture chipset (A1200, A4000) capable Amigas. Version 5 was the last release after Commodore's bankruptcy in 1994. Early versions of Deluxe Paint were available in protected and non copy-protected versions, the latter retailing for a slightly higher price. The copy protection scheme was later dropped. Deluxe Paint was first in a series of products from the Electronic Arts Tools group—then later moved to the ICE (for Interactivity, Creativity, and Education) group—which included such Amiga programs as Deluxe Music Construction Set (preceded by Music Construction Set for the Apple II), Deluxe Video, and the Studio series of paint programs for the Mac. With the development of Deluxe Paint, EA introduced the ILBM and ANIM file format standards for graphics. While widely used on the Amiga, these formats never gained widespread end user acceptance on other platforms, but were heavily used by game development companies. Deluxe Paint was used by LucasArts to make graphics for their adventure games such as The Secret of Monkey Island, and the name of a particular filename used to store the main protagonist Guybrush Threepwood was probably at the origin of his peculiar name. One of the main artist developer of the game, Mark Ferrari, in an interview for The Making of Monkey Island 30th Anniversary Documentary remembers that "there was a pulldown menu in DPaint called brushes, so character sprites were referred to as brushes", and the male protagonist was simply "the guy.brush" until the artist Steve Purcell suggested to take the very name "Guybrush". The author Ron Gilbert remembers that the PC DOS version of the file was named "guybrush.bbm". == Versions == === Amiga === Deluxe Paint I was released in 1985. A major feature was animation by using color cycling. The Amiga natively supports indexed color, where a pixel's color value does not carry any RGB hue information but instead is an index to a color palette (a collection of unique color values). By adjusting the color value in the palette, all pixels with that palette value change simultaneously in the image or animation, creating cyclic movement in the image. In the Christmas demo files on the Deluxe Paint I disk, this kind of animation (which is toggled by pressing the tab key) is used to depict falling snowflakes, a blinking Christmas tree, and a roaring fire in the fireplace. In 1986, Deluxe Paint II was introduced, which added many convenient features such as pattern and gradient fill, which could be selected by right-clicking on a fill tool. An effects menu with e.g. perspective transformation was also added. The screen format could now be changed from a dedicated selection page. Deluxe Paint III appeared in 1989 and added support for Extra Halfbrite. New editing modes allowed one to stencil certain colors to protect them, so it is possible to e.g. paint a landscape from front to back, with the foreground protected by a stencil. A major new feature of Deluxe Paint III was the ability to create cel-like animation, and animbrushes (1MB of RAM is needed for animation). These let the user pick up a section of an animation as an "animbrush", which can then be placed onto the canvas while it animates. Deluxe Paint III was one of the first paint programs to support animbrushes. This is similar to copy and paste, except one can pick up more than one image. Deluxe Paint IV (introduced in 1991), which did not include Silva as the lead programmer, offered significant new features like non-bitplane-indexed Hold-and-Modify support for creating images with up to 4,096 colors. Animation support was improved by adding a light table, i.e. onion skinning, and AnimBrush morphing. The color mixer was now a HAM region at the bottom of the screen (instead of a floating window as before) and allowed mixing adjacent colors similar to a real palette. Deluxe Paint 4.5 AGA appeared the following year, addressing the stability issues and providing support for the new A1200 and A4000 AGA machines and a revamped screen mode interface. It appeared in both standalone and Commodore-bundled versions. The final release, Deluxe Paint V, in 1995, supported true 24-bit RGB images. However, using only the AGA native chipset, the 24-bit RGB color was only held in computer memory, the on-screen image was displayed in HAM8 (18-bit color). === Apple IIGS === DeluxePaint II for the Apple IIGS was developed by Brent Iverson and released in 1987. === MS-DOS === Deluxe Paint II for MS-DOS was released in 1988, It required MS-DOS 2.0 and 640 kB of RAM. It supports CGA, EGA, MCGA, VGA, Hercules and Tandy IBM PC-compatible graphic cards. Deluxe Paint II Enhanced was released in 1989, requiring MS-DOS 2.11 and 640 kB of RAM. It supports resolutions up to 800x600 pixels with 256 colors. Deluxe Paint II Enhanced 2.0, released in 1994, was the most successful MS-DOS version, and was compatible with PC Paintbrush PCX image files. The MS-DOS conversion was done by Brent Iverson with the enhanced features by Steve Shaw. It supports CGA, EGA, MCGA, VGA, Hercules, Tandy, and Amstrad video cards, as well as early Super VGA video cards enabling it to support up to 800 × 600 with 256 (from 262,144) colors and 1024 × 768 with 16 colors. The sister product Deluxe Paint Animation (only for 320×200 pixels and 256 colors) was widely used, especially in video game development. === Atari ST === Deluxe Paint ST was developed by ArtisTech Development, published by Electronic Arts, and was released in 1990. It supports the Atari STE 4096 color palette and animated graphics. Features advertised for the Atari ST version include 3D perspective, design your own fonts, mirror symmetry, multi-color airbrushing & animations, printing up to poster size, split-screen magnification with variable zoom, and working on animations (including multiple animations). == Workflow == "[" and "]" hotkeys step through the indexed palette, turning indexed-pixel-painting into a fast two-handed mouse+keys process, and the right mouse button paints with the background color. For example, transparency is obtained as simply as selecting a background color index (a single right click on the palette GUI to change). colors could be locked from editing by use of a stencil (a list of color indices whose pixels should not be altered in the image data) and simple color-cycling animations could be created using contiguous entries in the palette. This was easy to change the hue and tone of a section of the image by altering the corresponding colors in the palette. (The specific section needed to use a dedicated part of the palette for this technique to work.) Brushes can be cut from the background by using the box, freehand, or polygon selection tools. They can then be used in the same manner as any other brush or pen. This functionality is simpler to use than the "stamp" tool of Photoshop or Alpha Channels as provided in later programs. Brushes can be rotated and scaled, even in 3D. After a brush is selected, it appears attached to the mouse cursor, providing an exact preview of what will be drawn. This allows precise pixel positioning of brushes. Animations stored in IFF ANIM format are delta compressed making animations both smaller and faster to playback. == Reception == Compute! criticized the documentation of the first release of DeluxePaint as inadequate, but stated that "DeluxePaint is a visual arts program of immense scope and flexibility". In later versions the documentation was much improved; for instance DeluxePaint IV came with a 300-page manual. Deluxe Paint was a hit for EA. The main line of the series, particularly installments one to three, has won a total of at least nine awards from independent publications and organizations, including three Amiga-specific awards. Deluxe Paint III also won Commodore International's Enterprise and Vision award in 1990, becoming the first software to win the award, for what the company's judges believed to be best utilizing the Amiga's graphical capabilities. Deluxe Pai

Confusion matrix

In machine learning, a confusion matrix, also known as error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one. In unsupervised learning it is usually called a matching matrix. The term is used specifically in the problem of statistical classification. Each row of the matrix represents the instances in an actual class while each column represents the instances in a predicted class, or vice versa – both variants are found in the literature. The diagonal of the matrix therefore represents all instances that are correctly predicted. The name stems from the fact that it makes it easy to identify whether the system is confusing two classes (i.e., commonly mislabeling one class as another). The confusion matrix has its origins in human perceptual studies of auditory stimuli. It was adapted for machine learning studies and used by Frank Rosenblatt, among other early researchers, to compare human and machine classifications of visual (and later auditory) stimuli. It is a special kind of contingency table, with two dimensions ("actual" and "predicted"), and identical sets of "classes" in both dimensions (each combination of dimension and class is a variable in the contingency table). == Example == Given a sample of 12 individuals, 8 that have been diagnosed with cancer and 4 that are cancer-free, where individuals with cancer belong to class 1 (positive) and non-cancer individuals belong to class 0 (negative), we can display that data as follows: Assume that we have a classifier that distinguishes between individuals with and without cancer in some way, we can take the 12 individuals and run them through the classifier. The classifier then makes 9 accurate predictions and misses 3: 2 individuals with cancer wrongly predicted as being cancer-free (sample 1 and 2), and 1 person without cancer that is wrongly predicted to have cancer (sample 9). Notice, that if we compare the actual classification set to the predicted classification set, there are 4 different outcomes that could result in any particular column: The actual classification is positive and the predicted classification is positive (1,1). This is called a true positive result because the positive sample was correctly identified by the classifier. The actual classification is positive and the predicted classification is negative (1,0). This is called a false negative result because the positive sample is incorrectly identified by the classifier as being negative. The actual classification is negative and the predicted classification is positive (0,1). This is called a false positive result because the negative sample is incorrectly identified by the classifier as being positive. The actual classification is negative and the predicted classification is negative (0,0). This is called a true negative result because the negative sample gets correctly identified by the classifier. We can then perform the comparison between actual and predicted classifications and add this information to the table, making correct results appear in green so they are more easily identifiable. The template for any binary confusion matrix uses the four kinds of results discussed above (true positives, false negatives, false positives, and true negatives) along with the positive and negative classifications. The four outcomes can be formulated in a 2×2 confusion matrix, as follows: The color convention of the three data tables above were picked to match this confusion matrix, in order to easily differentiate the data. Now, we can simply total up each type of result, substitute into the template, and create a confusion matrix that will concisely summarize the results of testing the classifier: In this confusion matrix, of the 8 samples with cancer, the system judged that 2 were cancer-free, and of the 4 samples without cancer, it predicted that 1 did have cancer. All correct predictions are located in the diagonal of the table (highlighted in green), so it is easy to visually inspect the table for prediction errors, as values outside the diagonal will represent them. By summing up the 2 rows of the confusion matrix, one can also deduce the total number of positive (P) and negative (N) samples in the original dataset, i.e. P = T P + F N {\displaystyle P=TP+FN} and N = F P + T N {\displaystyle N=FP+TN} . == Table of confusion == In predictive analytics, a table of confusion (sometimes also called a confusion matrix) is a table with two rows and two columns that reports the number of true positives, false negatives, false positives, and true negatives. This allows more detailed analysis than simply observing the proportion of correct classifications (accuracy). Accuracy will yield misleading results if the data set is unbalanced; that is, when the numbers of observations in different classes vary greatly. For example, if there were 95 cancer samples and only 5 non-cancer samples in the data, a particular classifier might classify all the observations as having cancer. The overall accuracy would be 95%, but in more detail the classifier would have a 100% recognition rate (sensitivity) for the cancer class but a 0% recognition rate for the non-cancer class. F1 score is even more unreliable in such cases, and here would yield over 97.4%, whereas informedness removes such bias and yields 0 as the probability of an informed decision for any form of guessing (here always guessing cancer). According to Davide Chicco and Giuseppe Jurman, the most informative metric to evaluate a confusion matrix is the Matthews correlation coefficient (MCC). Other metrics can be included in a confusion matrix, each of them having their significance and use. Some researchers have argued that the confusion matrix, and the metrics derived from it, do not truly reflect a model's knowledge. In particular, the confusion matrix cannot show whether correct predictions were reached through sound reasoning or merely by chance (a problem known in philosophy as epistemic luck). It also does not capture situations where the facts used to make a prediction later change or turn out to be wrong (defeasibility). This means that while the confusion matrix is a useful tool for measuring classification performance, it may give an incomplete picture of a model’s true reliability. == Confusion matrices with more than two categories == Confusion matrix is not limited to binary classification and can be used in multi-class classifiers as well. The confusion matrices discussed above have only two conditions: positive and negative. For example, the table below summarizes communication of a whistled language between two speakers, with zero values omitted for clarity. == Confusion matrices in multi-label and soft-label classification == Confusion matrices are not limited to single-label classification (where only one class is present) or hard-label settings (where classes are either fully present, 1, or absent, 0). They can also be extended to Multi-label classification (where multiple classes can be predicted at once) and soft-label classification (where classes can be partially present). One such extension is the Transport-based Confusion Matrix (TCM), which builds on the theory of optimal transport and the principle of maximum entropy. TCM applies to single-label, multi-label, and soft-label settings. It retains the familiar structure of the standard confusion matrix: a square matrix sized by the number of classes, with diagonal entries indicating correct predictions and off-diagonal entries indicating confusion. In the single-label case, TCM is identical to the standard confusion matrix. TCM follows the same reasoning as the standard confusion matrix: if class A is overestimated (its predicted value is greater than its label value) and class B is underestimated (its predicted value is less than its label value), A is considered confused with B, and the entry (B, A) is increased. If a class is both predicted and present, it is correctly identified, and the diagonal entry (A, A) increases. Optimal transport and maximum entropy are used to determine the extent to which these entries are updated. TCM enables clearer comparison between predictions and labels in complex classification tasks, while maintaining a consistent matrix format across settings.

The Master Algorithm

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World is a book by Pedro Domingos released in 2015. Domingos wrote the book in order to generate interest from people outside the field. == Overview == The book outlines five approaches of machine learning: inductive reasoning, connectionism, evolutionary computation, Bayes' theorem and analogical modelling. The author explains these tribes to the reader by referring to more understandable processes of logic, connections made in the brain, natural selection, probability and similarity judgments. Throughout the book, it is suggested that each different tribe has the potential to contribute to a unifying "master algorithm". Towards the end of the book the author pictures a "master algorithm" in the near future, where machine learning algorithms asymptotically grow to a perfect understanding of how the world and people in it work. Although the algorithm doesn't yet exist, he briefly reviews his own invention of the Markov logic network. == In the media == In 2016 Bill Gates recommended the book, alongside Nick Bostrom's Superintelligence, as one of two books everyone should read to understand AI. In 2018 the book was noted to be on Chinese Communist Party general secretary Xi Jinping's bookshelf. === Reception === A computer science educator stated in Times Higher Education that the examples are clear and accessible. In contrast, The Economist agreed Domingos "does a good job" but complained that he "constantly invents metaphors that grate or confuse". Kirkus Reviews praised the book, stating that "Readers unfamiliar with logic and computer theory will have a difficult time, but those who persist will discover fascinating insights." A New Scientist review called it "compelling but rather unquestioning".

Ameca (robot)

Ameca is a robotic humanoid created in 2021 by Engineered Arts, headquarters in Falmouth, Cornwall, United Kingdom. The project commenced in February 2021, and the first public demonstration was at the CES 2022 show in Las Vegas. Ameca's appearance features grey rubber skin on the face and hands, and is specifically designed to appear genderless. In 2024, an Ameca unit was installed in Edinburgh in the UK to reside at the National Robotarium. Ameca generation 3 has been released and showcased at ICRA 2025 along with Ami. == History == The first generation of Ameca was developed at Engineered Arts headquarters in Falmouth, Cornwall, United Kingdom. The project started in February 2021, with the first video revealed publicly on 1 December 2021. Ameca gained widespread attention on Twitter and TikTok ahead of its first public demonstration at the Consumer Electronics Show 2022, where it was covered by CNET and other news outlets. In 2022, Ameca presented an Alternative Christmas message by British TV Channel 4 for Christmas Day. Ameca was associated with the Museum of the Future's robotic family, where it could interact with visitors. In 2024, an Ameca unit was installed in Edinburgh in the UK to reside at the National Robotarium. In January 2026, Ameca served as an ambassador for the European Space Agency (ESA) at the 18th European Space Conference. == Features == It is designed as a platform for further developing robotics technologies involving human-robot interaction. utilizes embedded microphones, binocular eye mounted cameras, a chest camera and facial recognition software to interact with the public. Interactions can be governed by either OpenAI's GPT-3 or human telepresence. It also features articulated motorized arms, fingers, neck and facial features. Ameca's appearance features grey rubber skin on the face and hands, and is specifically designed to appear genderless. == Public appearances == Computer History Museum, California Heinz Nixdorf MuseumsForum, Paderborn, Germany Copernicus Science Center, Warsaw, Poland Museum of the Future, Dubai Consumer Electronics Show 2022 Deutsches Museum Nuremberg OMR Festival 2022 Hosted by Vodafone GITEX 2022 International Conference on Robotics and Automation 2023 International Telecommunication Union AI for Good Global Summit 2023 Sphere (Not Ameca, Custom humanoid named Aura built on Ameca technology)

Language engineering

Language engineering involves the creation of natural language processing systems, whose cost and outputs are measurable and predictable. It is a distinct field contrasted to natural language processing and computational linguistics. A recent trend of language engineering is the use of Semantic Web technologies for the creation, archiving, processing, and retrieval of machine processable language data. Meta-Language Engineering is a proposed extension of Language Engineering first recorded in 2025, associated with the work of Delyone de Paula Canedo Filho. The term is used to designate an approach that, in addition to natural language processing, encompasses the symbolic, cognitive, and epistemological structuring of language systems.

Neural computation

Neural computation is the information processing performed by networks of neurons. Neural computation is affiliated with the philosophical tradition of computationalism, which advances the thesis that neural computation explains cognition. Warren McCulloch and Walter Pitts were the first to propose an account of neural activity as being computational in their seminal 1943 paper "A Logical Calculus of the Ideas Immanent in Nervous Activity." There are three general branches of computationalism, including classicism, connectionism, and computational neuroscience. All three branches agree that cognition is computation, however, they disagree on what sorts of computations constitute cognition. The classicism tradition believes that computation in the brain is digital, analogous to digital computing. Both connectionism and computational neuroscience do not require that the computations that realize cognition are necessarily digital computations. However, the two branches greatly disagree upon which sorts of experimental data should be used to construct explanatory models of cognitive phenomena. Connectionists rely upon behavioral evidence to construct models to explain cognitive phenomena, whereas computational neuroscience leverages neuroanatomical and neurophysiological information to construct mathematical models that explain cognition. When comparing the three main traditions of the computational theory of mind, as well as the different possible forms of computation in the brain, it is helpful to define what we mean by computation in a general sense. Computation is the processing of information, otherwise known as variables or entities, according to a set of rules. A rule in this sense is simply an instruction for executing a manipulation on the current state of the variable, in order to produce a specified output. In other words, a rule dictates which output to produce given a certain input to the computing system. A computing system is a mechanism whose components must be functionally organized to process the information in accordance with the established set of rules. The types of information processed by a computing system determine which type of computations it performs. Traditionally in cognitive science, there have been two proposed types of computation related to neural activity, digital and analog, with the vast majority of theoretical work incorporating a digital understanding of cognition. Computing systems that perform digital computation are functionally organized to execute operations on strings of digits with respect to the type and location of the digit on the string. It has been argued that neural spike train signaling implements some form of digital computation, since neural spikes may be considered as discrete units or digits, like 0 or 1—the neuron either fires an action potential or it does not. Accordingly, neural spike trains could be seen as strings of digits. Alternatively, analog computing systems perform manipulations on non-discrete, irreducibly continuous variables, that is, entities that vary continuously as a function of time. These sorts of operations are characterized by systems of differential equations. Neural computation can be studied by, for example, building models of neural computation. Work on artificial neural networks has been somewhat inspired by knowledge of neural computation.