AI Data Poisoning

AI Data Poisoning — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Corona-Warn-App

    Corona-Warn-App

    Corona-Warn-App was the official and open-source COVID-19 contact tracing app used for digital contact tracing in Germany made by SAP and Deutsche Telekom subsidiary T-Systems. It had been downloaded 22.8 million times as of 19 November 2020 and 26.2 million times as of 18 March 2021. The app has been promoted by billboard and broadcast advertisements, e.g. in cooperation with the German Football Association (DFB) and other prominent companies. The German government has announced that the app would no longer exchange tracing information as of April 30, 2023 & would enter hibernation as of June 1, 2023. == Effectiveness == Experts believe that time saved by using the app can be critical for improving the effectiveness contact tracing efforts. Some virologists say when at least 60% of people in Germany use it, it would be very effective. == Functioning == The app works with the Exposure Notification Framework (what is implemented in Google Play Services for Android and in iOS) by using Bluetooth to exchange codes with app users that are within 1.5 meters of each other for a period of at least 10 minutes. Anyone who tests positive for COVID-19 can share this information voluntarily with the app. Other app users are then notified about when, how long and at what distance they had contact with the infected person within a 14-day period. Testing is available for persons on a voluntary basis. === Server architecture === Based on the Client–server model five servers are operated within the app backend: the Corona-Warn-App server. It stores the authorized keys of infected users, referred to as diagnosis keys, from the past 14 days in its database. Stored diagnosis keys are grouped into regularly updated blocks which are transmitted to the Content Delivery Network. This interface supplies the keys for the app clients to download and locally compute a potential exposure risk. the Verification server. It is responsible for documenting the approval of the user to share their positive test result with the app and also to verify the test result. the Portal Server. It generates a so-called teleTAN token if the user did not give their consent to share their test result with the app at first but then changed their mind or if the local public health authority or test laboratory is not connected to the app system yet. the Test Result Server. It saves the test results provided by the local public health authorities or test laboratories for further use within the backend. the Federation Gateway Server. It connects to the national Corona-Warn-App servers of participating EU countries to enable transnational key exchange. By the distribution of the data on different servers the decoupling of the data becomes possible and results in an obstructed tracing of the app users. ==== Report of a positive COVID-19 test ==== The app provides a function to warn other app users by uploading their positive test result on a voluntarily and anonymous basis to the Corona-Warn-App server. In case the local public health authority or test laboratory is already connected to the app system, the user receives a QR-Code when the swab specimen is taken that can be scanned in the app. After scanning the QR-Code und the user getting authorized by the Verification server, the app receives an individual Registration token which gets stored locally and with which the status and the result of the test can be checked manually as well as automatically. If the local public health authority or test laboratory is not connected to the app system yet and the user wants to share their positive test result with other app users, it is required to request a teleTAN token by calling the verification hotline of the app. In both cases, the user can upload their diagnosis keys of the last 14 days to the Corona-Warn-App server in case their consent to share the information is given. The Corona-Warn-App server then verifies the uploaded keys by asking the Verification server if the keys are valid and if they are, the Corona-Warn-App server stores them in its database. == Privacy == The use of the app is voluntary. The app implements decentralized data storage to ensure data privacy. Employers can require that Corona-Warn be installed on company phones, but can not compel its use on private phones. == Funding == The open source app, which costs €20 million to develop is intended to supplement human contact tracing efforts, which Germany put in place during the early stages of the COVID-19 pandemic in Germany. In August 2022, a spokesperson for the German ministry of health announced that the total costs including all additional developments are now estimated to be closer to €150m. == Interoperability == At its start the app only worked in Germany, and Jens Spahn, than Federal Minister of Health (CDU), has said the development of a Europe-wide system is a future goal. With the update published on 19 October 2020 the app supports key-exchanges with the EU Interoperability Gateway and is therefore able to communicate with contact tracing apps from Ireland and Italy. Austria, Belgium, Czech Republic, Croatia, Cyprus, Denmark, Finland, Ireland, Italy, Latvia, Malta, Netherlands, Norway, Poland, Slovenia, Spain and Switzerland had joined the gateway as well and are also able to exchange keys with Corona-Warn-App. The app can be downloaded in many App stores outside of Germany. However, as of August 2021, the app is still unavailable for those of notable national German minorities like Turks, Russians or Ukrainians, who use App stores of their home countries. == Software variants == An unofficial Corona-Warn-App has been released on F-Droid, making the app available without proprietary components on Android phones. == Literature == Thomas Köllmann: Die Corona-Warn-App – Schnittstelle zwischen Datenschutz- und Arbeitsrecht. In: Neue Zeitschrift für Arbeitsrecht. Nr. 13, 10. Juli 2020, S. 831–836.

    Read more →
  • Sentence extraction

    Sentence extraction

    Sentence extraction is a technique used for automatic summarization of a text. In this shallow approach, statistical heuristics are used to identify the most salient sentences of a text. Sentence extraction is a low-cost approach compared to more knowledge-intensive deeper approaches which require additional knowledge bases such as ontologies or linguistic knowledge. In short, sentence extraction works as a filter that allows only meaningful sentences to pass. The major downside of applying sentence-extraction techniques to the task of summarization is the loss of coherence in the resulting summary. Nevertheless, sentence extraction summaries can give valuable clues to the main points of a document and are frequently sufficiently intelligible to human readers. == Procedure == Usually, a combination of heuristics is used to determine the most important sentences within the document. Each heuristic assigns a (positive or negative) score to the sentence. After all heuristics have been applied, the highest-scoring sentences are included in the summary. The individual heuristics are weighted according to their importance. === Early approaches and some sample heuristics === Seminal papers which laid the foundations for many techniques used today have been published by Hans Peter Luhn in 1958 and H. P Edmundson in 1969. Luhn proposed to assign more weight to sentences at the beginning of the document or a paragraph. Edmundson stressed the importance of title-words for summarization and was the first to employ stop-lists in order to filter uninformative words of low semantic content (e.g. most grammatical words such as of, the, a). He also distinguished between bonus words and stigma words, i.e. words that probably occur together with important (e.g. the word form significant) or unimportant information. His idea of using key-words, i.e. words which occur significantly frequently in the document, is still one of the core heuristics of today's summarizers. With large linguistic corpora available today, the tf–idf value which originated in information retrieval, can be successfully applied to identify the key words of a text: If for example the word cat occurs significantly more often in the text to be summarized (TF = "term frequency") than in the corpus (IDF means "inverse document frequency"; here the corpus is meant by document), then cat is likely to be an important word of the text; the text may in fact be a text about cats.

    Read more →
  • AFNLP

    AFNLP

    AFNLP (Asian Federation of Natural Language Processing Associations) is the organization for coordinating the natural language processing related activities and events in the Asia-Pacific region. == Foundation == AFNLP was founded on 4 October 2000. == Member Associations == ALTA – Australasian Language Technology Association ANLP Japan Association of Natural Language Processing ROCLING Taiwan ROC Computational Linguistics Society SIG-KLC Korea SIG-Korean Language Computing of Korea Information Science Society == Existing Asian Initiatives == NLPRS: Natural Language Processing Pacific Rim Symposium IRAL: International Workshop on Information Retrieval with Asian Languages PACLING: Pacific Association for Computational Linguistics PACLIC: Pacific Asia Conference on Language, Information and Computation PRICAI: Pacific Rim International Conference on AI ICCPOL: International Conference on Computer Processing of Oriental Languages ROCLING: Research on Computational Linguistics Conference == Conferences == IJCNLP-04: The 1st International Joint Conference on Natural Language Processing in Hainan Island, China IJCNLP-05: The 2nd International Joint Conference on Natural Language Processing in Jeju Island, Korea IJCNLP-08: The 3rd International Joint Conference on Natural Language Processing in Hyderabad, India ACL-IJCNLP-2009: Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics (ACL) and 4th International Joint Conference on Natural Language Processing (IJCNLP) in Singapore IJNCLP-11: The 5th International Joint Conference on Natural Language Processing in Chiang Mai, Thailand

    Read more →
  • Graph cuts in computer vision and artificial intelligence

    Graph cuts in computer vision and artificial intelligence

    As applied in the field of computer vision, graph cut optimization can be employed to efficiently solve a wide variety of low-level computer vision problems (early vision), such as image smoothing, the stereo correspondence problem, image segmentation, object co-segmentation, numerous military applications (eg Automatic target recognition) and many other problems that can be formulated in terms of energy minimization (eg Climate Science and Environmental modelling). Graph cut techniques are now increasingly being used in combination with more general spatial Artificial intelligence techniques (eg to enforce structure in Large language model output to sharpen tumour boundaries and similarly for various Augmented reality, Self-driving car, Robotics, Google Maps applications etc). Many of these energy minimization problems can be approximated by solving a maximum flow problem in a graph (and thus, by the max-flow min-cut theorem, define a minimal cut of the graph). Under most formulations of such problems in computer vision, the minimum energy solution corresponds to the maximum a posteriori estimate of a solution. Although many computer vision algorithms involve cutting a graph (e.g. normalized cuts), the term "graph cuts" is applied specifically to those models which employ a max-flow/min-cut optimization (other graph cutting algorithms may be considered as graph partitioning algorithms). "Binary" problems (such as denoising a binary image) can be solved exactly using this approach; problems where pixels can be labeled with more than two different labels (such as stereo correspondence, or denoising of a grayscale image) cannot be solved exactly, but solutions produced are usually near the global optimum. == History == The foundational theory of graph cuts in computer vision was first developed by Margaret Greig, Bruce Porteous and Allan Seheult (GPS) of Durham University in a now legendary discussion contribution to Julian Besag's 1986 paper and a more detailed follow on paper in 1989. In the Bayesian statistical context of smoothing noisy images, using a Markov random field as the image prior distribution, they showed with a mathematically beautiful proof how the maximum a posteriori estimate of a binary image can be obtained exactly by maximizing the flow through an associated image network, or graph, involving the introduction of a source and sink and Log-likelihood ratios. The problem was shown to be efficiently solvable exactly, an unexpected result as the problem was believed to be computationally intractable (NP hard). GPS also addressed the computational cost of the max-flow algorithm on large graphs, a significant concern at the time. They proposed a partitioning algorithm (see Section 4 of GPS) involving the recursive amalgamation of non-overlapping blocks, or tiles, which gave a 12X increase in speed. This approach recursively solved and amalgamated independent sub-graphs until the whole graph was solved. While contemporaries like Geman and Geman had advocated Parallel computing in the context of Simulated annealing, the GPS blocking strategy offered a deterministic structure amenable to parallelisation and anticipated modern artificial intelligence design across multiple GPUs. However, until recently, this aspect of the paper was largely ignored and subsequent research focused on Serial computer global search trees, such as the Boykov-Kolmogorov algorithm. Although the general k {\displaystyle k} -colour problem is NP hard for k > 2 , {\displaystyle k>2,} the GPS approach has turned out to have very wide applicability in general computer vision problems. This was first demonstrated by Boykov, Veksler and Zabih who, in a seminal paper published more than 10 years after the original GPS paper, and in other important works, lit the blue touch paper for the general adoption of graph cut techniques in computer vision. They showed that, for general problems, the GPS approach can be applied iteratively to sequences of binary problems, using their now ubiquitous alpha-expansion algorithm, yielding near optimal solutions. Prior to these results, approximate local optimisation techniques such as simulated annealing (as proposed by the Geman brothers) or iterated conditional modes (a type of greedy algorithm suggested by Julian Besag) were used to solve such image smoothing problems. Building on these advancements, GPS graph cut optimization was subsequently adapted for interactive image segmentation, most notably through the "GrabCut" algorithm introduced by Carsten Rother, Vladimir Kolmogorov, and Andrew Blake of Microsoft Research, Cambridge. GrabCut extended earlier interactive graph cut methods by replacing monochrome image histograms with Gaussian mixture models to estimate colour distributions, and by employing an iterative GPS energy minimisation scheme. This approach significantly simplified user interaction, requiring only a rough bounding box around the target object rather than detailed user-drawn strokes, and it quickly became a standard tool in both academic research and commercial image editing software. The GPS paper connected and bridged profound ideas from Mathematical statistics (Bayes' theorem, Markov random field), Physics (Ising model), Optimisation (Energy function) and Computer science (Network flow problem) and led the move away from approximate local and slow optimisation approaches (eg simulated annealing) to more powerful exact, or near exact, faster global optimisation techniques. It is now recognised as seminal as it was well ahead of its time and, in particular, was published years before the computing power revolution of Moore's law and GPUs. Significantly, GPS was published in a mathematical statistics (rather than a computer vision) journal, and this led to it being overlooked by the computer vision community for many years. It is unofficially known as "The Velvet Underground" paper of computer vision (ie although very few computer vision people read the paper [bought the record], those that did, most importantly Boykov, Veksler and Zabih, started new and important research [formed a band]). This is confirmed by GPS' very large amplification ratio (2nd order citations/first order citations), estimated at well in excess of 100. Despite the foundational nature of the GPS work, formal recognition from the computer vision community has predominantly gone to the researchers who followed to extend and popularise the graph cut method. For example, Boykov, Veksler and Zabih deservedly received a Helmholtz Prize from the ICCV in 2011. This prize recognises ICCV papers from 10 or more years earlier that have had a significant impact on computer vision research. In 2011, Couprie et al. proposed a general image segmentation framework, called the "Power Watershed", that minimized a real-valued indicator function from [0,1] over a graph, constrained by user seeds (or unary terms) set to 0 or 1, in which the minimization of the indicator function over the graph is optimized with respect to an exponent p {\displaystyle p} . When p = 1 {\displaystyle p=1} , the Power Watershed is optimized by graph cuts, when p = 0 {\displaystyle p=0} the Power Watershed is optimized by shortest paths, p = 2 {\displaystyle p=2} is optimized by the random walker algorithm and p = ∞ {\displaystyle p=\infty } is optimized by the watershed algorithm. In this way, the Power Watershed may be viewed as a generalization of graph cuts that provides a straightforward connection with other energy optimization segmentation/clustering algorithms. == Binary segmentation of images == === Notation === Image: x ∈ { R , G , B } N {\displaystyle x\in \{R,G,B\}^{N}} Output: Segmentation (also called opacity) S ∈ R N {\displaystyle S\in R^{N}} (soft segmentation). For hard segmentation S ∈ { 0 for background , 1 for foreground/object to be detected } N {\displaystyle S\in \{0{\text{ for background}},1{\text{ for foreground/object to be detected}}\}^{N}} Energy function: E ( x , S , C , λ ) {\displaystyle E(x,S,C,\lambda )} where C is the color parameter and λ is the coherence parameter. E ( x , S , C , λ ) = E c o l o r + E c o h e r e n c e {\displaystyle E(x,S,C,\lambda )=E_{\rm {color}}+E_{\rm {coherence}}} Optimization: The segmentation can be estimated as a global minimum over S: arg ⁡ min S E ( x , S , C , λ ) {\displaystyle {\arg \min }_{S}E(x,S,C,\lambda )} === Existing methods === Standard Graph cuts: optimize energy function over the segmentation (unknown S value). Iterated Graph cuts: First step optimizes over the color parameters using K-means. Second step performs the usual graph cuts algorithm. These 2 steps are repeated recursively until convergence Dynamic graph cuts:Allows to re-run the algorithm much faster after modifying the problem (e.g. after new seeds have been added by a user). === Energy function === Pr ( x ∣ S ) = K − E {\displaystyle \Pr(x\mid S)=K^{-E}} where the energy E {\displaystyle E} is composed of two different mod

    Read more →
  • Biometric device

    Biometric device

    A biometric device is a security identification and authentication device. Such devices use automated methods of verifying or recognising the identity of a living person based on a physiological or behavioral characteristic. These characteristics include fingerprints, facial images, iris and voice recognition. == History == Biometric devices have been in use for thousands of years. Non-automated biometric devices have been in use since 500 BC, when ancient Babylonians would sign their business transactions by pressing their fingertips into clay tablets. Automation in biometric devices was first seen in the 1960s. The Federal Bureau of Investigation (FBI) in the 1960s, introduced the Indentimat, which started checking for fingerprints to maintain criminal records. The first systems measured the shape of the hand and the length of the fingers. Although discontinued in the 1980s, the system set a precedent for future Biometric Devices. == Subgroups == The characteristic of the human body is used to access information by the users. According to these characteristics, the sub-divided groups are Chemical biometric devices: Analyses the segments of the DNA to grant access to the users. Visual biometric devices: Analyses the visual features of the humans to grant access which includes iris recognition, face recognition, Finger recognition, and Retina Recognition. Behavioral biometric devices: Analyses the Walking Ability and Signatures (velocity of sign, width of sign, pressure of sign) distinct to every human. Olfactory biometric devices: Analyses the odor to distinguish between varied users. Auditory biometric devices: Analyses the voice to determine the identity of a speaker for accessing control. == Uses == === Workplace === Biometrics are being used to establish better and accessible records of the hour's employee's work. With the increase in "Buddy Punching" (a case where employees clocked out coworkers and fraudulently inflated their work hours) employers have looked towards new technology like fingerprint recognition to reduce such fraud. Additionally, employers are also faced with the task of proper collection of data such as entry and exit times. Biometric devices make for largely fool proof and reliable ways of enabling to collect data as employees have to be present to enter biometric details which are unique to them. === Immigration === As the demand for air travel grows and more people travel, modern-day airports have to implement technology in such a way that there are no long queues. Biometrics are being implemented in more and more airports as they enable quick recognition of passengers and hence lead to lower volume of people standing in queues. One such example is of the Dubai International Airport which plans to make immigration counters a relic of the past as they implement IRIS on the move technology (IOM) which should help the seamless departures and arrivals of passengers at the airport. === Handheld and personal devices === Fingerprint sensors can be found on mobile devices. The fingerprint sensor is used to unlock the device and authorize actions, like money and file transfers, for example. It can be used to prevent a device from being used by an unauthorized person. It is also used in attendance in number of colleges and universities. == Present day biometric devices == === Personal signature verification systems === This is one of the most highly recognised and acceptable biometrics in corporate surroundings. This verification has been taken one step further by capturing the signature while taking into account many parameters revolving around this like the pressure applied while signing, the speed of the hand movement and the angle made between the surface and the pen used to make the signature. This system also has the ability to learn from users as signature styles vary for the same user. Hence by taking a sample of data, this system is able to increase its own accuracy. === Iris recognition system === Iris recognition involves the device scanning the pupil of the subject and then cross referencing that to data stored on the database. It is one of the most secure forms of authentication, as while fingerprints can be left behind on surfaces, iris prints are extremely hard to be stolen. Iris recognition is widely applied by organisations dealing with the masses, one being the Aadhaar identification system issued by the Government of India to keep records of its population. The reason for this is that iris recognition makes use of iris prints of humans, which change little over the course of one's lifetime. == Problems with present day biometric devices == === Biometric spoofing === Biometric spoofing is a method of fooling a biometric identification management system, where a counterfeit mold is presented in front of the biometric scanner. This counterfeit mold emulates the unique biometric attributes of an individual so as to confuse the system between the artifact and the real biological target and gain access to sensitive data/materials. One such high-profile case of Biometric spoofing came to the limelight when it was found that German Defence Minister, Ursula von der Leyen's fingerprint had been successfully replicated by Chaos Computer Club. The group used high quality camera lenses and shot images from 6 feet away. They used a professional finger software and mapped the contours of the Ministers thumbprint. Although progress has been made to stop spoofing. Using the principle of pulse oximetry — the liveliness of the test subject is taken into account by measure of blood oxygenation and the heart rate. This reduces attacks like the ones mentioned above, although these methods aren't commercially applicable as costs of implementation are high. This reduces their real world application and hence makes biometrics insecure until these methods are commercially viable. === Accuracy === Accuracy is a major issue with biometric recognition. Passwords are still extremely popular, because a password is static in nature, while biometric data can be subject to change (such as one's voice becoming heavier due to puberty, or an accident to the face, which could lead to improper reading of facial scan data). When testing voice recognition as a substitute to PIN-based systems, Barclays reported that their voice recognition system is 95 percent accurate. This statistic means that many of its customers' voices might still not be recognised even when correct. This uncertainty revolving around the system could lead to slower adoption of biometric devices, continuing the reliance of traditional password-based methods. == Benefits of biometric devices over traditional methods of authentication == Biometric data cannot be lent and hacking of Biometric data is complicated hence it makes it safer to use than traditional methods of authentication like passwords which can be lent and shared. Passwords do not have the ability to judge the user but rely only on the data provided by the user, which can easily be stolen while Biometrics work on the uniqueness of each individual. Passwords can be forgotten and recovering them can take time, whereas Biometric devices rely on biometric data which tends to be unique to a person, hence there is no risk of forgetting the authentication data. A study conducted among Yahoo! users found that at least 1.5 percent of Yahoo users forgot their passwords every month, hence this makes accessing services more lengthy for consumers as the process of recovering passwords is lengthy. These shortcomings make Biometric devices more efficient and reduces effort for the end user. == Future == Researchers are targeting the drawbacks of present-day biometric devices and developing to reduce problems like biometric spoofing and inaccurate intake of data. Technologies which are being developed are- The United States Military Academy are developing an algorithm that allows identification through the ways each individual interacts with their own computers; this algorithm considers unique traits like typing speed, rhythm of writing and common spelling mistakes. This data allows the algorithm to create a unique profile for each user by combining their multiple behavioral and stylometric information. This can be very difficult to replicate collectively. A recent innovation by Kenneth Okereafor and, presented an optimized and secure design of applying biometric liveness detection technique using a trait randomization approach. This novel concept potentially opens up new ways of mitigating biometric spoofing more accurately, and making impostor predictions intractable or very difficult in future biometric devices. A simulation of Kenneth Okereafor's biometric liveness detection algorithm using a 3D multi-biometric framework consisting of 15 liveness parameters from facial print, finger print and iris pattern traits resulted in a system efficiency of the 99.2% over a cardinality of 125 distinct randomization combinat

    Read more →
  • Neural processing unit

    Neural processing unit

    A neural processing unit (NPU), also known as an AI accelerator or deep learning processor, is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and computer vision. == Use == Their purpose is either to efficiently execute already trained AI models (inference) or to train AI models. NPUs can be more efficient in terms of speed or power consumption. NPU applications include algorithms for robotics, Internet of things, and data-intensive or sensor-driven tasks. They are often manycore or spatial designs and focus on low-precision arithmetic, novel dataflow architectures, or in-memory computing capability. As of 2024, a widely used datacenter-grade AI integrated circuit chip, the Nvidia H100 GPU, contains tens of billions of MOSFETs. === Consumer devices === AI accelerators are used in Apple silicon, Qualcomm, Samsung, Huawei, and Google Tensor smartphone processors. Vision processing units are accelerators specialized for machine vision algorithms such as CNN (convolutional neural networks) and SIFT (scale-invariant feature transform). They are used in devices that need to keep track of objects visually such as AR headsets and drones. It is more recently (circa 2017) added to processors from Apple and (circa 2022) to processors from Intel and AMD. All models of Intel Meteor Lake processors have a built-in versatile processor unit (VPU) for accelerating inference for computer vision and deep learning. On consumer devices, the NPU is intended to be small, power-efficient, but reasonably fast when used to run small models. To do this they are designed to support low-bitwidth operations using data types such as INT4, INT8, FP8, and FP16. A common metric is trillions of operations per second (TOPS). Although TOPS does not explicitly specify the kind of operations, it is typically INT8 additions and multiplications. === Datacenters === Accelerators are used in cloud computing servers: e.g., tensor processing units (TPU) for Google Cloud Platform, and Trainium and Inferentia chips for Amazon Web Services. Many vendor-specific terms exist for devices in this category, and it is an emerging technology without a dominant design. Since the late 2010s, graphics processing units designed by companies such as Nvidia and AMD often include AI-specific hardware in the form of dedicated functional units for low-precision matrix-multiplication operations. These GPUs are commonly used as AI accelerators, both for training and inference. === Scientific computation === Although NPUs are tailored for low-precision (e.g., FP16, INT8) matrix multiplication operations, they can be used to emulate higher-precision matrix multiplications in scientific computing. As modern GPUs place much focus on making the NPU part fast, using emulated FP64 (Ozaki scheme) on NPUs can potentially outperform native FP64. This has been demonstrated using FP16-emulated FP64 on NVIDIA TITAN RTX and using INT8-emulated FP64 on NVIDIA consumer GPUs and the A100 GPU. Consumer GPUs especially benefited as they have limited FP64 hardware capacity, showing a 6× speedup. Since CUDA Toolkit 13.0 Update 2, cuBLAS automatically uses INT8-emulated FP64 matrix multiplication of the equivalent precision if it is faster than native. This is in addition to the FP16-emulated FP32 feature introduced in version 12.9. == Programming == An operating system or a higher-level library may provide application programming interfaces such as TensorFlow with LiteRT Next (Android), CoreML (iOS, macOS) or DirectML (Windows). Formats such as ONNX are used to represent trained neural networks. Consumer CPU-integrated NPUs are accessible through vendor-specific APIs. AMD (Ryzen AI), Intel (OpenVINO), Apple silicon (CoreML), and Qualcomm (SNPE) each have their own APIs, which can be built upon by a higher-level library. GPUs generally use existing GPGPU pipelines such as CUDA and OpenCL adapted for lower precisions and specialized matrix-multiplication operations. Vulkan is also being used. Custom-built systems such as the Google TPU use private interfaces. There are a large number of separate underlying acceleration APIs and compilers/runtimes in use in the AI field, causing a great increase in software development effort due to the many combinations involved. As of 2025, the open standard organization Khronos Group is pursuing standardization of AI-related interfaces to reduce the amount of work needed. Khronos is working on three separate fronts: expansion of data types and intrinsic operations in OpenCL and Vulkan, inclusion of compute graphs in SPIR-V, and a NNEF/SkriptND file format for describing a neural network.

    Read more →
  • Database application

    Database application

    A database application is a computer program whose primary purpose is retrieving information from a computerized database. From here, information can be inserted, modified or deleted which is subsequently conveyed back into the database. Early examples of database applications were accounting systems and airline reservations systems, such as SABRE, developed starting in 1957. A characteristic of modern database applications is that they facilitate simultaneous updates and queries from multiple users. Systems in the 1970s might have accomplished this by having each user in front of a 3270 terminal to a mainframe computer. By the mid-1980s it was becoming more common to give each user a personal computer and have a program running on that PC that is connected to a database server. Information would be pulled from the database, transmitted over a network, and then arranged, graphed, or otherwise formatted by the program running on the PC. Starting in the mid-1990s it became more common to build database applications with a Web interface. Rather than develop custom software to run on a user's PC, the user would use the same Web browser program for every application. A database application with a Web interface had the advantage that it could be used on devices of different sizes, with different hardware, and with different operating systems. Examples of early database applications with Web interfaces include amazon.com, which used the Oracle relational database management system, the photo.net online community, whose implementation on top of Oracle was described in the book Database-Backed Web Sites (Ziff-Davis Press; May 1997), and eBay, also running Oracle. Electronic medical records are referred to on emrexperts.com, in December 2010, as "a software database application". A 2005 O'Reilly book uses the term in its title: Database Applications and the Web. Some of the most complex database applications remain accounting systems, such as SAP, which may contain thousands of tables in only a single module. Many of today's most widely used computer systems are database applications, for example, Facebook, which was built on top of MySQL. The etymology of the phrase "database application" comes from the practice of dividing computer software into systems programs, such as the operating system, compilers, the file system, and tools such as the database management system, and application programs, such as a payroll check processor. On a standard PC running Microsoft Windows, for example, the Windows operating system contains all of the systems programs while games, word processors, spreadsheet programs, photo editing programs, etc. would be application programs. As "application" is short for "application program", "database application" is short for "database application program". Not every program that uses a database would typically be considered a "database application". For example, many physics experiments, e.g., the Large Hadron Collider, generate massive data sets that programs subsequently analyze. The data sets constitute a "database", though they are not typically managed with a standard relational database management system. The computer programs that analyze the data are primarily developed to answer hypotheses, not to put information back into the database and therefore the overall program would not be called a "database application". == Examples of database applications == Amazon Student Data CNN eBay Facebook Fandango Filemaker (Mac OS) LibreOffice Base Microsoft Access Oracle relational database SAP (Systems, Applications & Products in Data Processing) Ticketmaster Wikipedia Yelp YouTube Google MySQL

    Read more →
  • Open information extraction

    Open information extraction

    In natural language processing, open information extraction (OIE) is the task of generating a structured, machine-readable representation of the information in text, usually in the form of triples or n-ary propositions. == Overview == A proposition can be understood as truth-bearer, a textual expression of a potential fact (e.g., "Dante wrote the Divine Comedy"), represented in an amenable structure for computers [e.g., ("Dante", "wrote", "Divine Comedy")]. An OIE extraction normally consists of a relation and a set of arguments. For instance, ("Dante", "passed away in" "Ravenna") is a proposition formed by the relation "passed away in" and the arguments "Dante" and "Ravenna". The first argument is usually referred as the subject while the second is considered to be the object. The extraction is said to be a textual representation of a potential fact because its elements are not linked to a knowledge base. Furthermore, the factual nature of the proposition has not yet been established. In the above example, transforming the extraction into a full fledged fact would first require linking, if possible, the relation and the arguments to a knowledge base. Second, the truth of the extraction would need to be determined. In computer science transforming OIE extractions into ontological facts is known as relation extraction. In fact, OIE can be seen as the first step to a wide range of deeper text understanding tasks such as relation extraction, knowledge-base construction, question answering, semantic role labeling. The extracted propositions can also be directly used for end-user applications such as structured search (e.g., retrieve all propositions with "Dante" as subject). OIE was first introduced by TextRunner developed at the University of Washington Turing Center headed by Oren Etzioni. Other methods introduced later such as Reverb, OLLIE, ClausIE or CSD helped to shape the OIE task by characterizing some of its aspects. At a high level, all of these approaches make use of a set of patterns to generate the extractions. Depending on the particular approach, these patterns are either hand-crafted or learned. == OIE systems and contributions == Reverb suggested the necessity to produce meaningful relations to more accurately capture the information in the input text. For instance, given the sentence "Faust made a pact with the devil", it would be erroneous to just produce the extraction ("Faust", "made", "a pact") since it would not be adequately informative. A more precise extraction would be ("Faust", "made a pact with", "the devil"). Reverb also argued against the generation of overspecific relations. OLLIE stressed two important aspects for OIE. First, it pointed to the lack of factuality of the propositions. For instance, in a sentence like "If John studies hard, he will pass the exam", it would be inaccurate to consider ("John", "will pass", "the exam") as a fact. Additionally, the authors indicated that an OIE system should be able to extract non-verb mediated relations, which account for significant portion of the information expressed in natural language text. For instance, in the sentence "Obama, the former US president, was born in Hawaii", an OIE system should be able to recognize a proposition ("Obama", "is", "former US president"). ClausIE introduced the connection between grammatical clauses, propositions, and OIE extractions. The authors stated that as each grammatical clause expresses a proposition, each verb mediated proposition can be identified by solely recognizing the set of clauses expressed in each sentence. This implies that to correctly recognize the set of propositions in an input sentence, it is necessary to understand its grammatical structure. The authors studied the case in the English language that only admits seven clause types, meaning that the identification of each proposition only requires defining seven grammatical patterns. The finding also established a separation between the recognition of the propositions and its materialization. In a first step, the proposition can be identified without any consideration of its final form, in a domain-independent and unsupervised way, mostly based on linguistic principles. In a second step, the information can be represented according to the requirements of the underlying application, without conditioning the identification phase. Consider the sentence "Albert Einstein was born in Ulm and died in Princeton". The first step will recognize the two propositions ("Albert Einstein", "was born", "in Ulm") and ("Albert Einstein", "died", "in Princeton"). Once the information has been correctly identified, the propositions can take the particular form required by the underlying application [e.g., ("Albert Einstein", "was born in", "Ulm") and ("Albert Einstein", "died in", "Princeton")]. CSD introduced the idea of minimality in OIE. It considers that computers can make better use of the extractions if they are expressed in a compact way. This is especially important in sentences with subordinate clauses. In these cases, CSD suggests the generation of nested extractions. For example, consider the sentence "The Embassy said that 6,700 Americans were in Pakistan". CSD generates two extractions [i] ("6,700 Americans", "were", "in Pakistan") and [ii] ("The Embassy", "said", "that [i]"). This is usually known as reification.

    Read more →
  • MoltenVK

    MoltenVK

    MoltenVK is a software library which allows Vulkan applications to run on top of Metal on Apple's macOS, iOS, and tvOS operating systems. It is the first software component to be released for the Vulkan Portability Initiative, a project to have a subset of Vulkan run on platforms lacking native Vulkan drivers. There are some limitations compared with a native Vulkan implementation. == History == MoltenVK was first released as a proprietary and commercially licensed product by The Brenwill Workshop on July 27, 2016. On July 31, 2017, Khronos announced the formation of the Vulkan Portability Technical Subgroup. === Open source === On February 26, 2018, Khronos announced that Vulkan became available on macOS and iOS products through the MoltenVK library. Valve announced that Dota 2 will run on macOS using the Vulkan API with the aid of MoltenVK, and that they had made an arrangement with developer The Brenwill Workshop Ltd to release MoltenVK as open-source software under the Apache License version 2.0. On May 30, 2018, Qt was updated with Vulkan for Qt on macOS using MoltenVK. On May 31, 2018, optional Vulkan support for Dota 2 on macOS was released. Benchmarks for the game were available the following day, showing better performance using Vulkan and MoltenVK compared to OpenGL. On July 20, 2018, Wine was updated with Vulkan support on macOS using MoltenVK. On 29 July 2018, the first app using MoltenVK was accepted onto the App Store, after initially being rejected. On 6 August 2018, Google open-sourced Filament, a crossplatform real-time physically based rendering engine with MoltenVK for macOS/iOS. On November 28, 2018, Valve released Artifact, their first Vulkan-only game on macOS using MoltenVK. === Version 1.0 === On 29 January 2019, MoltenVK 1.0.32 was released with early prototype of Vulkan Portability Extensions. RPCS3 and Dolphin emulators were updated with Vulkan support on macOS using MoltenVK. On 13 April 2019, MoltenVK 1.0.34 was released with support for tessellation. On July 30, 2019, MoltenVK 1.0.36 was released targeting Metal 3.0. On July 31, 2020, MoltenVK 1.0.44 was released, adding support for the tvOS platform. On January 23, 2020, MoltenVK was updated to support for some of the new features of Vulkan 1.2, as of Vulkan SDK 1.2.121. === Version 1.1 === On October 1, 2020, MoltenVK 1.1.0 was released, adding full support for Vulkan 1.1, as of Vulkan SDK 1.2.154. On 9 December 2020, MoltenVK 1.1.1 was released, providing support for Vulkan on Apple silicon GPUs and support for the Mac Catalyst platform for porting iOS/iPadOS apps to macOS. === Version 1.2 === On October 18, 2022, MoltenVK 1.2.0 was released, adding full support for Vulkan 1.2 as of Vulkan SDK 1.3.231. In January 2023, MoltenVK 1.2.2 added support for Vulkan as of SDK 1.3.239, while this version of Vulkan SDK fixed some issues with the interconnectivity with Metal API, while version 1.2.3 supported some additional extensions. === Version 1.3 === On May 1, 2025, MoltenVK 1.3 was released with support for Vulkan 1.3. === Version 1.4 === On August 20, 2025, MoltenVK 1.4 was released with support for Vulkan 1.4.

    Read more →
  • Salience (neuroscience)

    Salience (neuroscience)

    Salience (also called saliency, from Latin saliō meaning "leap, spring") is the property by which some thing stands out. Salient events are an attentional mechanism by which organisms learn and survive; those organisms can focus their limited perceptual and cognitive resources on the pertinent (that is, salient) subset of the sensory data available to them. Saliency typically arises from contrasts between items and their neighborhood. They might be represented, for example, by a red dot surrounded by white dots, or by a flickering message indicator of an answering machine, or a loud noise in an otherwise quiet environment. Saliency detection is often studied in the context of the visual system, but similar mechanisms operate in other sensory systems. Just what is salient can be influenced by training: for example, for human subjects particular letters can become salient by training. There can be a sequence of necessary events, each of which has to be salient, in turn, in order for successful training in the sequence; the alternative is a failure, as in an illustrated sequence when tying a bowline; in the list of illustrations, even the first illustration is a salient: the rope in the list must cross over, and not under the bitter end of the rope (which can remain fixed, and not free to move); failure to notice that the first salient has not been satisfied means the knot will fail to hold, even when the remaining salient events have been satisfied. When attention deployment is driven by salient stimuli, it is considered to be bottom-up, memory-free, and reactive. Conversely, attention can also be guided by top-down, memory-dependent, or anticipatory mechanisms, such as when looking ahead of moving objects or sideways before crossing streets. Humans and other animals have difficulty paying attention to more than one item simultaneously, so they are faced with the challenge of continuously integrating and prioritizing different bottom-up and top-down influences. == Neuroanatomy == The brain component named the hippocampus helps with the assessment of salience and context by using past memories to filter new incoming stimuli, and placing those that are most important into long term memory. The entorhinal cortex is the pathway into and out of the hippocampus, and is an important part of the brain's memory network; research shows that it is a brain region that suffers damage early on in Alzheimer's disease, one of the effects of which is altered (diminished) salience. The pulvinar nuclei (in the thalamus) modulate physical/perceptual salience in attentional selection. One group of neurons (i.e., D1-type medium spiny neurons) within the nucleus accumbens shell (NAcc shell) assigns appetitive motivational salience ("want" and "desire", which includes a motivational component), aka incentive salience, to rewarding stimuli, while another group of neurons (i.e., D2-type medium spiny neurons) within the NAcc shell assigns aversive motivational salience to aversive stimuli. The primary visual cortex (V1) generates a bottom-up saliency map from visual inputs to guide reflexive attentional shifts or gaze shifts. According to V1 Saliency Hypothesis, the saliency of a location is higher when V1 neurons give higher responses to that location relative to V1 neurons' responses to other visual locations. For example, a unique red item among green items, or a unique vertical bar among horizontal bars, is salient since it evokes higher V1 responses and attracts attention or gaze. The V1 neural responses are sent to the superior colliculus to guide gaze shifts to the salient locations. A fingerprint of the saliency map in V1 is that attention or gaze can be captured by the location of an eye-of-origin singleton in visual inputs, e.g., a bar uniquely shown to the left eye in a background of many other bars shown to the right eye, even when observers cannot tell the difference between the singleton and the background bars. == In psychology == The term is widely used in the study of perception and cognition to refer to any aspect of a stimulus that, for any of many reasons, stands out from the rest. Salience may be the result of emotional, motivational or cognitive factors and is not necessarily associated with physical factors such as intensity, clarity or size. Although salience is thought to determine attentional selection, salience associated with physical factors does not necessarily influence selection of a stimulus. === Salience bias === Salience bias (also referred to as perceptual salience) is a cognitive bias that predisposes individuals to focus on or attend to items, information, or stimuli that are more prominent, visible, or emotionally striking. This is as opposed to stimuli that are unremarkable, or less salient, even though this difference is often irrelevant by objective standards. The American Psychological Association (APA) defines the salience hypothesis as a theory regarding perception where "motivationally significant" information is more readily perceived than information with little or less significant motivational importance. Perceptual salience (salience bias) is linked to the vividness effect, whereby a more pronounced response is produced by a more vivid perception of a stimulus than the mere knowledge of the stimulus. Salience bias assumes that more dynamic, conspicuous, or distinctive stimuli engage attention more than less prominent stimuli, disproportionately impacting decision making, it is a bias which favors more salient information. ==== Application ==== ===== Cognitive Psychology ===== Salience bias, like all other cognitive biases, is an applicable concept to various disciplines. For example, cognitive psychology investigates cognitive functions and processes, such as perception, attention, memory, problem solving, and decision making, all of which could be influenced by salience bias. Salience bias acts to combat cognitive overload by focusing attention on prominent stimuli, which affects how individuals perceive the world as other, less vivid stimuli that could add to or change this perception, are ignored. Human attention gravitates towards novel and relevant stimuli and unconsciously filters out less prominent information, demonstrating salience bias, which influences behavior as human behavior is affected by what is attended to. Behavioral economists Tversky and Kahneman also suggest that the retrieval of instances is influenced by their salience, such as how witnessing or experiencing an event first-hand has a greater impact than when it is less salient, like if it were read about, implying that memory is affected by salience. ===== Language ===== It is also relevant in language understanding and acquisition. Focusing on more salient phenomena allows people to detect language patterns and dialect variations more easily, making dialect categorization more efficient. ===== Social Behavior ===== Furthermore, social behaviors and interactions can also be influenced by perceptual salience. Changes in the perceptual salience of an individual heavily influences their social behavior and subjective experience of their social interactions, confirming a "social salience effect". Social salience relates to how individuals perceive and respond to other people. ===== Behavioral Science ===== The connection between salience bias and other heuristics, like availability and representativeness, links it to the fields of behavioral science and behavioral economics. Salience bias is closely related to the availability heuristic in behavioral economics, based on the influence of information vividness and visibility, such as recency or frequency, on judgements, for example:Accessibility and salience are closely related to availability, and they are important as well. If you have personally experienced a serious earthquake, you're more likely to believe that an earthquake is likely than if you read about it in a weekly magazine. Thus, vivid and easily imagined causes of death (for example, tornadoes) often receive inflated estimates of probability, and less-vivid causes (for example, asthma attacks) receive low estimates, even if they occur with a far greater frequency (here, by a factor of twenty). Timing counts too: more recent events have a greater impact on our behavior, and on our fears, than earlier ones.Humans have bounded rationality, which refers to their limited ability to be rational in decision making, due to a limited capacity to process information and cognitive ability. Heuristics, such as availability, are employed to reduce the complexity of cognitive and social tasks or judgements, in order to decrease the cognitive load that result from bounded rationality. Despite the effectiveness of heuristics in doing so, they are limited by systematic errors that occur, often the result of influencing biases, such as salience. This can lead to misdirected or misinformed judgements, based on an overemphasis or overweighting of

    Read more →
  • Open Syllabus Project

    Open Syllabus Project

    The Open Syllabus Project (OSP) is an online open-source platform that catalogs and analyzes millions of college syllabi. Founded by researchers from the American Assembly at Columbia University, the OSP has amassed the most extensive collection of searchable syllabi. Since its beta launch in 2016, the OSP has collected over 7 million course syllabi from over 80 countries, primarily by scraping publicly accessible university websites. The project is directed by Joe Karaganis. == History == The OSP was formed by a group of data scientists, sociologists, and digital-humanities researchers at the American Assembly, a public-policy institute based at Columbia University. The OSP was partly funded by the Sloan Foundation and the Arcadia Fund. Joe Karaganis, former vice-president of the American Assembly, serves as the project director of the OSP. The project builds on prior attempts to archive syllabi, such as H-Net, MIT OpenCourseWare, and historian Dan Cohen's defunct Syllabus Finder website (Cohen now sits on the OSP's advisory board). The OSP became a non-profit and independent of the American Assembly in November 2019. In January 2016, the OSP launched a beta version of their "Syllabus Explorer," which they had collected data for since 2013. The Syllabus Explorer allows users to browse and search texts from over one million college course syllabi. The OSP launched a more comprehensive version 2.0 of the Syllabus Explorer in July 2019. The newer version includes an interactive visualization that displays texts as dots on a knowledge map. As of 2022, the OSP has collected over 7 million course syllabi. The Syllabus Explorer represents the "largest collection of searchable syllabi ever amassed." == Methodology == The OSP has collected syllabi data from over 80 countries dating to 2000. The syllabi stem from over 4,000 worldwide institutions. Most of the OSP's data originates from the United States. Canada, Australia, and the U.K also have large datasets. The OSP primarily collects syllabi by scraping publicly accessible university websites. The OSP also allows syllabi submissions from faculty, students, and administrators. The OSP developers use machine learning and natural language processing to extract metadata from such syllabi. Since only metadata is collected, no individual syllabus or personal identifying information is found in the OSP database. The OSP classifies the syllabi into 62 subject fields – corresponding to the U.S. Department of Education's Classification of Instructional Programs (CIP). Additionally, the OSP assigns each text a "teaching score" from 0–100. This score represents the text's percentile rank among citations in the total citation count and is a numerical indicator of the relative frequency of which a particular work is taught. The OSP also has data on which texts are most likely to be assigned together. The developers behind the OSP admit that the database is incomplete and likely contains "a fair number of errors." Karaganis estimates that 80–100 million syllabi exist in the United States alone. The OSP is unable to access syllabi behind private course-management software like Blackboard. == Notable findings == === Anthropology === Using data from the OSP, anthropologist Laurence Ralph uncovered that black anthropologists are "woefully under-represented in (if not erased from) most anthropology syllabi." Black authors wrote less than 1 percent of the top 1,000 assigned works. === Economics === The database indicates Greg Mankiw is the most frequently cited author for college economics courses. === English literature === The OSP found that Mary Shelley's Frankenstein was the most widely taught novel in college courses. Additionally, the majority of novels published after 1945 taught in English classes were historical fiction. === Female writers === The most read female writer on college campuses is Kate L. Turabian for her A Manual for Writers of Research Papers, Theses, and Dissertations . Turabian is followed by Diana Hacker, Toni Morrison, Jane Austen, and Virginia Woolf. === Film === The most assigned film according to the OSP is the 1929 Soviet documentary film, Man with a Movie Camera. English filmmaker Alfred Hitchcock is the most assigned director in college courses. === History === Historians George Brown Tindall and David Emory Shi's America: A Narrative History is the number one assigned textbook for history, followed by Anne Moody's memoir, Coming of Age in Mississippi. === Philosophy === The most assigned texts in the field of philosophy include Aristotle's Nicomachean Ethics, John Stuart Mill's Utilitarianism, and Plato's Republic. Plato's Republic was also the second most assigned text in universities in the English-speaking world (only behind Strunk and White's Elements of Style). === Physics === David Halliday's et al. Fundamentals of Physics is the number one ranked physics textbook in the OSP's database. === Political science === Data from the OSP indicates that the dominant political science texts are written almost exclusively by white men and scholars based in the West. In the top 200 most-frequently assigned works, 15 are authored by at least one woman. === Public administration === American president Woodrow Wilson's article "The Study of Administration" was the most frequently assigned text in public affairs and administration syllabi. == Reception == According to William Germano et al., the OSP is a "fascinating resource but is also prone to misrepresenting or at least distracting us from the most important business of a syllabus: communicating with students." Historian William Caferro remarks that the OSP is a "tacit experience of sharing, but a useful one." English professor Bart Beaty writes that, "Despite the many reservations about the completeness of its data, the OSP provides a rare opportunity for scholars to move beyond the anecdotal in discussions of canon-formation in teaching." Media theorist Elizabeth Losh opines that "big data approaches", like the OSP, may "raise troubling questions for instructors about informed consent, pedagogical privacy, and quantified metrics."

    Read more →
  • Image fusion

    Image fusion

    The image fusion process is defined as gathering all the important information from multiple images, and their inclusion into fewer images, usually a single one. This single image is more informative and accurate than any single source image, and it consists of all the necessary information. The purpose of image fusion is not only to reduce the amount of data but also to construct images that are more appropriate and understandable for the human and machine perception. In computer vision, multisensor image fusion is the process of combining relevant information from two or more images into a single image. The resulting image will be more informative than any of the input images. In remote sensing applications, the increasing availability of space borne sensors gives a motivation for different image fusion algorithms. Several situations in image processing require high spatial and high spectral resolution in a single image. Most of the available equipment is not capable of providing such data convincingly. Image fusion techniques allow the integration of different information sources. The fused image can have complementary spatial and spectral resolution characteristics. However, the standard image fusion techniques can distort the spectral information of the multispectral data while merging. In satellite imaging, two types of images are available. The panchromatic image acquired by satellites is transmitted with the maximum resolution available and the multispectral data are transmitted with coarser resolution. This will usually be two or four times lower. At the receiver station, the panchromatic image is merged with the multispectral data to convey more information. Many methods exist to perform image fusion. The very basic one is the high-pass filtering technique. Later techniques are based on Discrete Wavelet Transform, uniform rational filter bank, and Laplacian pyramid. == Motivation == Multi sensor data fusion has become a discipline which demands more general formal solutions to a number of application cases. Several situations in image processing require both high spatial and high spectral information in a single image. This is important in remote sensing. However, the instruments are not capable of providing such information either by design or because of observational constraints. One possible solution for this is data fusion. == Methods == Image fusion methods can be broadly classified into two groups – spatial domain fusion and transform domain fusion. The fusion methods such as averaging, Brovey method, principal component analysis (PCA) and IHS based methods fall under spatial domain approaches. Another important spatial domain fusion method is the high-pass filtering based technique. Here the high frequency details are injected into upsampled version of MS images. The disadvantage of spatial domain approaches is that they produce spatial distortion in the fused image. Spectral distortion becomes a negative factor while we go for further processing, such as classification problem. Spatial distortion can be very well handled by frequency-domain approaches on image fusion. The multiresolution analysis has become a very useful tool for analysing remote sensing images. The discrete wavelet transform has become a very useful tool for fusion. Some other fusion methods are also there, such as Laplacian pyramid based, curvelet transform based etc. These methods show a better performance in spatial and spectral quality of the fused image compared to other spatial methods of fusion. The images used in image fusion should already be registered. Misregistration is a major source of error in image fusion. Some well-known image fusion methods are: High-pass filtering technique IHS transform based image fusion PCA-based image fusion Wavelet transform image fusion Pair-wise spatial frequency matching Comparative analysis of image fusion methods demonstrates that different metrics support different user needs, sensitive to different image fusion methods, and need to be tailored to the application. Categories of image fusion metrics are based on information theory features, structural similarity, or human perception. === Multi-focus image fusion === Multi-focus image fusion is used to collect useful and necessary information from input images with different focus depths in order to create an output image that ideally has all information from input images. In visual sensor network (VSN), sensors are cameras which record images and video sequences. In many applications of VSN, a camera can’t give a perfect illustration including all details of the scene. This is because of the limited depth of focus exists in the optical lens of cameras. Therefore, just the object located in the focal length of camera is focused and cleared and the other parts of image are blurred. VSN has an ability to capture images with different depth of focuses in the scene using several cameras. Due to the large amount of data generated by camera compared to other sensors such as pressure and temperature sensors and some limitation such as limited band width, energy consumption and processing time, it is essential to process the local input images to decrease the amount of transmission data. The aforementioned reasons emphasize the necessary of multi-focus images fusion. Multi-focus image fusion is a process which combines the input multi-focus images into a single image including all important information of the input images and it’s more accurate explanation of the scene than every single input image. == Applications == === In remote sensing === Image fusion in remote sensing has several application domains. An important domain is the multi-resolution image fusion (commonly referred to pan-sharpening). In satellite imagery we can have two types of images: Panchromatic images – An image collected in the broad visual wavelength range but rendered in black and white. Multispectral images – Images optically acquired in more than one spectral or wavelength interval. Each individual image is usually of the same physical area and scale but of a different spectral band. The SPOT PAN satellite provides high resolution (10m pixel) panchromatic data. While the LANDSAT TM satellite provides low resolution (30m pixel) multispectral images. Image fusion attempts to merge these images and produce a single high resolution multispectral image. The standard merging methods of image fusion are based on Red–Green–Blue (RGB) to Intensity–Hue–Saturation (IHS) transformation. The usual steps involved in satellite image fusion are as follows: Resize the low resolution multispectral images to the same size as the panchromatic image. Transform the R, G and B bands of the multispectral image into IHS components. Modify the panchromatic image with respect to the multispectral image. This is usually performed by histogram matching of the panchromatic image with Intensity component of the multispectral images as reference. Replace the intensity component by the panchromatic image and perform inverse transformation to obtain a high resolution multispectral image. Pan-sharpening can be done with Photoshop. Other applications of image fusion in remote sensing are available. === In medical imaging === Image fusion has become a common term used within medical diagnostics and treatment. The term is used when multiple images of a patient are registered and overlaid or merged to provide additional information. Fused images may be created from multiple images from the same imaging modality, or by combining information from multiple modalities, such as magnetic resonance image (MRI), computed tomography (CT), positron emission tomography (PET), and single-photon emission computed tomography (SPECT). In radiology and radiation oncology, these images serve different purposes. For example, CT images are used more often to ascertain differences in tissue density while MRI images are typically used to diagnose brain tumors. For accurate diagnosis, radiologists must integrate information from multiple image formats. Fused, anatomically consistent images are especially beneficial in diagnosing and treating cancer. With the advent of these new technologies, radiation oncologists can take full advantage of intensity modulated radiation therapy (IMRT). Being able to overlay diagnostic images into radiation planning images results in more accurate IMRT target tumor volumes.

    Read more →
  • Security and Privacy in Computer Systems

    Security and Privacy in Computer Systems

    Security and Privacy in Computer Systems is a paper by Willis Ware that was first presented to the public at the 1967 Spring Joint Computer Conference. == Significance == Ware's presentation was the first public conference session about information security and privacy in respect of computer systems, especially networked or remotely-accessed ones. The IEEE Annals of the History of Computing said that Ware's 1967 Spring Joint Computer Conference session, together with 1970's Ware report, marked the start of the field of computer security.

    Read more →
  • Computational semantics

    Computational semantics

    Computational semantics is a subfield of computational linguistics. Its goal is to elucidate the cognitive mechanisms supporting the generation and interpretation of meaning in humans. It usually involves the creation of computational models that simulate particular semantic phenomena, and the evaluation of those models against data from human participants. While computational semantics is a scientific field, it has many applications in real-world settings and substantially overlaps with Artificial Intelligence. Broadly speaking, the discipline can be subdivided into areas that mirror the internal organization of linguistics. For example, lexical semantics and frame semantics have active research communities within computational linguistics. Some popular methodologies are also strongly inspired by traditional linguistics. Most prominently, the area of distributional semantics, which underpins investigations into embeddings and the internals of Large Language Models, has roots in the work of Zellig Harris. Some traditional topics of interest in computational semantics are: construction of meaning representations, semantic underspecification, anaphora resolution, presupposition projection, and quantifier scope resolution. Methods employed usually draw from formal semantics or statistical semantics. Computational semantics has points of contact with the areas of lexical semantics (word-sense disambiguation and semantic role labeling), discourse semantics, knowledge representation and automated reasoning (in particular, automated theorem proving). Since 1999 there has been an ACL special interest group on computational semantics, SIGSEM.

    Read more →
  • Physics-informed neural networks

    Physics-informed neural networks

    In machine learning, physics-informed neural networks (PINNs), also referred to as theory-trained neural networks (TTNs), are a type of universal function approximator that can embed the knowledge of any physical laws that govern a given data-set in the learning process, and can be described by partial differential equations (PDEs). Low data availability for some biological and engineering problems limit the robustness of conventional machine learning models used for these applications. The prior knowledge of general physical laws acts in the training of neural networks (NNs) as a regularization agent that limits the space of admissible solutions, increasing the generalizability of the function approximation. This way, embedding this prior information into a neural network results in enhancing the information content of the available data, facilitating the learning algorithm to capture the right solution and to generalize well even with a low amount of training examples. Because they process continuous spatial and time coordinates and output continuous PDE solutions, they can be categorized as neural fields. == Function approximation == Most of the physical laws that govern the dynamics of a system can be described by partial differential equations. For example, the Navier–Stokes equations are a set of partial differential equations derived from the conservation laws (i.e., conservation of mass, momentum, and energy) that govern fluid mechanics. The solution of the Navier–Stokes equations with appropriate initial and boundary conditions allows the quantification of flow dynamics in a precisely defined geometry. However, these equations cannot be solved exactly and therefore numerical methods must be used (such as finite differences, finite elements and finite volumes). In this setting, these governing equations must be solved while accounting for prior assumptions, linearization, and adequate time and space discretization. Recently, solving the governing partial differential equations of physical phenomena using deep learning has emerged as a new field of scientific machine learning (SciML), leveraging the universal approximation theorem and high expressivity of neural networks. In general, deep neural networks could approximate any high-dimensional function given that sufficient training data are supplied. However, such networks do not consider the physical characteristics underlying the problem, and the level of approximation accuracy provided by them is still heavily dependent on careful specifications of the problem geometry as well as the initial and boundary conditions. Without this preliminary information, the solution is not unique and may lose physical correctness. To remedy this, Physics-Informed Neural Networks (PINNs) leverage governing physical equations in neural network training. Namely, PINNs are designed to be trained to satisfy the given training data as well as the imposed governing equations. In this fashion, a neural network can be guided with training datasets that do not necessarily need to be large or complete. An accurate solution of partial differential equations can potentially be found without knowing the boundary conditions. Therefore, with some knowledge about the physical characteristics of the problem and some form of training data (even sparse and incomplete), PINNs may be used for finding an optimal solution with high fidelity. PINNs can be applied to a wide range of problems in computational science, and are a pioneering technology leading to the development of new classes of numerical solvers for PDEs. PINNs can be thought of as a mesh-free alternative to traditional approaches (e.g., CFD for fluid dynamics), and new data-driven approaches for model inversion and system identification. Notably, a trained PINN network can be used to predict values on simulation grids of different resolutions without needing to be retrained. Additionally, the derivatives used in the partial differential equations can be computed using automatic differentiation (AD), which is assessed to be superior to numerical or symbolic differentiation. == Modeling and computation == A general nonlinear partial differential equation can be written as: u t + N [ u ; λ ] = 0 , x ∈ Ω , t ∈ [ 0 , T ] {\displaystyle u_{t}+{\mathcal {N}}[u;\lambda ]=0,\quad x\in \Omega ,\quad t\in [0,T]} where u ( t , x ) {\displaystyle u(t,x)} denotes the solution, N [ ⋅ ; λ ] {\displaystyle {\mathcal {N}}[\cdot ;\lambda ]} is a nonlinear operator parameterized by λ {\displaystyle \lambda } , and Ω {\displaystyle \Omega } is a subset of R D {\displaystyle \mathbb {R} ^{D}} . This general form of governing equations summarizes a wide range of problems in mathematical physics, such as conservative laws, diffusion process, advection-diffusion systems, and kinetic equations. Given noisy measurements of a generic dynamic system described by the equation above, PINNs can be designed to solve two classes of problems: data-driven solutions of partial differential equations data-driven discovery of partial differential equations === Data-driven solution of partial differential equations === The data-driven solution of PDE computes the hidden state u ( t , x ) {\displaystyle u(t,x)} of the system given boundary data and/or measurements z {\displaystyle z} , and fixed model parameters λ {\displaystyle \lambda } . We solve: u t + N [ u ] = 0 , x ∈ Ω , t ∈ [ 0 , T ] {\displaystyle u_{t}+{\mathcal {N}}[u]=0,\quad x\in \Omega ,\quad t\in [0,T]} . by defining the residual f ( t , x ) {\displaystyle f(t,x)} as: f := u t + N [ u ] {\displaystyle f:=u_{t}+{\mathcal {N}}[u]} , and approximating u ( t , x ) {\displaystyle u(t,x)} by a deep neural network. This network can be differentiated using automatic differentiation. The parameters of u ( t , x ) {\displaystyle u(t,x)} and f ( t , x ) {\displaystyle f(t,x)} can be then learned by minimizing the following loss function L tot {\displaystyle L_{\text{tot}}} : L tot = L u + L f {\displaystyle L_{\text{tot}}=L_{u}+L_{f}} where: L u = ‖ u − z ‖ Γ {\displaystyle L_{u}=\Vert u-z\Vert _{\Gamma }} is the error between the PINN u ( t , x ) {\displaystyle u(t,x)} and the set of boundary conditions and measured data on the set of points Γ {\displaystyle \Gamma } where the boundary conditions and data are defined. L f = ‖ f ‖ Γ {\displaystyle L_{f}=\Vert f\Vert _{\Gamma }} is the mean-squared error of the residual function. This second term encourages the PINN to learn the structural information expressed by the PDE during the training process. This approach has been used to yield computationally efficient physics-informed surrogate models with applications in the forecasting of physical processes, model predictive control, multi-physics and multi-scale modeling, and simulation. It has been shown to converge to the solution of the PDE. === Data-driven discovery of partial differential equations === Given noisy and incomplete measurements z {\displaystyle z} of the state of the system, the data-driven discovery of PDEs results in computing the unknown state u ( t , x ) {\displaystyle u(t,x)} and learning model parameters λ {\displaystyle \lambda } that best describe the observed data: u t + N [ u ; λ ] = 0 , x ∈ Ω , t ∈ [ 0 , T ] {\displaystyle u_{t}+{\mathcal {N}}[u;\lambda ]=0,\quad x\in \Omega ,\quad t\in [0,T]} By defining f ( t , x ) {\displaystyle f(t,x)} as: f := u t + N [ u ; λ ] = 0 {\displaystyle f:=u_{t}+{\mathcal {N}}[u;\lambda ]=0} , and approximating u ( t , x ) {\displaystyle u(t,x)} by a deep neural network, f ( t , x ) {\displaystyle f(t,x)} results in a PINN. This network can be derived using automatic differentiation. The parameters of u ( t , x ) {\displaystyle u(t,x)} and f ( t , x ) {\displaystyle f(t,x)} , together with the parameter λ {\displaystyle \lambda } of the differential operator can be then learned by minimizing the following loss function L tot {\displaystyle L_{\text{tot}}} : L tot = L u + L f {\displaystyle L_{\text{tot}}=L_{u}+L_{f}} where: L u = ‖ u − z ‖ Γ {\displaystyle L_{u}=\Vert u-z\Vert _{\Gamma }} , with u {\displaystyle u} and z {\displaystyle z} state solutions and measurements at sparse location Γ {\displaystyle \Gamma } , respectively. L f = ‖ f ‖ Γ {\displaystyle L_{f}=\Vert f\Vert _{\Gamma }} is the residual function. This second term requires the structured information represented by the partial differential equations to be satisfied in the training process. This strategy allows for discovering dynamic models described by nonlinear PDEs assembling computationally efficient and fully differentiable surrogate models that may find application in predictive forecasting, control, and data assimilation. == Extensions and applications == === For piece-wise function approximation === PINNs are unable to approximate PDEs that have strong non-linearity or sharp gradients (such as those that commonly occur in practical fluid flow problems). Piecewise approximation has been an old practic

    Read more →