AI For Economics Students

AI For Economics Students — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Pocketbook (application)

    Pocketbook (application)

    Pocketbook was a Sydney-based free budget planner and personal finance app launched in 2012. The app helped users setup and manage budgets, track spending and manage bills. As of 2016 Pocketbook claimed to support over 250,000 Australians, in January 2018 that number was 435,000. After being acquired by Zip Co Ltd in 2016, it was announced in 2022 that the app was to be shut down and all user accounts deleted. == History == Pocketbook was founded by Alvin Singh and Bosco Tan in 2012. It was conceived in 2011 in a Wolli Creek apartment as a tool for Alvin and Bosco to take control of their money. In 2013, Pocketbook raised $500,000 from technology fund Tank Stream Ventures, and a group of investors including TV personality David Koch, Geoff Levy, David Shein and Peter Cooper. In September 2016 Digital retail finance and payment industry player zipMoney (now trading as Zip Co Limited) acquired Pocketbook in a $7.5m deal == Features == The app synced with the bank account of users and would organize spending into different categories. Users could also be reminded of bill payments, analyse spending and set spending limits. They can also be alerted of fraudulent transactions and deductions. The app employs security measures like end to end encryption, CloudFlare protection, fraud detection, identity protection etc. Pocketbook was available via web and mobile version. == Awards == Personal Finance Innovator of the Year by Fintech Business Awards 2017 Innovator of the Year by OPTUS MyBusiness Awards 2017 Best Finance App of 2016 by Australian Fintech Best Personal Finance App: Pocketbook won the 2016 Finder Innovation Awards, presented at a gala dinner hosted by media personality and The New Inventors presenter James O'Loghlin. Best Mobile App of the Year Winner: StartCon hosted the first annual Australasian Startup Awards. Over 200 nominations in 14 categories and an overall winner were reviewed, and winners were determined by public voting, with over 63,000 votes in total. Best New Startup 2014 by StartupSmart. Finalist in the SWIFT Innotribe startup competition in Dubai in 2013.

    Read more →
  • Inverse depth parametrization

    Inverse depth parametrization

    In computer vision, the inverse depth parametrization is a parametrization used in methods for 3D reconstruction from multiple images such as simultaneous localization and mapping (SLAM). Given a point p {\displaystyle \mathbf {p} } in 3D space observed by a monocular pinhole camera from multiple views, the inverse depth parametrization of the point's position is a 6D vector that encodes the optical centre of the camera c 0 {\displaystyle \mathbf {c} _{0}} when in first observed the point, and the position of the point along the ray passing through p {\displaystyle \mathbf {p} } and c 0 {\displaystyle \mathbf {c} _{0}} . Inverse depth parametrization generally improves numerical stability and allows to represent points with zero parallax. Moreover, the error associated to the observation of the point's position can be modelled with a Gaussian distribution when expressed in inverse depth. This is an important property required to apply methods, such as Kalman filters, that assume normality of the measurement error distribution. The major drawback is the larger memory consumption, since the dimensionality of the point's representation is doubled. == Definition == Given 3D point p = ( x , y , z ) {\displaystyle \mathbf {p} =(x,y,z)} with world coordinates in a reference frame ( e 1 , e 2 , e 3 ) {\displaystyle (e_{1},e_{2},e_{3})} , observed from different views, the inverse depth parametrization y {\displaystyle \mathbf {y} } of p {\displaystyle \mathbf {p} } is given by: y = ( x 0 , y 0 , z 0 , θ , ϕ , ρ ) {\displaystyle \mathbf {y} =(x_{0},y_{0},z_{0},\theta ,\phi ,\rho )} where the first five components encode the camera pose in the first observation of the point, being c 0 = ( x 0 , y 0 , z 0 ) {\displaystyle \mathbf {c_{0}} =(x_{0},y_{0},z_{0})} the optical centre, ϕ {\displaystyle \phi } the azimuth, θ {\displaystyle \theta } the elevation angle, and ρ = 1 ‖ p − c 0 ‖ {\displaystyle \rho ={\frac {1}{\left\Vert \mathbf {p} -\mathbf {c} _{0}\right\Vert }}} the inverse depth of p {\displaystyle p} at the first observation.

    Read more →
  • Artificial intelligence in fraud detection

    Artificial intelligence in fraud detection

    Artificial intelligence is used by many different businesses and organizations. It is widely used in the financial sector, especially by accounting firms, to help detect fraud. In 2022, PricewaterhouseCoopers reported that fraud has impacted 46% of all businesses in the world. The shift from working in person to working from home has brought increased access to data. According to an FTC (Federal Trade Commission) study from 2022, customers reported fraud of approximately $5.8 billion in 2021, an increase of 70% from the year before. The majority of these scams were imposter scams and online shopping frauds. Furthermore, artificial intelligence plays a crucial role in developing advanced algorithms and machine learning models that enhance fraud detection systems, enabling businesses to stay ahead of evolving fraudulent tactics in an increasingly digital landscape. == Tools == === Expert systems === Expert systems were first designed in the 1970s as an expansion into artificial intelligence technologies. Their design is based on the premise of decreasing potential user error in decision-making and emulating mental reasoning used by experts in a particular field. They differentiate themselves from traditional linear reasoning models by separating identified points in data and processing them individually at the same time. Though, these systems do not rely purely on machine-learned intelligence. Information regarding rules, practices, and procedures in the form of "if-then" statements are implemented into the programming of the system. Users interact with the system by feeding information into the system either through direct entry or import of external data. An inference system compares the information provided by the user with corresponding rules that are believed to specifically apply to the situation. Using this information and the corresponding rules will be used to create a solution to the user's query. Expert systems will generally not operate properly when the common procedures for a specified situation are ambiguous due to the need for well-defined rules. Implementation of expert systems in accounting procedures is feasible in areas where professional judgment is required. Situations where expert systems are applicable include investigations into transactions that involve potential fraudulent entries, instances of going concern, and the evaluation of risk in the planning stages of an audit. === Continuous auditing === Continuous auditing is a set of processes that assess various aspects of information gathered in an audit to classify areas of risk and potential weaknesses in financial Internal controls at a more frequent rate than traditional methods. Instead of analyzing recorded transactions and journal entries periodically, continuous auditing focuses on interpreting the character of these actions more frequently. The frequency of these processes being undertaken as well as highlighting areas of importance is up to the discretion of their implementer, who commonly makes such decisions based on the level of risk in the accounts being evaluated and the goals of implementing the system. Performance of these processes can occur as frequently as being nearly instantaneous with an entry being posted. The processes involved with analyzing financial data in continuous auditing can include the creation of spreadsheets to allow for interactive information gathering, calculation of financial ratios for comparison with previously created models, and detection of errors in entered figures. A primary goal of this practice is to allow for quicker and easier detection of instances of faulty controls, errors, and instances of fraud. === Machine learning and deep learning === The ability of machine learning and deep learning to swiftly and effectively sort through vast volumes of data in the forms of various documents relevant to companies and documents being audited makes them applicable to the domains of audit and fraud detection. Examples of this include recognizing key language in contracts, identifying levels of risk of fraud in transactions, and assessing journal entries for misstatement. == Applications == === 'Big 4' Accounting Firms === Deloitte created an Al-enabled document-reviewing system in 2014. The system automates the method of reviewing and extracting relevant information from different business documents. Deloitte claims that this innovation has made a difference by reducing time spent going through lawful contract documents, invoices, money-related articulations, and board minutes by up to 50%. Working with IBM's Watson, Deloitte is developing cognitive-technology-enhanced commerce arrangements for its clients. LeasePoint is fueled by IBM TRIRIGA (this product evolved into IBM Maximo Real Estate and Facilities) and uses Deloitte's industrial information to create an end-to-end leasing portfolio. Automated Cognitive Resource Assessment employs IBM's Maximo innovation to progress the proficiency of asset inspection. Ernst and Young (EY) connected Al to the investigation of lease contracts. EY (Australia) has also received Al-enabled auditing technology. Collaborating with H20.ai, PwC developed an Al-enabled framework (GL.ai) capable of analyzing reports and preparing reports. PwC claims to have made a significant investment in normal dialect processing (NLP), an Al-enabled innovation to process unstructured information efficiently. KPMG built a portfolio of Al instruments, called KPMG Ignite, to upgrade trade decisions and forms. Working with Microsoft and IBM Watson, KPMG is creating instruments to coordinate Al, data analytics, Cognitive Technologies, and RPA. == Advantages == === Efficiency === The process of auditing an entity in an attempt to detect fraudulent activity requires the repeating of investigatory processes until an error or misstatement may be identified. Under traditional methods, these processes would be carried out by a human being. Proponents of artificial intelligence in fraud detection have stated that these traditional methods are inefficient and can be more quickly accomplished with the aid of an intelligent computing system. A survey of 400 chief executive officers created by KPMG in 2016 found that approximately 58% believed that artificial intelligence would play a key role in making audits more efficient in the future. === Data interpretation === Higher levels of fraud detection entail the use of professional judgement to interpret data. Supporters of artificial intelligence being used in financial audits have claimed that increased risks from instances of higher data interpretation can be minimized through such technologies. One necessary element of an audit of financial statements that requires professional judgement is the implementation of thresholds for materiality. Materiality entails the distinction between errors and transactions in financial statements that would impact decisions made by users of those financial statements. The threshold for materiality in an audit is set by the auditor based on various factors. Artificial intelligence has been used to interpret data and suggest materiality thresholds to be implemented through the use of expert systems. === Decreased costs === Those in favor of using artificial intelligence to complete investigations of fraud have stated that such technologies decrease the amount of time required to complete tasks that are repetitive. The claim further states that such efficiencies allow for lowered resource requirements, which can then be further spent on tasks that have not been fully automated. The audit firm Ernst & Young has posited these claims by declaring that their deep learning systems have been used to reduce time spent on administrative tasks by analyzing relevant audit documents. According to the firm, this has allowed their employees to focus more on judgement and analysis. == Disadvantages == === Job Displacement === The inescapable reception of computer based intelligence and robotization advancements might prompt critical work relocation across different enterprises. As artificial intelligence frameworks become more equipped for performing undertakings customarily completed by people, there is a worry that specific work jobs could become out of date, prompting joblessness and financial imbalance. === Initial investment requirement === Along with a knowledge of coding and building systems through computer programs, we are seeing the advantages of these systems, but since they are so new, they require a large investment to start building such a system. Any firm that is planning on implementing an AI system to detect fraud must hire a team of data scientists, along with upgrading their cloud system and data storage. The system must be consistently monitored and updated to be the most efficient form of itself, otherwise the likelihood of fraud being involved in those transactions increases. If one does not initially invest in such a syst

    Read more →
  • BERT (language model)

    BERT (language model)

    Bidirectional encoder representations from transformers (BERT) is a language model introduced in October 2018 by researchers at Google. It learns to represent text as a sequence of vectors using self-supervised learning. It uses the encoder-only transformer architecture. BERT dramatically improved the state of the art for large language models. As of 2020, BERT is a ubiquitous baseline in natural language processing (NLP) experiments. BERT is trained by masked token prediction and next sentence prediction. With this training, BERT learns contextual, latent representations of tokens in their context, similar to ELMo and GPT-2. It found applications for many natural language processing tasks, such as coreference resolution and polysemy resolution. It improved on ELMo and spawned the study of "BERTology", which attempts to interpret what is learned by BERT. BERT was originally implemented in the English language at two model sizes, BERTBASE (110 million parameters) and BERTLARGE (340 million parameters). Both were trained on the Toronto BookCorpus (800M words) and English Wikipedia (2,500M words). The weights were released on GitHub. On March 11, 2020, 24 smaller models were released, the smallest being BERTTINY with just 4 million parameters. == Architecture == BERT is an "encoder-only" transformer architecture. At a high level, BERT consists of 4 modules: Tokenizer: This module converts a piece of English text into a sequence of integers ("tokens"). Embedding: This module converts the sequence of tokens into an array of real-valued vectors representing the tokens. It represents the conversion of discrete token types into a lower-dimensional Euclidean space. Encoder: a stack of Transformer blocks with self-attention, but without causal masking. Task head: This module converts the final representation vectors into one-shot encoded tokens again by producing a predicted probability distribution over the token types. It can be viewed as a simple decoder, decoding the latent representation into token types, or as an "un-embedding layer". The task head is necessary for pre-training, but it is often unnecessary for so-called "downstream tasks," such as question answering or sentiment classification. Instead, one removes the task head and replaces it with a newly initialized module suited for the task, and finetune the new module. The latent vector representation of the model is directly fed into this new module, allowing for sample-efficient transfer learning. === Embedding === This section describes the embedding used by BERTBASE. The other one, BERTLARGE, is similar, just larger. The tokenizer of BERT is WordPiece, which is a sub-word strategy like byte-pair encoding. Its vocabulary size is 30,000, and any token not appearing in its vocabulary is replaced by [UNK] ("unknown"). The first layer is the embedding layer, which contains three components: token type embeddings, position embeddings, and segment type embeddings. Token type: The token type is a standard embedding layer, translating a one-hot vector into a dense vector based on its token type. Position: The position embeddings are based on a token's position in the sequence. BERT uses absolute position embeddings, where each position in a sequence is mapped to a real-valued vector. Each dimension of the vector consists of a sinusoidal function that takes the position in the sequence as input. Segment type: Using a vocabulary of just 0 or 1, this embedding layer produces a dense vector based on whether the token belongs to the first or second text segment in that input. In other words, type-1 tokens are all tokens that appear after the [SEP] special token. All prior tokens are type-0. The three embedding vectors are added together representing the initial token representation as a function of these three pieces of information. After embedding, the vector representation is normalized using a LayerNorm operation, outputting a 768-dimensional vector for each input token. After this, the representation vectors are passed forward through 12 Transformer encoder blocks, and are decoded back to 30,000-dimensional vocabulary space using a basic affine transformation layer. === Architectural family === The encoder stack of BERT has 2 free parameters: L {\displaystyle L} , the number of layers, and H {\displaystyle H} , the hidden size. There are always H / 64 {\displaystyle H/64} self-attention heads, and the feed-forward/filter size is always 4 H {\displaystyle 4H} . By varying these two numbers, one obtains an entire family of BERT models. For BERT: the feed-forward size and filter size are synonymous. Both of them denote the number of dimensions in the middle layer of the feed-forward network. the hidden size and embedding size are synonymous. Both of them denote the number of real numbers used to represent a token. The notation for encoder stack is written as L/H. For example, BERTBASE is written as 12L/768H, BERTLARGE as 24L/1024H, and BERTTINY as 2L/128H. == Training == === Pre-training === BERT was pre-trained simultaneously on two tasks: Masked language modeling (MLM): In this task, BERT ingests a sequence of words, where one word may be randomly changed ("masked"), and BERT tries to predict the original words that had been changed. For example, in the sentence "The cat sat on the [MASK]," BERT would need to predict "mat." This helps BERT learn bidirectional context, meaning it understands the relationships between words not just from left to right or right to left but from both directions at the same time. Next sentence prediction (NSP): In this task, BERT is trained to predict whether one sentence logically follows another. For example, given two sentences, "The cat sat on the mat" and "It was a sunny day", BERT has to decide if the second sentence is a valid continuation of the first one. This helps BERT understand relationships between sentences, which is important for tasks like question answering or document classification. ==== Masked language modeling ==== In masked language modeling, 15% of tokens would be randomly selected for masked-prediction task, and the training objective was to predict the masked token given its context. In more detail, the selected token is: replaced with a [MASK] token with probability 80%, replaced with a random word token with probability 10%, not replaced with probability 10%. The reason not all selected tokens are masked is to avoid the dataset shift problem. The dataset shift problem arises when the distribution of inputs seen during training differs significantly from the distribution encountered during inference. A trained BERT model might be applied to word representation (like Word2Vec), where it would be run over sentences not containing any [MASK] tokens. It is later found that more diverse training objectives are generally better. As an illustrative example, consider the sentence "my dog is cute". It would first be divided into tokens like "my1 dog2 is3 cute4". Then a random token in the sentence would be picked. Let it be the 4th one "cute4". Next, there would be three possibilities: with probability 80%, the chosen token is masked, resulting in "my1 dog2 is3 [MASK]4"; with probability 10%, the chosen token is replaced by a uniformly sampled random token, such as "happy", resulting in "my1 dog2 is3 happy4"; with probability 10%, nothing is done, resulting in "my1 dog2 is3 cute4". After processing the input text, the model's 4th output vector is passed to its decoder layer, which outputs a probability distribution over its 30,000-dimensional vocabulary space. ==== Next sentence prediction ==== Given two sentences, the model predicts if they appear sequentially in the training corpus, outputting either [IsNext] or [NotNext]. During training, the algorithm sometimes samples two sentences from a single continuous span in the training corpus, while at other times, it samples two sentences from two discontinuous spans. The first sentence starts with a special token, [CLS] (for "classify"). The two sentences are separated by another special token, [SEP] (for "separate"). After processing the two sentences, the final vector for the [CLS] token is passed to a linear layer for binary classification into [IsNext] and [NotNext]. For example: Given "[CLS] my dog is cute [SEP] he likes playing [SEP]", the model should predict [IsNext]. Given "[CLS] my dog is cute [SEP] how do magnets work [SEP]", the model should predict [NotNext]. === Fine-tuning === BERT is meant as a general pretrained model for various applications in natural language processing. That is, after pre-training, BERT can be fine-tuned with fewer resources on smaller datasets to optimize its performance on specific tasks such as natural language inference and text classification, and sequence-to-sequence-based language generation tasks such as question answering and conversational response generation. The original BERT paper published results demonstrating that a small amount of fine

    Read more →
  • BeyondCorp

    BeyondCorp

    BeyondCorp is an implementation of zero-trust computer security concepts creating a zero trust network. It is created by Google. == Background == It was created in response to the 2009 Operation Aurora. An open source implementation inspired by Google's research paper on an access proxy is known as "transcend". Google documented its Zero Trust journey from 2014 to 2018 through a series of articles in the journal ;login:. Google called their ZT network "BeyondCorp". Google implemented a Zero Trust architecture on a large scale, and relied on user and device credentials, regardless of location. Data was encrypted and protected from managed devices. Unmanaged devices, such as BYOD, were not given access to the BeyondCorp resources. == Design and technology == BeyondCorp utilized a zero trust security model, which is a relatively new security model that it assumes that all devices and users are potentially compromised. This is in contrast to traditional security models, which rely on firewalls and other perimeter defenses to protect sensitive data. === Trust === The corporate network grants no inherent trust, and all internal apps are accessed via the BeyondCorp system, regardless of whether the user is in a Google office or working remotely. BeyondCorp is related to Zero Trust architecture as it implements a true Zero Trust network, where all access is granted on identity, device, and authentication, based on robust underlying device and identity data sources. BeyondCorp works by using a number of security policies including authentication, authorization, and access control to ensure that only authorized users can access corporate resources. Authentication verifies the identity of the user, authorization determines whether the user has permission to access the requested resource, and access control policies restrict what the user can do with the resource. ==== Trust Inferrer ==== One of the main components in BeyondCorp's implementation is the Trust Inferrer. The Trust Inferrer is a security component (typically software) that looks at information about a user's device, like a computer or phone, to decide how much it can be trusted to access certain resources like important company documents. The Trust Inferrer checks things like the security of the device, whether it has the right software installed, and if it belongs to an authorized user. Based on all this information, the Trust Inferrer decides what the device can access and what it can't. === Security mechanisms === Unlike traditional VPNs, BeyondCorp's access policies are based on information about a device, its state, and its associated user. BeyondCorp considers both internal networks and external networks to be completely untrusted, and gates access to applications by dynamically asserting and enforcing levels, or “tiers,” of access. === Device Inventory Database === BeyondCorp utilized a Device Inventory Database and Device Identity that uniquely identifies a device through a digital certificate. Any changes to the device are recorded in the Device Inventory Database. The certificate is used to uniquely identify a device; however, additional information is required to grant access privileges to a resource. === Access Control Engine === Another important component of BeyondCorp's implementation is the Access Control Engine. Think of this as the brain of the Zero Trust architecture. The Access Control Engine is like a traffic cop standing at an intersection. Its job is to make sure that only authorized devices and users are allowed to access specific resources (like files or applications) on the network. It checks the access policy (the rules that say who can access what), the device's state (like whether it has the right software updates or security settings), and the resources being requested. Then it makes a decision on whether to grant or deny access based on all of this information. It helps ensure that only the right people and devices are allowed access to the network, which helps keep things secure. The Access Control Engine utilizes the output from the Trust Inferrer and other data that is fed into its system. == Usage == One of the first things Google did to implement a Zero Trust architecture was to capture and analyze network traffic. The purpose of analyzing the traffic was to build a baseline of what typical network traffic looked like. In doing so, BeyondCorp also discovered unusual, unexpected, and unauthorized traffic. This was very useful because it gave the BeyondCorp engineers critical information that assisted them in reengineering the system in a secure manner. Some of the benefits BeyondCorp realized by adopting a Zero Trust architecture include the ability to allow their employees to work securely from any location. It reduces the risk of data breaches since data and applications are protected and users and devices are constantly being verified. The Zero Trust architecture is scalable and can be adapted to the changing needs of the businesses and their users. Especially relevant in today's work-from-home era, BeyondCorp allows employees to access enterprise resources securely from any location, without the need for traditional VPNs.

    Read more →
  • Microapp

    Microapp

    A microapp is a super-specialized application designed to perform one task or use case with the only objective of doing it well. They follow the single responsibility principle, which states that "a class should have one and only one reason to change." Micro applications help developers create less complex applications while reducing costs by breaking down monolithic systems into groups of independent services acting as one system. A good example of Microapps would be https://docs.citrix.com/en-us/legacy-archive/downloads/microapps.pdfthat provide single purpose action from Salesforce and over 40 applications on its workspace. == Requirements and characteristics == Microapps usually are accessible on any device, display, or operating system without installation on the viewer's device. To qualify as a microapp, the entity must: be built and deployed as an independent software module bring together various media types into a single experience have advanced security and compliance features be functionally-extensible comply with granular data demands be agnostic single use case oriented Microapps differentiate from traditional web or mobile applications by how the end-user interacts with them. Consequently, they can be embedded in websites or viewed online to bypass app stores and are typically built to provide a focused experience to the user. == Usage == Microapps are typically used for commercial purposes to reduce development costs for projects not requiring the large scope of a traditional web or mobile application. In addition, they are often used to showcase in-depth information or enrich marketing material with interactivity. Lately, micro apps are being used to boost productivity by providing quick tools to people to reuse best practices. Users have been interacting with microapps for a while with suites like Microsoft 365 and Google Workspace, where each one of their end-user services could be considered as a microapp. All these microapps share a unique identity manager to provide a unified user experience. == Benefits == Replacing monolith systems with microapps provide several advantages like: Reduce complexity for developers and users. Smaller, more cohesive, and maintainable codebases Scalable organizations with decoupled, autonomous teams Allows for hyper-specialization Independent deployment Multi-stack == Cloud-native microapps == Technologies like Kubernetes, or OpenShift, allow companies to replace their monolith and legacy systems with modular software taking advantage of microapps on reducing costs and improve reliability and security. == Microapps vs. microservices == There is a widespread misunderstanding between these two concepts, which is the key difference. Microservices is an architectural style that is systems-centric, meaning it decouples the presentation and data layer using web services APIs. On the other side, micro apps behave more as a super-architecture style (that embraces microservices among other types), and it is user-centric, meaning they decouple the whole monolith system onto modules that are designed to interact with final users. Both architectural styles rely on modularity to provide high performance, scalability, and resilience. == Considerations == Developing Micro apps requires a different approach than traditional software, and user experience is crucial. The following considerations are essential for switching to microapps. To run multiple microapps is required a single identity management system. Microservices are well suited to make microapps more powerful Apps with different levels of maturity might create a non-unified user experience. Duplication of dependencies can create security issues and inefficiencies. Suitable for well-organized teams

    Read more →
  • Digital image processing

    Digital image processing

    Digital image processing is the use of a digital computer to process digital images through an algorithm. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing are mainly affected by three factors: first, the development of computers; second, the development of mathematics (especially the creation and improvement of discrete mathematics theory); and third, the demand for a wide range of applications in environment, agriculture, military, industry and medical science has increased. == History == Many of the techniques of digital image processing, or digital picture processing as it often was called, were developed in the 1960s, at Bell Laboratories, the Jet Propulsion Laboratory, Massachusetts Institute of Technology, University of Maryland, and a few other research facilities, with application to satellite imagery, wire-photo standards conversion, medical imaging, videophone, character recognition, and photograph enhancement. The purpose of early image processing was to improve the quality of the image. In image processing, the input is a low-quality image, and the output is an image with improved quality. Common image processing includes image enhancement, restoration, encoding, and compression. The first successful application was the American Jet Propulsion Laboratory (JPL). They used image processing techniques such as geometric correction, gradation transformation, noise removal, etc. on the thousands of lunar photos sent back by the Space Detector Ranger 7 in 1964, taking into account the position of the Sun and the environment of the Moon. The impact of the successful mapping of the Moon's surface map by the computer has been a success. Later, more complex image processing was performed on the nearly 100,000 photos sent back by the spacecraft, so that the topographic map, color map and panoramic mosaic of the Moon were obtained, which achieved extraordinary results and laid a solid foundation for human landing on the Moon. The cost of processing was fairly high, however, with the computing equipment of that era. That changed in the 1970s, when digital image processing proliferated as cheaper computers and dedicated hardware became available. This led to images being processed in real-time, for some dedicated problems such as television standards conversion. As general-purpose computers became faster, they started to take over the role of dedicated hardware for all but the most specialized and computer-intensive operations. With the fast computers and signal processors available in the 2000s, digital image processing has become the most common form of image processing, and is generally used because it is not only the most versatile method, but also the cheapest. === Image sensors === The basis for modern image sensors is metal–oxide–semiconductor (MOS) technology, invented at Bell Labs between 1955 and 1960, This led to the development of digital semiconductor image sensors, including the charge-coupled device (CCD) and later the CMOS sensor. The charge-coupled device was invented by Willard S. Boyle and George E. Smith at Bell Labs in 1969. While researching MOS technology, they realized that an electric charge was the analogy of the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next. The CCD is a semiconductor circuit that was later used in the first digital video cameras for television broadcasting. The NMOS active-pixel sensor (APS) was invented by Olympus in Japan during the mid-1980s. This was enabled by advances in MOS semiconductor device fabrication, with MOSFET scaling reaching smaller micron and then sub-micron levels. The NMOS APS was fabricated by Tsutomu Nakamura's team at Olympus in 1985. The CMOS active-pixel sensor (CMOS sensor) was later developed by Eric Fossum's team at the NASA Jet Propulsion Laboratory in 1993. By 2007, sales of CMOS sensors had surpassed CCD sensors. MOS image sensors are widely used in optical mouse technology. The first optical mouse, invented by Richard F. Lyon at Xerox in 1980, used a 5 μm NMOS integrated circuit sensor chip. Since the first commercial optical mouse, the IntelliMouse introduced in 1999, most optical mouse devices use CMOS sensors. === Image compression === An important development in digital image compression technology was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed in 1972. DCT compression became the basis for JPEG, which was introduced by the Joint Photographic Experts Group in 1992. JPEG compresses images down to much smaller file sizes, and has become the most widely used image file format on the Internet. Its highly efficient DCT compression algorithm was largely responsible for the wide proliferation of digital images and digital photos, with several billion JPEG images produced every day as of 2015. Medical imaging techniques produce very large amounts of data, especially from CT, MRI and PET modalities. As a result, storage and communications of electronic image data are prohibitive without the use of compression. JPEG 2000 image compression is used by the DICOM standard for storage and transmission of medical images. The cost and feasibility of accessing large image data sets over low or various bandwidths are further addressed by use of another DICOM standard, called JPIP, to enable efficient streaming of the JPEG 2000 compressed image data. === Digital signal processor (DSP) === Electronic signal processing was revolutionized by the wide adoption of MOS technology in the 1970s. MOS integrated circuit technology was the basis for the first single-chip microprocessors and microcontrollers in the early 1970s, and then the first single-chip digital signal processor (DSP) chips in the late 1970s. DSP chips have since been widely used in digital image processing. The discrete cosine transform (DCT) image compression algorithm has been widely implemented in DSP chips, with many companies developing DSP chips based on DCT technology. DCTs are widely used for encoding, decoding, video coding, audio coding, multiplexing, control signals, signaling, analog-to-digital conversion, formatting luminance and color differences, and color formats such as YUV444 and YUV411. DCTs are also used for encoding operations such as motion estimation, motion compensation, inter-frame prediction, quantization, perceptual weighting, entropy encoding, variable encoding, and motion vectors, and decoding operations such as the inverse operation between different color formats (YIQ, YUV and RGB) for display purposes. DCTs are also commonly used for high-definition television (HDTV) encoder/decoder chips. == Tasks == Digital image processing allows the use of much more complex algorithms, and hence, can offer both more sophisticated performance at simple tasks, and the implementation of methods which would be impossible by analogue means. In particular, digital image processing is a concrete application of, and a practical technology based on: Classification Feature extraction Multi-scale signal analysis Pattern recognition Projection Some techniques that are used in digital image processing include: Anisotropic diffusion Hidden Markov models Image editing Image restoration Independent component analysis Linear filtering Neural networks Partial differential equations Pixelation Point feature matching Principal components analysis Self-organizing maps Wavelets == Digital image transformations == === Filtering === Digital filters are used to blur and sharpen digital images. Filtering can be performed by: convolution with specifically designed kernels (filter array) in the spatial domain masking specific frequency regions in the frequency (Fourier) domain The following examples show both methods: ==== Image padding in Fourier domain filtering ==== Images are typically padded before being transformed to the Fourier space, the highpass filtered images below illustrate the consequences of different padding techniques: Notice that the highpass filter shows extra edges when zero padded compared to the repeated edge padding. ==== Filtering code examples ==== MATLAB example for spatial domain highpass filtering. === Affine transformations === Affine transformations enable basic image transformations including scale, rotate, translate, mirror and shear as is shown in the following examples: To apply the affine

    Read more →
  • Powerset (company)

    Powerset (company)

    Powerset was an American company based in San Francisco, California, that, in 2006, was developing a natural language search engine for the Internet. On July 1, 2008, Powerset was acquired by Microsoft for an estimated $100 million (~$143 million in 2024). Powerset was working on building a natural language search engine that could find targeted answers to user questions (as opposed to keyword based search). For example, when confronted with a question like "Which U.S. state has the highest income tax?", conventional search engines ignore the question phrasing and instead do a search on the keywords "state", "highest", "income", and "tax". Powerset on the other hand, attempts to use natural language processing to understand the nature of the question and return pages containing the answer. The company was in the process of "building a natural language search engine that reads and understands every sentence on the Web". The company has licensed natural language technology from PARC, the former Xerox Palo Alto Research Center. On May 11, 2008, the company unveiled a tool for searching a fixed subset of English Wikipedia using conversational phrases rather than keywords. Acquisition by Microsoft: One significant milestone in Powerset's history was its acquisition by Microsoft on July 1, 2008, for an estimated $100 million. This acquisition was part of Microsoft's broader strategy to enhance its search capabilities and compete more effectively with other search engine providers, particularly Google. Natural Language Search Engine: Powerset's primary focus was on developing a natural language search engine capable of understanding and interpreting user queries in a more human-like manner. Instead of simply matching keywords, Powerset aimed to comprehend the meaning behind the words, allowing for more accurate and contextually relevant search results. Technology and Partnerships: Powerset had licensed natural language technology from PARC, the Xerox Palo Alto Research Center. This technology likely played a crucial role in the development of Powerset's NLP capabilities. Wikipedia Search Tool: In May 2008, Powerset unveiled a search tool that allowed users to search a fixed subset of English Wikipedia using conversational phrases rather than traditional keywords. This demonstrated the potential of Powerset's NLP technology in providing more precise and relevant search results. == Powerlabs == In a form of beta testing, Powerset opened an online community called Powerlabs on September 17, 2007. Business Week said: "The company hopes the site will marshal thousands of people to help build and improve its search engine before it goes public next year." Said The New York Times: "[Powerset Labs] goes far beyond the 'alpha' or 'beta' testing involved in most software projects, when users put a new product through rigorous testing to find its flaws. Powerset doesn’t have a product yet, but rather a collection of promising natural language technologies, which are the fruit of years of research at Xerox PARC." Powerlabs' initial search results are taken from Wikipedia. == Notable people == Barney Pell (born March 18, 1968, in Hollywood, California) was co-founder and CEO of Powerset. Pell received his Bachelor of Science degree in symbolic systems from Stanford University in 1989, where he graduated Phi Beta Kappa and was a National Merit Scholar. Pell received a PhD in computer science from Cambridge University in 1993, where he was a Marshall Scholar. He has worked at NASA, as chief strategist and vice president of business development at StockMaster.com (acquired by Red Herring in March, 2000) and at Whizbang! Labs. Prior to joining Powerset, Pell was an Entrepreneur-in-Residence at Mayfield Fund, a venture capital firm in Silicon Valley. Pell is also a founder of Moon Express, Inc., a U.S. company awarded a $10M commercial lunar contract by NASA and a competitor in the Google Lunar X PRIZE. Steve Newcomb was the COO and co-founder of Powerset. Prior to joining Powerset, he was a co-founder of Loudfire, General Manager at Promptu, and was on the board of directors at Jaxtr. He left Powerset in October 2007 to form Virgance, a social startup incubator. Lorenzo Thione (born in Como, Italy) was the product architect and co-founder of Powerset. Prior to joining Powerset, he worked at FXPAL in natural language processing and related research fields. Thione earned his master's degree in software engineering from the University of Texas at Austin. Ronald Kaplan, former manager of research in Natural Language Theory and Technology at PARC, served as the company's CTO and CSO. Ryan Ferrier is a member of the founding team of Powerset. He managed personnel and internal operations. After 2008 he went on to co-found Serious Business, which made Facebook applications and was later bought by Zynga. Another Powerset alumnus, Alex Le, became CTO of Serious Business and went on to become an executive producer at Zynga when it bought the company. Siqi Chen founded a stealth startup in mobile computing after leaving Powerset. Tom Preston-Werner worked at Powerset and left after the acquisition to found GitHub. == Investors == Powerset attracted a wide range of investors, many of whom had considerable experience in the venture capital field. The company received $12.5 million (~$18.2 million in 2024) in Series A funding during November 2007, co-led by the venture capital firms Foundation Capital and The Founders Fund. Among the better-known investors: Esther Dyson, founding chairman of ICANN, founder of the newsletter Release 1.0 and editor at Cnet Peter Thiel, founder and former CEO of PayPal Luke Nosek, founder of PayPal Todd Parker. Managing Partner, Hidden River Ventures Reid Hoffman, executive vice president of PayPal and founder of LinkedIn First Round Capital, seed-stage venture firm

    Read more →
  • DataViva

    DataViva

    DataViva is an information visualization engine created by the Strategic Priorities Office of the government of Minas Gerais. DataViva makes official data about exports, industries, locations and occupations available for the entirety of Brazil through eight apps and more than 100 million possible visualizations. The first set of datum – also available at ALICEWEB – is provided by MDIC (Ministry of Development, Industry and Foreign Trade) / SECEX (Secretariat of Foreign Trade), an official institution of the Government of Brazil and shows foreign trade statistics for all exporting municipalities in the country. The other database, provided by Ministério do Trabalho e Emprego (MTE – Ministry of Labor and Employment), shows information about all the industries and occupations in Brazil (RAIS – Annual Social Information Report). The platform consists of eight core applications, each of which allows different ways of visualizing the data available. Some applications are descriptive, that is, showing data aggregated at various levels in a simple and comparative way, such as Treemapping. Others are prescriptive, using calculations that allow an analytic visualization of the data, based on theories such as the Product Space. All the applications are generated using D3plus, an open source JavaScript library built on top of D3.js by Alexander Simoes and Dave Landry. Inspired by The Observatory of Economic Complexity, DataViva is an open data, open-source, and free to use tool. It was developed in a partnership with Datawheel, co-founded by MIT Media Lab Professor César Hidalgo, and is maintained by the Government of Minas Gerais.

    Read more →
  • Jais (language model)

    Jais (language model)

    Jais is an open-source large language model launched in August 2023. Developed as a collaboration between Emirati AI company G42, the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), and US-based Cerebras Systems, Jais was designed to produce high-quality Arabic text and was also trained on English data. The model's creation was motivated by the underrepresentation of the Arabic language in the field of generative artificial intelligence. It aims to provide a more culturally and linguistically accurate model for the world's 400 million Arabic speakers. Its name is a reference to Jebel Jais, the highest mountain in the UAE. == Background and development == Jais was developed in response to the limited availability of advanced generative artificial intelligence models for the Arabic language, despite it being spoken by over 400 million people. Existing models were often trained on limited or low-quality Arabic web content, resulting in poor performance. The project represents a significant investment by the United Arab Emirates in the field of AI as part of its national strategy. The model was created through a partnership between Inception (now Core42), a subsidiary of the Abu Dhabi-based AI company G42; the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI); and Cerebras Systems, a US company specializing in AI hardware. The model is named after Jebel Jais, the highest peak in the UAE. == Training == The initial version of Jais released in August 2023 had 13 billion parameters. In November 2023, Core42 released Jais 30B, an improved version with 30 billion parameters. Both models were trained on a subset of the Cerebras Condor Galaxy 1 supercomputer. The training dataset consisted of a mix of Arabic, English, and computer code. According to Timothy Baldwin, a professor of natural language processing at MBZUAI, training the model on a diverse Arabic dataset allows it to switch between dialects. == Features == Jais is designed to generate text in both English and Arabic. The project has also released instruction-tuned "Chat" variants for both the 13B and 30B models, which are specifically optimized for conversational applications. Additional functionality for working with images, graphs, and tabular data is planned for future releases.

    Read more →
  • 1 Second Everyday

    1 Second Everyday

    1 Second Everyday (1SE) is an application developed by Cesar Kuriyama. The application allows the user to record one second of video every day and then chronologically edits (mashes) them together into a single film. It is compatible with iOS and Android. The idea of the application was developed by Kuriyama's 1 Second Everyday — Age 30 video. The application was launched in January 2013. 1 Second Everyday played a part in the plot of Chef and also became the inspiration for the 2014 short animated clip Feast. == Background == === Kuriyama's video === In February 2011, when Cesar Kuriyama turned 30, after saving money, he quit his job in an advertising firm and took a year off to travel. During this time, he started working on a project he called 1 Second Everyday. As part of the project, every day he recorded one second of video – something that was supposed to help him remember that day. He started the project because he was frustrated with his memory. He planned to stockpile the 365 one-second clips into one film to serve as a memento of his year. While working on the project Kuriyama realized that recording one second every day impacted the decisions he made in a positive way. After a year he made a 365-second clip out of his recordings. The video called 1 Second Everyday – Age 30, went viral. According to Kuriyama, he was initially inspired to take a year off from work by a TED talk given by Stefan Sagmeister called "The Power of Time Off." Kuriyama also delivered a TED talk about 1 Second Everyday in 2012 at TED 2012 in Long Beach California. === Kickstarter campaign === After completing his own video, Kuriyama decided to develop an application that would allow the users to record one second every day and compile their own videos. He developed a prototype of the application and then in 2012, he launched a Kickstarter campaign to raise funds for completing the application. The campaign became one of the most backed app campaigns in the history of Kickstarter. It was backed by 11,281 backers who pledged a total of $56,959 on an initial goal of $20,000. Following the completion of the Kickstarter campaign, he partnered with an application design studio in Brooklyn to develop the application. 1 Second Everyday was released two weeks after the completion of its Kickstarter campaign. == Application == The application was released for iOS on 10 January 2013. An Android-compatible version of the application was developed later. Using it, the user can record the videos in the application or they can select one second portions from their libraries. 1 Second Everyday dates every snippet. The user can also set alarms to remember to record their daily video. In order to compile a video, the user selects the seconds they want and the application creates a compilation video. The user can keep multiple timelines. It also allows users to post directly on social networks. The main interface in 1 Second Everyday is a calendar, which shows the user which days have snippets and which they can still fill in. In the beginning, 1 Second Everyday restricted the recording to one second. However, the developers later released Super Seconds, which allowed users to record an additional half a second video. In 2014, 1 Second Everyday Crowds was launched, which is an area in the application featuring compilations of second clips from different users. == In the media == The Kickstarter campaign of 1 Second Everyday was featured in Entrepreneur's 3 Innovative Tech Startups on Kickstarter Right Now in 2012. The application was featured in The New York Times, The Washington Post, Gawker and other media outlets. By the end of the launch day, it was in Top 10 Free Apps on App Store. It was also selected as the App of the Week on GeekWire in 2013. Several other one-second compilation videos were also posted on the Internet after Kuriyama's video gained media attention. Sam Cornwell, an English photographer documented his son Indigo's growth using a montage of one-second iPhone clips. He shot these clips every single day from the moment of birth right up to the baby's first birthday. According to Cornwell, he was inspired by Kuriyama's project. The video of Cornwell's son gained considerable media attention after it was posted on YouTube. Save the Children also made a video commercial based on a similar format that showed a British girl oblivious of the Syrian war end up being a refugee. 1SE was a finalist for the Fast Company Innovation by Design Award in 2015, but lost to Google Maps. In 2015, Google Android created a gallery, Leap Second 2015, with the help of Droga5 and Kuriyama. The gallery showcased how people around the world enjoyed the one extra second of their lives. Through the 1 Second Everyday app available at Google Play, people were able to submit their extra second, which were then vetted and added to the gallery. The viewers were able to view other celebratory seconds from around the world as well as searching for them using different hashtags.

    Read more →
  • Inception (deep learning architecture)

    Inception (deep learning architecture)

    Inception is a family of convolutional neural network (CNN) for computer vision, introduced by researchers at Google in 2014 as GoogLeNet (later renamed Inception v1). The series was historically important as an early CNN that separates the stem (data ingest), body (data processing), and head (prediction), an architectural design that persists in all modern CNN. == Version history == === Inception v1 === In 2014, a team at Google developed the GoogLeNet architecture, an instance of which won the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The name came from the LeNet of 1998, since both LeNet and GoogLeNet are CNNs. They also called it "Inception" after a "we need to go deeper" internet meme, a phrase from Inception (2010) the film. Because later, more versions were released, the original Inception architecture was renamed again as "Inception v1". The models and the code were released under Apache 2.0 license on GitHub. The Inception v1 architecture is a deep CNN composed of 22 layers. Most of these layers were "Inception modules". The original paper stated that Inception modules are a "logical culmination" of Network in Network and (Arora et al, 2014). Since Inception v1 is deep, it suffered from the vanishing gradient problem. The team solved it by using two "auxiliary classifiers", which are linear-softmax classifiers inserted at 1/3-deep and 2/3-deep within the network, and the loss function is a weighted sum of all three: L = 0.3 L a u x , 1 + 0.3 L a u x , 2 + L r e a l {\displaystyle L=0.3L_{aux,1}+0.3L_{aux,2}+L_{real}} These were removed after training was complete. This was later solved by the ResNet architecture. The architecture consists of three parts stacked on top of one another: The stem (data ingestion): The first few convolutional layers perform data preprocessing to downscale images to a smaller size. The body (data processing): The next many Inception modules perform the bulk of data processing. The head (prediction): The final fully-connected layer and softmax produces a probability distribution for image classification. This structure is used in most modern CNN architectures. === Inception v2 === Inception v2 was released in 2015, in a paper that is more famous for proposing batch normalization. It had 13.6 million parameters. It improves on Inception v1 by adding batch normalization, and removing dropout and local response normalization which they found became unnecessary when batch normalization is used. === Inception v3 === Inception v3 was released in 2016. It improves on Inception v2 by using factorized convolutions. As an example, a single 5×5 convolution can be factored into 3×3 stacked on top of another 3×3. Both has a receptive field of size 5×5. The 5×5 convolution kernel has 25 parameters, compared to just 18 in the factorized version. Thus, the 5×5 convolution is strictly more powerful than the factorized version. However, this power is not necessarily needed. Empirically, the research team found that factorized convolutions help. It also uses a form of dimension-reduction by concatenating the output from a convolutional layer and a pooling layer. As an example, a tensor of size 35 × 35 × 320 {\displaystyle 35\times 35\times 320} can be downscaled by a convolution with stride 2 to 17 × 17 × 320 {\displaystyle 17\times 17\times 320} , and by maxpooling with pool size 2 × 2 {\displaystyle 2\times 2} to 17 × 17 × 320 {\displaystyle 17\times 17\times 320} . These are then concatenated to 17 × 17 × 640 {\displaystyle 17\times 17\times 640} . Other than this, it also removed the lowest auxiliary classifier during training. They found that the auxiliary head worked as a form of regularization. They also proposed label-smoothing regularization in classification. For an image with label c {\displaystyle c} , instead of making the model to predict the probability distribution δ c = ( 0 , 0 , … , 0 , 1 ⏟ c -th entry , 0 , … , 0 ) {\displaystyle \delta _{c}=(0,0,\dots ,0,\underbrace {1} _{c{\text{-th entry}}},0,\dots ,0)} , they made the model predict the smoothed distribution ( 1 − ϵ ) δ c + ϵ / K {\displaystyle (1-\epsilon )\delta _{c}+\epsilon /K} where K {\displaystyle K} is the total number of classes. === Inception v4 === In 2017, the team released Inception v4, Inception ResNet v1, and Inception ResNet v2. Inception v4 is an incremental update with even more factorized convolutions, and other complications that were empirically found to improve benchmarks. Inception ResNet v1 and v2 are both modifications of Inception v4, where residual connections are added to each Inception module, inspired by the ResNet architecture. === Xception === Xception ("Extreme Inception") was published in 2017. It is a linear stack of depthwise separable convolution layers with residual connections. The design was proposed on the hypothesis that in a CNN, the cross-channels correlations and spatial correlations in the feature maps can be entirely decoupled. Training each network took 3 days on 60 K80 GPUs, or approximately 0.5 petaFLOP-days.

    Read more →
  • Control-flow integrity

    Control-flow integrity

    Control-flow integrity (CFI) is a general term for computer security techniques that prevent a wide variety of malware attacks from redirecting the flow of execution (the control flow) of a program. == Background == A computer program commonly changes its control flow to make decisions and use different parts of the code. Such transfers may be direct, in that the target address is written in the code itself, or indirect, in that the target address itself is a variable in memory or a CPU register. In a typical function call, the program performs a direct call, but returns to the caller function using the stack – an indirect backward-edge transfer. When a function pointer is called, such as from a virtual table, we say there is an indirect forward-edge transfer. Attackers seek to inject code into a program to make use of its privileges or to extract data from its memory space. Before executable code was commonly made read-only, an attacker could arbitrarily change the code as it is run, targeting direct transfers or even do with no transfers at all. After W^X became widespread, an attacker wants to instead redirect execution to a separate, unprotected area containing the code to be run, making use of indirect transfers: one could overwrite the virtual table for a forward-edge attack or change the call stack for a backward-edge attack (return-oriented programming). CFI is designed to protect indirect transfers from going to unintended locations. == Techniques == Associated techniques include code-pointer separation (CPS), code-pointer integrity (CPI), stack canaries, shadow stacks (SS), and vtable pointer verification. These protections can be classified into either coarse-grained or fine-grained based on the number of targets restricted. A coarse-grained forward-edge CFI implementation, could, for example, restrict the set of indirect call targets to any function that may be indirectly called in the program, while a fine-grained one would restrict each indirect call site to functions that have the same type as the function to be called. Similarly, for a backward edge scheme protecting returns, a coarse-grained implementation would only allow the procedure to return to a function of the same type (of which there could be many, especially for common prototypes), while a fine-grained one would enforce precise return matching (so it can return only to the function that called it). == Implementations == Related implementations are available in Clang (LLVM front-end),, GNU Compiler Collection, Microsoft's Control Flow Guard and Return Flow Guard, Google's Indirect Function-Call Checks and Reuse Attack Protector (RAP). === LLVM/Clang === The LLVM compiler's C/C++ front-end Clang provides a number of "CFI" schemes that works on the forward edge by checking for errors in virtual tables and type casts. Not all of the schemes are supported on all platforms and most of them, the exception being two "kcfi" schemes intended for low-level kernel software, depends on link-time optimization (LTO) to know what functions are supposed to be called in normal cases. Also provided is a separate "shadow call stack" (SCS) instrumentation pass that defends on the backward edge by checking for call stack modifications, available only for the aarch64 and RISC-V ISAs. And due to use of a shared processor register SCS is only enforceable on certain ABIs or if in other ways it is ensured that any other software using the register set (thread/processor) does not interfere with this use. Google has shipped Android with the Linux kernel compiled by Clang with link-time optimization (LTO) and CFI enabled since 2018. Even though SCS is available for the Linux kernel as an option, and support is also available for Android's system components it is recommended only to enable it for components for which it can be ensured that no third party code is loaded. === GCC === The GNU Compiler Collection implemented a "shadow call stack" compatible with Clang for aarch64 in v12 released in 2022. This feature is primarily intended for building the Linux kernel as support is missing from GCC user space libraries. === Intel Control-flow Enforcement Technology === Intel Control-flow Enforcement Technology (CET) detects compromises to control flow integrity with a shadow stack (SS) and indirect branch tracking (IBT). The kernel must map a region of memory for the shadow stack not writable to user space programs except by special instructions. The shadow stack stores a copy of the return address of each CALL. On a RET, the processor checks if the return address stored in the normal stack and shadow stack are equal. If the addresses are not equal, the processor generates an INT #21 (Control Flow Protection Fault). Indirect branch tracking detects indirect JMP or CALL instructions to unauthorized targets. It is implemented by adding a new internal state machine in the processor. The behavior of indirect JMP and CALL instructions is changed so that they switch the state machine from IDLE to WAIT_FOR_ENDBRANCH. In the WAIT_FOR_ENDBRANCH state, the next instruction to be executed is required to be the new ENDBRANCH instruction (ENDBR32 in 32-bit mode or ENDBR64 in 64-bit mode), which changes the internal state machine from WAIT_FOR_ENDBRANCH back to IDLE. Thus every authorized target of an indirect JMP or CALL must begin with ENDBRANCH. If the processor is in a WAIT_FOR_ENDBRANCH state (meaning, the previous instruction was an indirect JMP or CALL), and the next instruction is not an ENDBRANCH instruction, the processor generates an INT #21 (Control Flow Protection Fault). On processors not supporting CET indirect branch tracking, ENDBRANCH instructions are interpreted as NOPs and have no effect. === Microsoft Control Flow Guard === Control Flow Guard (CFG) was first released for Windows 8.1 Update 3 (KB3000850) in November 2014. Developers can add CFG to their programs by adding the /guard:cf linker flag before program linking in Visual Studio 2015 or newer. As of Windows 10 Creators Update (Windows 10 version 1703), the Windows kernel is compiled with CFG. The Windows kernel uses Hyper-V to prevent malicious kernel code from overwriting the CFG bitmap. CFG operates by creating a per-process bitmap, where a set bit indicates that the address is a valid destination. Before performing each indirect function call, the application checks if the destination address is in the bitmap. If the destination address is not in the bitmap, the program terminates. This makes it more difficult for an attacker to exploit a use-after-free by replacing an object's contents and then using an indirect function call to execute a payload. ==== Implementation details ==== For all protected indirect function calls, the _guard_check_icall function is called, which performs the following steps: Convert the target address to an offset and bit number in the bitmap. The highest 3 bytes are the byte offset in the bitmap The bit offset is a 5-bit value. The first four bits are the 4th through 8th low-order bits of the address. The 5th bit of the bit offset is set to 0 if the destination address is aligned with 0x10 (last four bits are 0), and 1 if it is not. Examine the target's address value in the bitmap If the target address is in the bitmap, return without an error. If the target address is not in the bitmap, terminate the program. ==== Bypass techniques ==== There are several generic techniques for bypassing CFG: Set the destination to code located in a non-CFG module loaded in the same process. Find an indirect call that was not protected by CFG (either CALL or JMP). Use a function call with a different number of arguments than the call is designed for, causing a stack misalignment, and code execution after the function returns (patched in Windows 10). Use a function call with the same number of arguments, but one of pointers passed is treated as an object and writes to a pointer-based offset, allowing overwriting a return address. Overwrite the function call used by the CFG to validate the address (patched in March 2015) Set the CFG bitmap to all 1's, allowing all indirect function calls Use a controlled-write primitive to overwrite an address on the stack (since the stack is not protected by CFG) === Microsoft eXtended Flow Guard === eXtended Flow Guard (XFG) has not been officially released yet, but is available in the Windows Insider preview and was publicly presented at Bluehat Shanghai in 2019. XFG extends CFG by validating function call signatures to ensure that indirect function calls are only to the subset of functions with the same signature. Function call signature validation is implemented by adding instructions to store the target function's hash in register r10 immediately prior to the indirect call and storing the calculated function hash in the memory immediately preceding the target address's code. When the indirect call is made, the XFG validation function compares the value in r10 to the target

    Read more →
  • FloodAlerts

    FloodAlerts

    FloodAlerts is a software application, developed by software specialists Shoothill, which takes real-time flooding information, and displays the data on an interactive Bing map, updating and warning its users when they, their premises or the routes they need to travel could be at risk of flooding. == History == FloodAlerts was launched in 2012, originally as the world's first Facebook flood warning app. == Operation == FloodAlerts is made available free of charge to individuals. Users are able to set up their own monitored locations and receive alerts via the application or their Facebook wall if the locations they are monitoring are at imminent risk of flooding. Hosted in the Cloud, using the Microsoft Windows Azure platform, the FloodAlerts application processes the data received from the Environment Agency, automatically creates the required map tiles, pins and alerts and displays them on an interactive Bing map, updating the content every 15 minutes. Users are able to see the latest information on the map without having to refresh their browser. FloodAlerts can also be provided as a customised risk management solution to businesses that require infrastructure or asset safety monitoring in areas where water levels are rising or receding. == Awards and recognition == FloodAlerts has received The Guardian and Virgin Media Business's 2012 Innovation Nation Awards and was shortlisted as a finalist for a further two national awards: the UK IT Industry Awards for Innovation and Entrepreneurship and The Institution of Engineering and Technology Innovation Awards for Information Technology. == In the press == The FloodAlerts application was reviewed on the BBC website. It was also reviewed on BBC Click.

    Read more →
  • Umoove

    Umoove

    Umoove is a high tech startup company that has developed and patented a software-only face and eye tracking technology. The idea was first conceived as an attempt to aid people with disabilities but has since evolved. The only compatibility qualification for tablet computers and smartphones to run Umoove software is a front-facing camera. Umoove headquarters are in Israel on Jerusalem’s Har Hotzvim. Umoove has 15 employees and received two million dollars in financing in 2012. The company's original founders invested around $800,000 to start the business in 2010. In 2013 Umoove was named one of the top three most promising Israeli start ups by Newsgeeks magazine. The company also participated in the 2013 LeWeb conference in Paris, France, where innovative technology startups are showcased. == Technology == The technology uses information extracted from previous frames, such as the angle of the user's head to predict where to look for facial targets in the next frame. This anticipation minimizes the amount of computation needed to scan each image. Umoove accounts for variances in environment, lighting conditions and user hand shake/movement. The technology is designed to provide a consistent experience, whether you're in a brightly lit area or a darkened basement, and to work fluidly between them by adapting its processing when it detects color and brightness shifts. It uses an active stabilization technique to filter out natural body movements from an unstable camera in order to minimize false-positive motion detection. Running the Umoove software on a Samsung Galaxy S3 is said to take up only 2% CPU. Umoove works exclusively with software and there is no hardware add-on necessary. It can be run on any smartphone or tablet computer that has a front-facing camera. Umoove claims that even a low-quality camera on an old device will run their software flawlessly. == Umoove Experience == In January 2014 Umoove released its first game onto the app store. The Umoove Experience game lets users control where they are 'flying' in the game through simple gestures and motions with their head. The avatar will basically go toward wherever the user looks. The game was created to showcase the technology for game developers but that did not stop some from criticizing its simplicity. Umoove also announced that they raised another one million dollars and that they are opening offices in Silicon Valley, California. In February 2014, Umoove announced that their face-tracking software development kit is available for Android developers as well as iOS. == Reviews == The Umoove Experience garnered mostly positive reviews from bloggers and mainstream media with some predicting that it could be the future of mobile gaming. Mashable wrote that Umoove's technology could be the emergence of gesture recognition technology in the mobile space, similar to Kinect with console gaming and what Leap Motion has done with desktop computers. Some, however, remain skeptical. CNET, for example, did not give the game a positive review and called the eye tracking technology 'freaky but cool'. They also noted that pioneering technologies have been known to fall short of expectations, citing Apple Inc’s Siri as an example. The technology blog GigaOM said that the Umoove Experience is ’awesome’ and technology evangelist Robert Scoble has called Umoove "brilliant". == uHealth == In January 2015, Umoove released uHealth, a mobile application that uses eye tracking game-like exercise to challenge the user's ability to be attentive, continuously focus, follow commands and avoid distractions. The app is designed in the form of two games, one to improve attention and another that hones focus. uHealth is a training tool, not a diagnostic. Umoove has stated that they want to use their technology for diagnosing neurological disorders but this will depend on clinical tests and FDA approval. The company cites the direct relationship between eye movements and brain activity as well as various vision-based therapies have been backed by many scientific studies conducted over the past decades. uHealth is the first time this type of therapy is delivered right to the end user through a simple download. == Collaboration rumors == In March 2013 there were rumors on the internet that Umoove would be the functioning software embedded into the Samsung Galaxy S4, which was due to launch that month. This rumor was perpetrated by, among others, New York Times, Techcrunch and Yahoo. Once Samsung launched without the Umoove technology rumors about a potential collaboration with Apple Inc hit the web. It has been said that due to the fact that Apple Inc is losing market share and stock value to Samsung they will be more aggressive and eye tracking is a logical place to make that move.

    Read more →