AI Content On Linkedin

AI Content On Linkedin — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • SEMAT

    SEMAT

    SEMAT (Software Engineering Method and Theory) is an initiative to reshape software engineering such that software engineering qualifies as a rigorous discipline. The initiative was launched in December 2009 by Ivar Jacobson, Bertrand Meyer, and Richard Soley with a call for action statement and a vision statement. The initiative was envisioned as a multi-year effort for bridging the gap between the developer community and the academic community and for creating a community giving value to the whole software community. The work is now structured in four different but strongly related areas: Practice, Education, Theory, and Community. The Practice area primarily addresses practices. The Education area is concerned with all issues related to training for both the developers and the academics including students. The Theory area is primarily addressing the search for a General Theory in Software Engineering. Finally, the Community area works with setting up legal entities, creating websites and community growth. It was expected that the Practice area, the Education area and the Theory area would at some point in time integrate in a way of value to all of them: the Practice area would be a "customer" of the Theory area, and direct the research to useful results for the developer community. The Theory area would give a solid and practical platform for the Practice area. And, the Education area would communicate the results in proper ways. == Practice area == The first step was here to develop a common ground or a kernel including the essence of software engineering – things we always have, always do, always produce when developing software. The second step was envisioned to add value on top of this kernel in the form of a library of practices to be composed to become specific methods, specific for all kinds of reasons such as the preferences of the team using it, kind of software being built, etc. The first step is as of this writing just about to be concluded. The results are a kernel including universal elements for software development – called the Essence Kernel, and a language – called the Essence Language - to describe these elements (and elements built on top of the kernel (practices, methods, and more). Essence, including both the kernel and language, has been published as an OMG standard in beta status in July 2013 and is expected to become a formally adopted standard in early 2014. The second step has just started, and the Practice area will be divided into a number of separate but interconnected tracks: the practice (library track), the tool track are so far identified and work has started or is about to get started. The practice track is currently working on a Users Guide. == Education area == The area focuses on leveraging the work of SEMAT in software engineering education, both within academia and industry. It promotes global education based on a common ground called Essence. The area's target groups are instructors such as university professors and industrial coaches as well as their students and learning practitioners. The goal of the area is to create educational courses and course materials that are internationally viable, identify pedagogical approaches that are appropriate and effective for specific target groups and disseminate experience and lessons learned. The area includes members from a number of universities and institutes worldwide. Most members have already been involved in leveraging aspects of SEMAT in the context of their software engineering courses. They are gathering their resources and starting a common venture towards defining a new generation of SEMAT-powered software engineering curricula. As of 2018, some studies of utilizing Essence in educational settings exist. One example of the use of Essence in university education was a software engineering course carried out in Norwegian University of Science and Technology. A study was conducted by introducing Essence into a project-based software engineering course, with the aim of understanding what difficulties the students faced in using Essence, and whether they considered it to have been useful. The results indicated that Essence could also be useful for novice software engineers by (1) encouraging them to look up and study new practices and methods in order to create their own, (2) encouraging them to adjust their way-of-working reflectively and in a situation-specific manner, (3) helping them structure their way of working. The findings of another study introducing students to Essence through a digital game supported these findings: the students felt that Essence will be useful to them in future, real-world projects, and that they wish to utilize it in them. == Theory area == An important part of SEMAT is that a general theory of software engineering is planned to emerge with significant benefits. A series of workshops held under the title SEMAT Workshop on a General Theory of Software Engineering (GTSE) are a key component in awareness building around general theories. In addition to community awareness building, SEMAT also aims to contribute with a specific general theory of software engineering. This theory should be solidly based on the SEMAT Essence language and kernel, and should support software engineering practitioners' goal-oriented decision making. As argued elsewhere, such support is predicated on the predictive capabilities of the theory. Thus, the SEMAT Essence should be augmented to allow the prediction of critical software engineering phenomena. The GTSE workshop series assists in the development of the SEMAT general software engineering theory by engaging a larger community in the search for, development of, and evaluation of promising theories, which may be used as a base for the SEMAT theory. == Organizational structure == === Main organization === SEMAT is chaired by Sumeet S. Malhotra of Tata Consultancy Services. The CEO of the organization is Ste Nadin of Fujitsu. The Executive Management Committee of SEMAT are Ivar Jacobson, Ste Nadin, Sumeet S. Malhotra, Paul E. McMahon, Michael Goedicke and Cecile Peraire. === Japan Chapter === Japan Chapter was established in April 2013, and it has more than 250 members as of November 2013. Member activities include carrying out seminars about SEMAT, considering utilization of SEMAT Essence for integrating different requirements engineering techniques and body of knowledges (BoKs), and translating articles into Japanese. === Korea Chapter === The chapter was inaugurated with about 50 members in October 2013. Member activities include: 2e Consulting started rewriting their IT service engagement methods using the Essence kernel, and uEngine Solutions started developing a tool to orchestrate Essence-kernel based practices into a project method. Korean government supported KAIST to conduct research in Essence. === Latin American Chapter === Semat Latin American Chapter was created in August 2011 in Medellin (Colombia) by Ivar Jacobson during the Latin American Software Engineering Symposium. This Chapter has 9 Executive Committee members from Colombia, Venezuela, Peru, Brazil, Argentina, Chile, and Mexico, chaired by Dr. Carlos Zapata from Colombia. More than 80 people signed the initial declaration of the Chapter and nowadays the Chapter members are in charge of disseminating the Semat ideas in all Latin America. Chapter members have participated in various Latin American conferences, including the Latin American Conference on Informatics (CLEI), the Ibero American Software Engineering and Knowledge Engineering Journeys (JIISIC), the Colombian Computing Conference (CCC), and the Chilean Computing Meeting (ECC). The Chapter contributed in the submission sent in response to the OMG call for proposals and currently studies didactic strategies for teaching the Semat kernel by games, theoretical studies about some kernel elements, and practical representations of several software development and quality methods by using the Semat kernel. Some of the members also translated the Essence book and some other Semat materials and papers into Spanish. === Russia Chapter === Russian Chapter has about 20 members. A few universities have incorporated SEMAT in their training courses , including Moscow State University, Moscow Institute of Physics and Technology, Higher School of Economics, Moscow State University of Economics, Statistics, and Informatics. The chapter and some commercial companies are carrying out seminars about SEMAT. INCOSE Russian Chapter is working on an extension of SEMAT to systems engineering. EC-leasing is working on an extension of the Kernel for Software Life Cycle. Russian Chapter attended in two conferences: Actual Problems of System and Software Engineering and SECR with SEMAT section and articles. Translation of the Essence book into Russian is in progress. == Practical Applications of SEMAT == Ideas developed by the SEMAT community have been applied by both industry and ac

    Read more →
  • Truth discovery

    Truth discovery

    Truth discovery (also known as truth finding) is the process of choosing the actual true value for a data item when different data sources provide conflicting information on it. Several algorithms have been proposed to tackle this problem, ranging from simple methods like majority voting to more complex ones able to estimate the trustworthiness of data sources. Truth discovery problems can be divided into two sub-classes: single-truth and multi-truth. In the first case only one true value is allowed for a data item (e.g birthday of a person, capital city of a country). While in the second case multiple true values are allowed (e.g. cast of a movie, authors of a book). Typically, truth discovery is the last step of a data integration pipeline, when the schemas of different data sources have been unified and the records referring to the same data item have been detected. == General principles == The abundance of data available on the web makes more and more probable to find that different sources provide (partially or completely) different values for the same data item. This, together with the fact that we are increasing our reliance on data to derive important decisions, motivates the need of developing good truth discovery algorithms. Many currently available methods rely on a voting strategy to define the true value of a data item. Nevertheless, recent studies, have shown that, if we rely only on majority voting, we could get wrong results even in 30% of the data items. The solution to this problem is to assess the trustworthiness of the sources and give more importance to votes coming from trusted sources. Ideally, supervised learning techniques could be exploited to assign a reliability score to sources after hand-crafted labeling of the provided values; unfortunately, this is not feasible since the number of needed labeled examples should be proportional to the number of sources, and in many applications the number of sources can be prohibitive. == Single-truth vs multi-truth discovery == Single-truth and multi-truth discovery are two very different problems. Single-truth discovery is characterized by the following properties: only one true value is allowed for each data item; different values provided for a given data item oppose to each other; values and sources can either be correct or erroneous. While in the multi-truth case the following properties hold: the truth is composed by a set of values; different values could provide a partial truth; claiming one value for a given data item does not imply opposing to all the other values; the number of true values for each data item is not known a priori. Multi-truth discovery has unique features that make the problem more complex and should be taken into consideration when developing truth-discovery solutions. The examples below point out the main differences of the two methods. Knowing that in both examples the truth is provided by source 1, in the single truth case (first table) we can say that sources 2 and 3 oppose to the truth and as a result provide wrong values. On the other hand, in the second case (second table), sources 2 and 3 are neither correct nor erroneous, they instead provide a subset of the true values and at the same time they do not oppose the truth. == Source trustworthiness == The vast majority of truth discovery methods are based on a voting approach: each source votes for a value of a certain data item and, at the end, the value with the highest vote is select as the true one. In the more sophisticated methods, votes do not have the same weight for all the data sources, more importance is indeed given to votes coming from trusted sources. Source trustworthiness usually is not known a priori but estimated with an iterative approach. At each step of the truth discovery algorithm the trustworthiness score of each data source is refined, improving the assessment of the true values that in turn leads to a better estimation of the trustworthiness of the sources. This process usually ends when all the values reach a convergence state. Source trustworthiness can be based on different metrics, such as accuracy of provided values, copying values from other sources and domain coverage. Detecting copying behaviors is very important, in fact, copy allows to spread false values easily making truth discovery very hard, since many sources would vote for the wrong values. Usually systems decrease the weight of votes associated to copied values or even don’t count them at all. == Single-truth methods == Most of the currently available truth discovery methods have been designed to work well only in the single-truth case. Below are reported some of the characteristics of the most relevant typologies of single-truth methods and how different systems model source trustworthiness. === Majority voting === Majority voting is the simplest method, the most popular value is selected as the true one. Majority voting is commonly used as a baseline when assessing the performances of more complex methods. === Web-link based === These methods estimate source trustworthiness exploiting a similar technique to the one used to measure authority of web pages based on web links. The vote assigned to a value is computed as the sum of the trustworthiness of the sources that provide that particular value, while the trustworthiness of a source is computed as the sum of the votes assigned to the values that the source provides. === Information-retrieval based === These methods estimate source trustworthiness using similarity measures typically used in information retrieval. Source trustworthiness is computed as the cosine similarity (or other similarity measures) between the set of values provided by the source and the set of values considered true (either selected in a probabilistic way or obtained from a ground truth). === Bayesian based === These methods use Bayesian inference to define the probability of a value being true conditioned on the values provided by all the sources. P ( v ∣ ψ ( o ) ) = P ( ψ ( o ) ∣ v ) ⋅ P ( v ) P ( ψ ( o ) ) {\displaystyle P(v\mid \psi (o))={\frac {P(\psi (o)\mid v)\cdot P(v)}{P(\psi (o))}}} where v {\displaystyle \textstyle v} is a value provided for a data item o {\displaystyle \textstyle o} and ψ ( o ) {\displaystyle \textstyle \psi (o)} is the set of the observed values provided by all the sources for that specific data item. The trustworthiness of a source is then computed based on the accuracy of the values that provides. Other more complex methods exploit Bayesian inference to detect copying behaviors and use these insights to better assess source trustworthiness. == Multi-truth methods == Due to its complexity, less attention has been devoted to the study of the multi-truth discovery Below are reported two typologies of multi-truth methods and their characteristics. === Bayesian based === These methods use Bayesian inference to define the probability of a group of values being true conditioned on the values provided by all the data sources. In this case, since there could be multiple true values for each data item, and sources can provide multiple values for a single data item, it is not possible to consider values individually. An alternative is to consider mappings and relations between set of provided values and sources providing them. The trustworthiness of a source is then computed based on the accuracy of the values that provides. More sophisticated methods also consider domain coverage and copying behaviors to better estimate source trustworthiness. === Probabilistic Graphical Models based === These methods use probabilistic graphical models to automatically define the set of true values of given data item and also to assess source quality without need of any supervision. == Applications == Many real-world applications can benefit from the use of truth discovery algorithms. Typical domains of application include: healthcare, crowd/social sensing, crowdsourcing aggregation, information extraction and knowledge base construction. Truth discovery algorithms could be also used to revolutionize the way in which web pages are ranked in search engines, going from current methods based on link analysis like PageRank, to procedures that rank web pages based on the accuracy of the information they provide.

    Read more →
  • MovieRide FX

    MovieRide FX

    MovieRide FX is a patented automated special visual effects video compositing engine used in the MovieRide FX mobile application for Android (requires Android 2.3 or later) and iOS (compatible with iPhone 4 and up, iPad, and iPod Touch (new generation), requires iOS 7 or later). MovieRide FX allows the user to personalize a "Hollywood-style" movie clip by inserting themself into the clip as the "actor". == Features == The MovieRide FX app uses the relevant mobile device's camera to record a video of the user and insert it into a pre-packaged "Hollywood style" movie clip. The "actor" is extracted from their recorded video clip through various known effects such as masking, keying, and motion tracking. The "actor" is then inserted into one of the pre-packaged movie clips created by the MovieRide FX visual effects artists. This is done through an automated process requiring little or no artistic or technical skill from the user. The custom movie clips pre-packaged with MovieRide FX offer the user a variety of movie scenarios. Additional clips based on popular television and movie themes are continually being developed and are available on a freemium basis. == Sharing == Once the user's footage has automatically been composited into a movie clip and rendered as an .mp4 file, it can be shared via social media, such as Facebook, YouTube, and Twitter, and by e-mail. == History == === 2012 === MovieRide FX was created by Grant Waterston and Johann Mynhardt, who started development in 2012. === 2013 === The beta version was released on Google Play in July 2013. In August 2013 MovieRide FX was a New Media Award winner in the "New Media" category of the Accolade International Awards in Los Angeles. In October 2013 MovieRide FX was awarded exhibitor space in the ‘start-up village’ at the Apps-World Expo in London. === 2014 === MovieRide FX reached the 100 000 – 500 000 downloads category on the Google Play Store in June 2014. The official Android version was launched in July 2014. iOS version released in August 2014. MovieRide FX was selected as one of the "Top 150" startups at the Pioneer Festival in Vienna in September 2014. In November 2014 MovieRide FX was shortlisted for the Appster Awards in the "Best Entertainment App" and "Most Innovative App" categories and was awarded exhibitor space at the ‘start-up village’ at the Apps-World Expo in London. Patent applications were filed in South Africa, the EU and USA in April 2014. === 2015 === In September 2015 MovieRide FX was shortlisted for "Best Software innovation" at The Technology Expo Awards in London. === 2016 === In April 2016 MovieRide FX was nominated for a National Science and Technology Forum (NSTF) award for 'Research leading to Innovation by a corporate organization' In August 2016 Movie Ride FX won two Gold Awards at the 2016 Mobile Marketing Awards (MMA Smarties SA). These two Gold awards were for the 'Innovation' and 'Best in Show’ categories. In December 2016 FlicJam Inc. was formed in the US to access the larger global market. EU patent application was published in March 2016. === 2017 === South African patent was granted in February 2017. === 2018 === US patent was granted in March 2018.

    Read more →
  • The Visualization Handbook

    The Visualization Handbook

    The Visualization Handbook is a textbook by Charles D. Hansen and Christopher R. Johnson that serves as a survey of the field of scientific visualization by presenting the basic concepts and algorithms in addition to a current review of visualization research topics and tools. It is commonly used as a textbook for scientific visualization graduate courses. It is also commonly cited as a reference for scientific visualization and computer graphics in published papers, with almost 500 citations documented on Google Scholar. == Table of Contents == PART I - Introduction Overview of Visualization - William J. Schroeder and Kenneth M. Martin PART II - Scalar Field Visualization: Isosurfaces Accelerated Isosurface Extraction Approaches -Yarden Livnat Time-Dependent Isosurface Extraction - Han-Wei Shen Optimal Isosurface Extraction - Paolo Cignoni, Claudio Montani, Robert Scopigno, and Enrico Puppo Isosurface Extraction Using Extrema Graphs - Takayuki Itoh and Koji Koyamada Isosurfaces and Level-Sets - Ross Whitaker PART III - Scalar Field Visualization: Volume Rendering Overview of Volume Rendering - Arie E. Kaufman and Klaus Mueller Volume Rendering Using Splatting - Roger Crawfis, Daqing Xue, and Caixia Zhang Multidimensional Transfer Functions for Volume Rendering - Joe Kniss, Gordon Kindlmann, and Charles D. Hansen Pre-Integrated Volume Rendering - Martin Kraus and Thomas Ertl Hardware-Accelerated Volume Rendering - Hanspeter Pfister PART IV - Vector Field Visualization Overview of Flow Visualization - Daniel Weiskopf and Gordon Erlebacher Flow Textures: High-Resolution Flow Visualization - Gordon Erlebacher, Bruno Jobard, and Daniel Weiskopf Detection and Visualization of Vortices - Ming Jiang, Raghu Machiraju, and David Thompson PART V - Tensor Field Visualization Oriented Tensor Reconstruction - Leonid Zhukov and Alan H. Barr Diffusion Tensor MRI Visualization - Song Zhang, David Laidlaw, and Gordon Kindlmann Topological Methods for Flow Visualization - Gerik Scheuermann and Xavier Tricoche PART VI - Geometric Modeling for Visualization 3D Mesh Compression - Jarek Rossignac Variational Modeling Methods for Visualization - Hans Hagen and Ingrid Hotz Model Simplification - Jonathan D. Cohen and Dinesh Manocha PART VII - Virtual Environments for Visualization Direct Manipulation in Virtual Reality - Steve Bryson The Visual Haptic Workbench - Milan Ikits and J. Dean Brederson Virtual Geographic Information Systems - William Ribarsky Visualization Using Virtual Reality - R. Bowen Loftin, Jim X. Chen, and Larry Rosenblum PART VIII - Large-Scale Data Visualization Desktop Delivery: Access to Large Datasets - Philip D. Heermann and Constantine Pavlakos Techniques for Visualizing Time-Varying Volume Data - Kwan-Liu Ma and Eric B. Lum Large-Scale Data Visualization and Rendering: A Problem-Driven Approach - Patrick McCormick and James Ahrens Issues and Architectures in Large-Scale Data Visualization - Constantine Pavlakos and Philip D. Heermann Consuming Network Bandwidth with Visapult - Wes Bethel and John Shalf PART IX - Visualization Software and Frameworks The Visualization Toolkit - William J. Schroeder and Kenneth M. Martin Visualization in the SCIRun Problem-Solving Environment - David M. Weinstein, Steven Parker, Jenny Simpson, Kurt Zimmerman, and Greg M. Jones Numerical Algorithms Group IRIS Explorer - Jeremy Walton AVS and AVS/Express - Jean M. Favre and Mario Valle Vis5D, Cave5D, and VisAD - Bill Hibbard Visualization with AVS - W. T. Hewitt, Nigel W. John, Matthew D. Cooper, K. Yien Kwok, George W. Leaver, Joanna M. Leng, Paul G. Lever, Mary J. McDerby, James S. Perrin, Mark Riding, I. Ari Sadarjoen, Tobias M. Schiebeck, and Colin C. Venters ParaView: An End-User Tool for Large-Data Visualization - James Ahrens, Berk Geveci, and Charles Law The Insight Toolkit: An Open-Source Initiative in Data Segmentation and Registration - Terry S. Yoo amira: A Highly Interactive System for Visual Data Analysis - Detlev Stalling, Malte Westerhoff, and Hans-Christian Hege PART X - Perceptual Issues in Visualization Extending Visualization to Perceptualization: The Importance of Perception in Effective Communication of Information - David S. Ebert Art and Science in Visualization - Victoria Interrante Exploiting Human Visual Perception in Visualization - Alan Chalmers and Kirsten Cater PART XI - Selected Topics and Applications Scalable Network Visualization - Stephen G. Eick Visual Data-Mining Techniques - Daniel A. Keim, Mike Sips, and Mihael Ankerst Visualization in Weather and Climate Research - Don Middleton, Tim Scheitlin, and Bob Wilhelmson Painting and Visualization - Robert M. Kirby, Daniel F. Keefe, and David Laidlaw Visualization and Natural Control Systems for Microscopy - Russell M. Taylor II, David Borland, Frederick P. Brooks, Jr., Mike Falvo, Kevin Jeffay, Gail Jones, David Marshburn, Stergios J. Papadakis, Lu-Chang Qin, Adam Seeger, F. Donelson Smith, Dianne Sonnenwald, Richard Superfine, Sean Washburn, Chris Weigle, Mary Whitton, Leandra Vicci, Martin Guthold, Tom Hudson, Philip Williams, and Warren Robinett Visualization for Computational Accelerator Physics - Kwan-Liu Ma, Greg Schussman, and Brett Wilson

    Read more →
  • Artipic

    Artipic

    Artipic is a graphics editor developed for Microsoft Windows. An older version for macOS is still available but unsupported. Artipic features drawing, editing, retouching, transforming and composing images including color corrections, effects and layer-based operations. It converts all common image formats and imports camera raw formats. In the global image editing ecosystem Artipic can be positioned somewhere in the middle. It differs from simple free photo editors by more advanced capabilities, however it does not cover the complete professional-level functionality pack provided by industry leaders like Adobe Photoshop. == History == Artipic developed by Swedish company Artipic AB. Artipic 1.0 was released in March 2014 as a free version. The first commercial version on Microsoft Windows was released in November 2014, on macOS – in October 2015. == Features == Supports Microsoft Windows and macOS Standard tools: select, crop, move, rotate, transform, stamp, color picking, text Advanced tools: custom brushes, gradients, shapes, paths, layers and masks Special tools: healing brush, red-eye effect reduction, dodge and burn brushes Adjustments: Brightness & Contrast, Hue & Saturation, Curves, Levels, Color Balance, Gamma Correction, Exposure, Color Temperature, Tint, Color Enhancer, Photo Filter Simulation, Posterization, Thresholding Filters: Smoothen, Sharpen, Vignetting, High-pass, Diffuse Glow, Shadow, Gaussian Blur Reversible (non-destructive) stylization presets Batch processing White balance RAW-converter including Gray Card Adobe Photoshop images supported == Version history ==

    Read more →
  • Truth discovery

    Truth discovery

    Truth discovery (also known as truth finding) is the process of choosing the actual true value for a data item when different data sources provide conflicting information on it. Several algorithms have been proposed to tackle this problem, ranging from simple methods like majority voting to more complex ones able to estimate the trustworthiness of data sources. Truth discovery problems can be divided into two sub-classes: single-truth and multi-truth. In the first case only one true value is allowed for a data item (e.g birthday of a person, capital city of a country). While in the second case multiple true values are allowed (e.g. cast of a movie, authors of a book). Typically, truth discovery is the last step of a data integration pipeline, when the schemas of different data sources have been unified and the records referring to the same data item have been detected. == General principles == The abundance of data available on the web makes more and more probable to find that different sources provide (partially or completely) different values for the same data item. This, together with the fact that we are increasing our reliance on data to derive important decisions, motivates the need of developing good truth discovery algorithms. Many currently available methods rely on a voting strategy to define the true value of a data item. Nevertheless, recent studies, have shown that, if we rely only on majority voting, we could get wrong results even in 30% of the data items. The solution to this problem is to assess the trustworthiness of the sources and give more importance to votes coming from trusted sources. Ideally, supervised learning techniques could be exploited to assign a reliability score to sources after hand-crafted labeling of the provided values; unfortunately, this is not feasible since the number of needed labeled examples should be proportional to the number of sources, and in many applications the number of sources can be prohibitive. == Single-truth vs multi-truth discovery == Single-truth and multi-truth discovery are two very different problems. Single-truth discovery is characterized by the following properties: only one true value is allowed for each data item; different values provided for a given data item oppose to each other; values and sources can either be correct or erroneous. While in the multi-truth case the following properties hold: the truth is composed by a set of values; different values could provide a partial truth; claiming one value for a given data item does not imply opposing to all the other values; the number of true values for each data item is not known a priori. Multi-truth discovery has unique features that make the problem more complex and should be taken into consideration when developing truth-discovery solutions. The examples below point out the main differences of the two methods. Knowing that in both examples the truth is provided by source 1, in the single truth case (first table) we can say that sources 2 and 3 oppose to the truth and as a result provide wrong values. On the other hand, in the second case (second table), sources 2 and 3 are neither correct nor erroneous, they instead provide a subset of the true values and at the same time they do not oppose the truth. == Source trustworthiness == The vast majority of truth discovery methods are based on a voting approach: each source votes for a value of a certain data item and, at the end, the value with the highest vote is select as the true one. In the more sophisticated methods, votes do not have the same weight for all the data sources, more importance is indeed given to votes coming from trusted sources. Source trustworthiness usually is not known a priori but estimated with an iterative approach. At each step of the truth discovery algorithm the trustworthiness score of each data source is refined, improving the assessment of the true values that in turn leads to a better estimation of the trustworthiness of the sources. This process usually ends when all the values reach a convergence state. Source trustworthiness can be based on different metrics, such as accuracy of provided values, copying values from other sources and domain coverage. Detecting copying behaviors is very important, in fact, copy allows to spread false values easily making truth discovery very hard, since many sources would vote for the wrong values. Usually systems decrease the weight of votes associated to copied values or even don’t count them at all. == Single-truth methods == Most of the currently available truth discovery methods have been designed to work well only in the single-truth case. Below are reported some of the characteristics of the most relevant typologies of single-truth methods and how different systems model source trustworthiness. === Majority voting === Majority voting is the simplest method, the most popular value is selected as the true one. Majority voting is commonly used as a baseline when assessing the performances of more complex methods. === Web-link based === These methods estimate source trustworthiness exploiting a similar technique to the one used to measure authority of web pages based on web links. The vote assigned to a value is computed as the sum of the trustworthiness of the sources that provide that particular value, while the trustworthiness of a source is computed as the sum of the votes assigned to the values that the source provides. === Information-retrieval based === These methods estimate source trustworthiness using similarity measures typically used in information retrieval. Source trustworthiness is computed as the cosine similarity (or other similarity measures) between the set of values provided by the source and the set of values considered true (either selected in a probabilistic way or obtained from a ground truth). === Bayesian based === These methods use Bayesian inference to define the probability of a value being true conditioned on the values provided by all the sources. P ( v ∣ ψ ( o ) ) = P ( ψ ( o ) ∣ v ) ⋅ P ( v ) P ( ψ ( o ) ) {\displaystyle P(v\mid \psi (o))={\frac {P(\psi (o)\mid v)\cdot P(v)}{P(\psi (o))}}} where v {\displaystyle \textstyle v} is a value provided for a data item o {\displaystyle \textstyle o} and ψ ( o ) {\displaystyle \textstyle \psi (o)} is the set of the observed values provided by all the sources for that specific data item. The trustworthiness of a source is then computed based on the accuracy of the values that provides. Other more complex methods exploit Bayesian inference to detect copying behaviors and use these insights to better assess source trustworthiness. == Multi-truth methods == Due to its complexity, less attention has been devoted to the study of the multi-truth discovery Below are reported two typologies of multi-truth methods and their characteristics. === Bayesian based === These methods use Bayesian inference to define the probability of a group of values being true conditioned on the values provided by all the data sources. In this case, since there could be multiple true values for each data item, and sources can provide multiple values for a single data item, it is not possible to consider values individually. An alternative is to consider mappings and relations between set of provided values and sources providing them. The trustworthiness of a source is then computed based on the accuracy of the values that provides. More sophisticated methods also consider domain coverage and copying behaviors to better estimate source trustworthiness. === Probabilistic Graphical Models based === These methods use probabilistic graphical models to automatically define the set of true values of given data item and also to assess source quality without need of any supervision. == Applications == Many real-world applications can benefit from the use of truth discovery algorithms. Typical domains of application include: healthcare, crowd/social sensing, crowdsourcing aggregation, information extraction and knowledge base construction. Truth discovery algorithms could be also used to revolutionize the way in which web pages are ranked in search engines, going from current methods based on link analysis like PageRank, to procedures that rank web pages based on the accuracy of the information they provide.

    Read more →
  • ImageNet

    ImageNet

    The ImageNet project is a large visual database designed for use in visual object recognition software research. More than 14 million images have been hand-annotated by the project to indicate what objects are pictured and in at least one million of the images, bounding boxes are also provided. ImageNet contains more than 20,000 categories, with a typical category, such as "balloon" or "strawberry", consisting of several hundred images. The database of annotations of third-party image URLs is freely available directly from ImageNet, though the actual images are not owned by ImageNet. Since 2010, the ImageNet project runs an annual software contest, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), where software programs compete to correctly classify and detect objects and scenes. The challenge uses a "trimmed" list of one thousand non-overlapping classes. == History == AI researcher Fei-Fei Li began working on the idea for ImageNet in 2006. At a time when most AI research focused on models and algorithms, Li wanted to expand and improve the data available to train AI algorithms. In 2007, Li met with Princeton professor Christiane Fellbaum, one of the creators of WordNet, to discuss the project. As a result of this meeting, Li went on to build ImageNet starting from the roughly 22,000 nouns of WordNet and using many of its features. She was also inspired by a 1987 estimate that the average person recognizes roughly 30,000 different kinds of objects. As an assistant professor at Princeton, Li assembled a team of researchers to work on the ImageNet project. They used Amazon Mechanical Turk to help with the classification of images. Labeling started in July 2008 and ended in April 2010. It took 49K workers from 167 countries filtering and labeling over 160M candidate images. They had enough budget to have each of the 14 million images labelled three times. The original plan called for 10,000 images per category, for 40,000 categories at 400 million images, each verified 3 times. They found that humans can classify at most 2 images/sec. At this rate, it was estimated to take 19 human-years of labor (without rest). They presented their database for the first time as a poster at the 2009 Conference on Computer Vision and Pattern Recognition (CVPR) in Florida, titled "ImageNet: A Preview of a Large-scale Hierarchical Dataset". The poster was reused at Vision Sciences Society 2009. In 2009, Alex Berg suggested adding object localization as a task. Li approached PASCAL Visual Object Classes contest in 2009 for a collaboration. It resulted in the subsequent ImageNet Large Scale Visual Recognition Challenge starting in 2010, which has 1000 classes and object localization, as compared to PASCAL VOC which had just 20 classes and 19,737 images (in 2010). === Significance for deep learning === On 30 September 2012, a convolutional neural network (CNN) called AlexNet achieved a top-5 error of 15.3% in the ImageNet 2012 Challenge, more than 10.8 percentage points lower than that of the runner-up. Using convolutional neural networks was feasible due to the use of graphics processing units (GPUs) during training, an essential ingredient of the deep learning revolution. According to The Economist, "Suddenly people started to pay attention, not just within the AI community but across the technology industry as a whole." In 2015, AlexNet was outperformed by Microsoft's very deep CNN with over 100 layers, which won the ImageNet 2015 contest, having 3.57% error on the test set. Andrej Karpathy estimated in 2014 that with concentrated effort, he could reach 5.1% error rate, and ~10 people from his lab reached ~12-13% with less effort. It was estimated that with maximal effort, a human could reach 2.4%. == Dataset == ImageNet crowdsources its annotation process. Image-level annotations indicate the presence or absence of an object class in an image, such as "there are tigers in this image" or "there are no tigers in this image". Object-level annotations provide a bounding box around the (visible part of the) indicated object. ImageNet uses a variant of the broad WordNet schema to categorize objects, augmented with 120 categories of dog breeds to showcase fine-grained classification. In 2012, ImageNet was the world's largest academic user of Mechanical Turk. The average worker identified 50 images per minute. The original plan of the full ImageNet would have roughly 50M clean, diverse and full resolution images spread over approximately 50K synsets. This was not achieved. The summary statistics given on April 30, 2010: Total number of non-empty synsets: 21841 Total number of images: 14,197,122 Number of images with bounding box annotations: 1,034,908 Number of synsets with SIFT features: 1000 Number of images with SIFT features: 1.2 million === Categories === The categories of ImageNet were filtered from the WordNet concepts. Each concept, since it can contain multiple synonyms (for example, "kitty" and "young cat"), so each concept is called a "synonym set" or "synset". There were more than 100,000 synsets in WordNet 3.0, majority of them are nouns (80,000+). The ImageNet dataset filtered these to 21,841 synsets that are countable nouns that can be visually illustrated. Each synset in WordNet 3.0 has a "WordNet ID" (wnid), which is a concatenation of part of speech and an "offset" (a unique identifying number). Every wnid starts with "n" because ImageNet only includes nouns. For example, the wnid of synset "dog, domestic dog, Canis familiaris" is "n02084071". The categories in ImageNet fall into 9 levels, from level 1 (such as "mammal") to level 9 (such as "German shepherd"). === Image format === The images were scraped from online image search (Google, Picsearch, MSN, Yahoo, Flickr, etc) using synonyms in multiple languages. For example: German shepherd, German police dog, German shepherd dog, Alsatian, ovejero alemán, pastore tedesco, 德国牧羊犬. ImageNet consists of images in RGB format with varying resolutions. For example, in ImageNet 2012, "fish" category, the resolution ranges from 4288 x 2848 to 75 x 56. In machine learning, these are typically preprocessed into a standard constant resolution, and whitened, before further processing by neural networks. For example, in PyTorch, ImageNet images are by default normalized by dividing the pixel values so that they fall between 0 and 1, then subtracting by [0.485, 0.456, 0.406], then dividing by [0.229, 0.224, 0.225]. These are the mean and standard deviations for ImageNet, so this whitens the input data. === Labels and annotations === Each image is labelled with exactly one wnid. Dense SIFT features (raw SIFT descriptors, quantized codewords, and coordinates of each descriptor/codeword) for ImageNet-1K were available for download, designed for bag of visual words. The bounding boxes of objects were available for about 3000 popular synsets with on average 150 images in each synset. Furthermore, some images have attributes. They released 25 attributes for ~400 popular synsets: Color: black, blue, brown, gray, green, orange, pink, red, violet, white, yellow Pattern: spotted, striped Shape: long, round, rectangular, square Texture: furry, smooth, rough, shiny, metallic, vegetation, wooden, wet === ImageNet-21K === The full original dataset is referred to as ImageNet-21K. ImageNet-21k contains 14,197,122 images divided into 21,841 classes. Some papers round this up and name it ImageNet-22k. The full ImageNet-21k was released in Fall of 2011, as fall11_whole.tar. There is no official train-validation-test split for ImageNet-21k. Some classes contain only 1-10 samples, while others contain thousands. === ImageNet-1K === There are various subsets of the ImageNet dataset used in various context, sometimes referred to as "versions". One of the most highly used subsets of ImageNet is the "ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012–2017 image classification and localization dataset". This is also referred to in the research literature as ImageNet-1K or ILSVRC2017, reflecting the original ILSVRC challenge that involved 1,000 classes. ImageNet-1K contains 1,281,167 training images, 50,000 validation images and 100,000 test images. Each category in ImageNet-1K is a leaf category, meaning that there are no child nodes below it, unlike ImageNet-21K. For example, in ImageNet-21K, there are some images categorized as simply "mammal", whereas in ImageNet-1K, there are only images categorized as things like "German shepherd", since there are no child-words below "German shepherd". === Later developments === In the WordNet they built ImageNet on, there were 2832 synsets in the "person" subtree. During 2018--2020 period, they removed the download of the ImageNet-21k as they went through extensive filtering in these person synsets. Out of these 2832 synsets, 1593 were deemed "potentially offensive". Out of the remaining 1239, 1081 were deemed not really "visual". The result was that only 158 syn

    Read more →
  • Unfold (app)

    Unfold (app)

    Unfold is a mobile application that allows users to create social media content using a variety of templates and other tools. It was founded in 2018 by Alfonso Cobo and Andy McCune. It enables users to add photos, video, and text with a variety of tools. In 2019, Unfold was acquired by Squarespace. == History == In January 2017, Alfonso Cobo was studying at Parsons School of Design when he realized there was no software or app that could create a portfolio of his work on an iPad. Cobo created an app called Portfolio, a basic version of a portfolio layout app, and the first one to exist for iPad. He launched it in 2017. After launching the first version of Portfolio, Cobo realized the more popular market and use case was on mobile. Around that time, Instagram was launching Stories. As a result, Cobo pivoted the app away from portfolios and instead focused on an app to showcase one's stories. Cobo later contacted Andy McCune, founder of social media account Earth, to collaborate with Unfold. Unfold also partnered with various companies to create custom templates. These include Equinox, Tommy Hilfiger, NARS, Billboard Music Awards, and Product Red. Unfold also launched a collection of Product Red templates to help eliminate HIV/AIDS in several African countries. In 2019, Squarespace acquired Unfold. The Unfold app has been downloaded over 60 million times and has been used to create over 1 billion Instagram stories. == Features == With Unfold, users can utilize hundreds of templates to make social content for social media platforms such as Instagram, Snapchat, and Facebook. The free app offers users basic templates and standard fonts, filters, and stickers, and there are also premium templates available for a monthly subscription. With Unfold+ and Unfold Pro (previously Unfold for Brands), users can access premium templates and tools, as well as upload custom brand assets and fonts. In 2020, Unfold launched Bio Sites, which allows users to link to multiple sites and platforms.

    Read more →
  • Algorithmic inference

    Algorithmic inference

    Algorithmic inference gathers new developments in the statistical inference methods made feasible by the powerful computing devices widely available to any data analyst. Cornerstones in this field are computational learning theory, granular computing, bioinformatics, and, long ago, structural probability (Fraser 1966). The main focus is on the algorithms which compute statistics rooting the study of a random phenomenon, along with the amount of data they must feed on to produce reliable results. This shifts the interest of mathematicians from the study of the distribution laws to the functional properties of the statistics, and the interest of computer scientists from the algorithms for processing data to the information they process. == The Fisher parametric inference problem == Concerning the identification of the parameters of a distribution law, the mature reader may recall lengthy disputes in the mid 20th century about the interpretation of their variability in terms of fiducial distribution (Fisher 1956), structural probabilities (Fraser 1966), priors/posteriors (Ramsey 1925), and so on. From an epistemology viewpoint, this entailed a companion dispute as to the nature of probability: is it a physical feature of phenomena to be described through random variables or a way of synthesizing data about a phenomenon? Opting for the latter, Fisher defines a fiducial distribution law of parameters of a given random variable that he deduces from a sample of its specifications. With this law he computes, for instance "the probability that μ (mean of a Gaussian variable – omeur note) is less than any assigned value, or the probability that it lies between any assigned values, or, in short, its probability distribution, in the light of the sample observed". == The classic solution == Fisher fought hard to defend the difference and superiority of his notion of parameter distribution in comparison to analogous notions, such as Bayes' posterior distribution, Fraser's constructive probability and Neyman's confidence intervals. For half a century, Neyman's confidence intervals won out for all practical purposes, crediting the phenomenological nature of probability. With this perspective, when you deal with a Gaussian variable, its mean μ is fixed by the physical features of the phenomenon you are observing, where the observations are random operators, hence the observed values are specifications of a random sample. Because of their randomness, you may compute from the sample specific intervals containing the fixed μ with a given probability that you denote confidence. === Example === Let X be a Gaussian variable with parameters μ {\displaystyle \mu } and σ 2 {\displaystyle \sigma ^{2}} and { X 1 , … , X m } {\displaystyle \{X_{1},\ldots ,X_{m}\}} a sample drawn from it. Working with statistics S μ = ∑ i = 1 m X i {\displaystyle S_{\mu }=\sum _{i=1}^{m}X_{i}} and S σ 2 = ∑ i = 1 m ( X i − X ¯ ) 2 , where X ¯ = S μ m {\displaystyle S_{\sigma ^{2}}=\sum _{i=1}^{m}(X_{i}-{\overline {X}})^{2},{\text{ where }}{\overline {X}}={\frac {S_{\mu }}{m}}} is the sample mean, we recognize that T = S μ − m μ S σ 2 m − 1 m = X ¯ − μ S σ 2 / ( m ( m − 1 ) ) {\displaystyle T={\frac {S_{\mu }-m\mu }{\sqrt {S_{\sigma ^{2}}}}}{\sqrt {\frac {m-1}{m}}}={\frac {{\overline {X}}-\mu }{\sqrt {S_{\sigma ^{2}}/(m(m-1))}}}} follows a Student's t distribution (Wilks 1962) with parameter (degrees of freedom) m − 1, so that f T ( t ) = Γ ( m / 2 ) Γ ( ( m − 1 ) / 2 ) 1 π ( m − 1 ) ( 1 + t 2 m − 1 ) m / 2 . {\displaystyle f_{T}(t)={\frac {\Gamma (m/2)}{\Gamma ((m-1)/2)}}{\frac {1}{\sqrt {\pi (m-1)}}}\left(1+{\frac {t^{2}}{m-1}}\right)^{m/2}.} Gauging T between two quantiles and inverting its expression as a function of μ {\displaystyle \mu } you obtain confidence intervals for μ {\displaystyle \mu } . With the sample specification: x = { 7.14 , 6.3 , 3.9 , 6.46 , 0.2 , 2.94 , 4.14 , 4.69 , 6.02 , 1.58 } {\displaystyle \mathbf {x} =\{7.14,6.3,3.9,6.46,0.2,2.94,4.14,4.69,6.02,1.58\}} having size m = 10, you compute the statistics s μ = 43.37 {\displaystyle s_{\mu }=43.37} and s σ 2 = 46.07 {\displaystyle s_{\sigma ^{2}}=46.07} , and obtain a 0.90 confidence interval for μ {\displaystyle \mu } with extremes (3.03, 5.65). == Inferring functions with the help of a computer == From a modeling perspective the entire dispute looks like a chicken-egg dilemma: either fixed data by first and probability distribution of their properties as a consequence, or fixed properties by first and probability distribution of the observed data as a corollary. The classic solution has one benefit and one drawback. The former was appreciated particularly back when people still did computations with sheet and pencil. Per se, the task of computing a Neyman confidence interval for the fixed parameter θ is hard: you do not know θ, but you look for disposing around it an interval with a possibly very low probability of failing. The analytical solution is allowed for a very limited number of theoretical cases. Vice versa a large variety of instances may be quickly solved in an approximate way via the central limit theorem in terms of confidence interval around a Gaussian distribution – that's the benefit. The drawback is that the central limit theorem is applicable when the sample size is sufficiently large. Therefore, it is less and less applicable with the sample involved in modern inference instances. The fault is not in the sample size on its own part. Rather, this size is not sufficiently large because of the complexity of the inference problem. With the availability of large computing facilities, scientists refocused from isolated parameters inference to complex functions inference, i.e. re sets of highly nested parameters identifying functions. In these cases we speak about learning of functions (in terms for instance of regression, neuro-fuzzy system or computational learning) on the basis of highly informative samples. A first effect of having a complex structure linking data is the reduction of the number of sample degrees of freedom, i.e. the burning of a part of sample points, so that the effective sample size to be considered in the central limit theorem is too small. Focusing on the sample size ensuring a limited learning error with a given confidence level, the consequence is that the lower bound on this size grows with complexity indices such as VC dimension or detail of a class to which the function we want to learn belongs. === Example === A sample of 1,000 independent bits is enough to ensure an absolute error of at most 0.081 on the estimation of the parameter p of the underlying Bernoulli variable with a confidence of at least 0.99. The same size cannot guarantee a threshold less than 0.088 with the same confidence 0.99 when the error is identified with the probability that a 20-year-old man living in New York does not fit the ranges of height, weight and waistline observed on 1,000 Big Apple inhabitants. The accuracy shortage occurs because both the VC dimension and the detail of the class of parallelepipeds, among which the one observed from the 1,000 inhabitants' ranges falls, are equal to 6. == The general inversion problem solving the Fisher question == With insufficiently large samples, the approach: fixed sample – random properties suggests inference procedures in three steps: === Definition === For a random variable and a sample drawn from it a compatible distribution is a distribution having the same sampling mechanism M X = ( Z , g θ ) {\displaystyle {\mathcal {M}}_{X}=(Z,g_{\boldsymbol {\theta }})} of X with a value θ {\displaystyle {\boldsymbol {\theta }}} of the random parameter Θ {\displaystyle \mathbf {\Theta } } derived from a master equation rooted on a well-behaved statistic s. === Example === You may find the distribution law of the Pareto parameters A and K as an implementation example of the population bootstrap method as in the figure on the left. Implementing the twisting argument method, you get the distribution law F M ( μ ) {\displaystyle F_{M}(\mu )} of the mean M of a Gaussian variable X on the basis of the statistic s M = ∑ i = 1 m x i {\textstyle s_{M}=\sum _{i=1}^{m}x_{i}} when Σ 2 {\displaystyle \Sigma ^{2}} is known to be equal to σ 2 {\displaystyle \sigma ^{2}} (Apolloni, Malchiodi & Gaito 2006). Its expression is: F M ( μ ) = Φ ( m μ − s M σ m ) , {\displaystyle F_{M}(\mu )=\Phi {\left({\frac {m\mu -s_{M}}{\sigma {\sqrt {m}}}}\right)},} shown in the figure on the right, where Φ {\displaystyle \Phi } is the cumulative distribution function of a standard normal distribution. Computing a confidence interval for M given its distribution function is straightforward: we need only find two quantiles (for instance δ / 2 {\displaystyle \delta /2} and 1 − δ / 2 {\displaystyle 1-\delta /2} quantiles in case we are interested in a confidence interval of level δ symmetric in the tail's probabilities) as indicated on the left in the diagram showing the behavior of

    Read more →
  • Scenery generator

    Scenery generator

    A scenery generator (or terrain generator) is a software used to create landscape images, 3D models, and animations. These programs often use procedural generation to generate the landscapes, or sometimes created and rendered by a 3D artist. These programs are often used in video games or movies. Basic elements of landscapes created by scenery generators include terrain, water, foliage, and clouds. The process for basic random generation uses a diamond square algorithm. == Common features == Most scenery generators can create basic heightmaps to simulate the variation of elevation in basic terrain. Common techniques include Simplex noise, fractals, or the diamond-square algorithm, which can generate 2-dimensional heightmaps. A version of scenery generator can be very simplistic. Using a diamond-square algorithm with some extra steps involving fractals, an algorithm for random generation of terrain can be made with only 120 lines of code. The program in example takes a grid and then divides the grid repeatedly. Each smaller grid is then split into squares and diamonds and the algorithm then makes the randomized terrain for each square and diamond. Most programs for creating landscapes also allow for adjustment and editing of the landscape. For example, World Creator allows for terrain sculpting, which uses a similar brush system as Photoshop, and allows for additional terrain enhancement with its procedural techniques such as erosion, sediments, and more. Other tools in the World Creator program include terrain stamping, which allows you to import elevation maps and use them as a base. The programs tend to also allow for additional placement of rocks, trees, etc. These can be done procedurally or by hand depending on the program. Typically the models used for the placement objects are the same as to lessen the amount of work that would be done if the user was to create a multitude of different trees. The terrain generated the computer does a generation of multifractals then integrates them until finally rendering them onto the screen. These techniques are typically done “on-the-fly” which typically for a 128 × 128 resolution terrain would mean 1.5 seconds on a CPU from the early 1990s. == Applications == Scenery generators are commonly used in movies, animations, 3D rendering, and video games. For example, Industrial Light & Magic used E-on Vue to create the fictional environments for Pirates of the Caribbean: Dead Man's Chest. In such live-action cases, a 3D model of the generated environment is rendered and blended with live-action footage. Scenery generated by the software may also be used to create completely computer-generated scenes. In the case of animated movies such as Kung Fu Panda, the raw generation is assisted by hand-painting to accentuate subtle details. Environmental elements not commonly associated with landscapes, such as ocean waves, have also been handled by the software. Scenery generation is used in most 3D based video-games. These typically use either custom or purchased engines that contain their own scenery generators. For some games they tend to use a procedurally generated terrain. These typically use a form of height mapping and use of Perlin noise. This will create a grid that with one point in a 2D coordinate will create the same heightmap as it is pseudorandom, meaning it will result in the same output with the same input. This can then easily be translated into the product 3D image. These can then be changed from the editor tools in most engines if the terrain will be custom built. With recent developments neural networks can be built to create or texture the terrain based on previously suggested artwork or heightmap data. These would be generated using algorithms that have been able to identify images and similarities between them. With the info the machine can take other heightmaps and render a very similar looking image to the style image. This can be used to create similar images in example a Studio Ghibli or Van Gogh art-style. == Software == Most game engines, whether custom or proprietary, will have terrain generation built in. Some terrain generator programs include, Terragen, which can create terrain, water, atmosphere and lighting; L3DT, which provides similar functions to Terragen, and has a 2048 × 2048 resolution limit; and World Creator, which can create terrain, and is fully GPU powered. === List of 3D terrain generation software ===

    Read more →
  • Starlight Information Visualization System

    Starlight Information Visualization System

    Starlight is a software product originally developed at Pacific Northwest National Laboratory and now by Future Point Systems. It is an advanced visual analysis environment. In addition to using information visualization to show the importance of individual pieces of data by showing how they relate to one another, it also contains a small suite of tools useful for collaboration and data sharing, as well as data conversion, processing, augmentation and loading. The software, originally developed for the intelligence community, allows users to load data from XML files, databases, RSS feeds, web services, HTML files, Microsoft Word, PowerPoint, Excel, CSV, Adobe PDF, TXT files, etc. and analyze it with a variety of visualizations and tools. The system integrates structured, unstructured, geospatial, and multimedia data, offering comparisons of information at multiple levels of abstraction, simultaneously and in near real-time. In addition Starlight allows users to build their own named entity-extractors using a combination of algorithms, targeted normalization lists and regular expressions in the Starlight Data Engineer (SDE). As an example, Starlight might be used to look for correlations in a database containing records about chemical spills. An analyst could begin by grouping records according to the cause of the spill to reveal general trends. Sorting the data a second time, they could apply different colors based on related details such as the company responsible, age of equipment or geographic location. Maps and photographs could be integrated into the display, making it even easier to recognize connections among multiple variables. Starlight has been deployed to both the Iraq and Afghanistan wars and used on a number of large-scale projects. PNNL began developing Starlight in the mid-1990s, with funding from the Land Information Warfare Agency, a part of the Army Intelligence and Security Command and continued developed at the laboratory with funding from the NSA and the CIA. Starlight integrates visual representations of reports, radio transcripts, radar signals, maps and other information. The software system was recently honored with an R&D 100 Award for technical innovation. In 2006 Future Point Systems, a Silicon Valley startup, acquired rights to jointly develop and distribute the Starlight product in cooperation with the Pacific Northwest National Laboratory. The software is now also used outside of the military/intelligence communities in a number of commercial environments.

    Read more →
  • Vatican News App

    Vatican News App

    The Vatican News App is an official mobile application software issued by the Vatican's Dicastery for Communication. Formerly titled The Pope App, the app was launched on January 23, 2013, under the auspices of the Pontifical Council for Social Communications, a now-defunct dicastery that was merged into the Secretariat (now Dicastery) for Communication in March 2016. Initially, The Pope App was available only on iOS devices, but became available for Android phones at the end of February 2013. The app is available for download on iOS and Android in five languages: English, French, Italian, Portuguese and Spanish. It was originally promoted as an application with focus on the figure of the Pope which made it possible to follow the Pope's events while they are taking place. Alerts notified the followers by informing and offering access to "official papal-related content in a variety of formats". The app also enabled its users to see areas of the Vatican through webcams allocated throughout St. Peter's Square in Rome that broadcast images. In early 2018, The Pope App was relaunched as the Vatican News App, accompanied by a redesign that eliminated many of the previous version's features, reducing the app to a more conventional news service, with increased emphasis on news from the Vatican and the worldwide Catholic Church and less focus on the day-to-day activities of the Pope.

    Read more →
  • Managed private cloud

    Managed private cloud

    Managed private cloud (also known as "hosted private cloud" or "single-tenant SaaS") refers to a principle in software architecture where a single instance of the software runs on a server, serves a single client organization (tenant), and is managed by a third party. The third-party provider is responsible for providing the hardware for the server and also for preliminary maintenance. This is in contrast to multitenancy, where multiple client organizations share a single server, or an on-premises deployment, where the client organization hosts its software instance. Managed private clouds also fall under the larger umbrella of cloud computing. == Adoption == The need for private clouds arose due to enterprises requiring a dedicated service and infrastructure for their cloud computing needs, such as for business-critical operations, improved security, and better control over their resources. Managed private cloud adoption is a popular choice among organizations. It has been on the rise due to enterprises requiring a dedicated cloud environment and preferring to avoid having to deal with management, maintenance, or future upgrade costs for the associated infrastructure and services. Such operational costs are unavoidable in on-premises private cloud data centers. == Advantages and challenges of managed private cloud == A managed private cloud cuts down on upkeep costs by outsourcing infrastructure management and maintenance to the managed cloud provider. It is easier to integrate an organization's existing software, services, and applications into a dedicated cloud hosting infrastructure which can be customized to the client's needs instead of a public cloud platform, whose hardware or infrastructure/software platform cannot be individualized to each client. Customers who choose a managed private cloud deployment usually choose them because of their desire for efficient cloud deployment, but also have the need for service customization or integration only available in a single-tenant environment. This chart shows the key benefits of the different types of deployments, and shows the overlap between these cloud solutions. This chart shows key drawbacks. Since deployments are done in a single-tenant environment, it is usually cost-prohibitive for small and medium-sized businesses. While server upkeep and maintenance are handled by the service provider, including network management and security, the client is charged for all such services. It is up to the potential client to determine if a managed private cloud solution aligns with their business objectives and budget. While the service provider maintains the upkeep of servers, network, and platform infrastructure, sensitive data is typically not stored on managed private clouds as it may leave business-critical information prone to breaches via third-party attacks on the cloud service provider. Common customizations and integrations include: Active Directory Single Sign-on Learning Management Systems Video Teleconferencing == Deployment strategies and service providers == Software companies have taken a variety of strategies in the Managed Private Cloud realm. Some software organizations have provided managed private cloud options internally, such as Microsoft. Companies that offer an on-premises deployment option, by definition, enable third-party companies to market Managed Private Cloud solutions. A few managed private cloud service providers are: Adobe Connect: Adobe Connect may be purchased for on-premises deployment, multi-tenant hosted deployment, managed private cloud as ACMS, or managed by third-party managed private cloud provider ConnectSolutions. Rackspace CenturyLink Microsoft licenses for Lync, SharePoint and Exchange may be purchased for on-premises deployment, a multi-tenant hosted deployment via Office 365, or managed by third-party cloud hosting from Azaleos, ConnectSolutions and others.

    Read more →
  • Split screen (computing)

    Split screen (computing)

    Split screen is a display technique in computer graphics that consists of dividing graphics and/or text into non-overlapping adjacent parts, typically as two or four rectangular areas. This allows for the simultaneous presentation of (usually) related graphical and textual information on a computer display. TV sports adopted this presentation methodology in the 1960s for instant replay. Non-dynamic split screens differ from windowing systems in that the latter allowed overlapping and freely movable parts of the screen (the "windows") to present both related and unrelated application data to the user. In contrast, split-screen views are strictly limited to fixed positions. The split screen technique can also be used to run two instances of an application, potentially allowing another user to interact with the second instance. == In operating systems == Split screen modes are used by mobile operating systems to enable computer multitasking similar to the window interface present in desktop operating systems. Android supports split screen view of two apps natively on all devices, while certain devices, such as Samsung Galaxy Z TriFold, support three sumultaneous views. Split screen functionality is not supported on iOS, but a similar feature called Split View is present in iPadOS, first introduced in 2015 with the first generation of iPad Pro. == In video games == The split screen feature is commonly used in non-networked, also known as couch co-op, video games with multiplayer options. In its most easily understood form, a split screen for a multiplayer video game is an audiovisual output device (usually a standard television for video game consoles) where the display has been divided into 2-4 equally sized areas (depending on number of players) so that the players can explore different areas simultaneously without being close to each other. This has historically been remarkably popular on consoles, which until the 2000s did not have access to the Internet or any other network and is less common today with modern support for networked console-to-console multiplayer. In competitive split-screen games, it is customarily considered cheating to look at another player's screen section to gain an advantage. === History === Split screen gaming dates back to at least the 1970s, with games such Drag Race (1977) from Kee Games in the arcades being presented in this format. It has always been a common feature of two or more player home console and computer games too, with notable titles being Kikstart II for 8-bit systems, a number of 16-bit racing games (such as Lotus Esprit Turbo Challenge and Road Rash II), and action/strategy games (such as Toejam & Earl and Lemmings), all employing a vertical or horizontal screen split for two player games. Xenophobe is notable as a three-way split screen arcade title, although on home platforms it was reduced to one or two screens. The addition of four controller ports on home consoles also ushered in more four-way split screen games, with Mario Kart 64 and Goldeneye 007 on the Nintendo 64 being two well known examples. In arcades, machines tended to move towards having a whole screen for each player, or multiple connected machines, for multiplayer. On home machines, especially in the first and third person shooter genres, multiplayer is now more common over a network or the internet rather than locally with split screen. Starting from the late 2000s, the presence of split screen multiplayer has largely been declining due to the increasing prevalence of online multiplayer, though TechRadar reported a resurgence of split screen due to support from independent studios and increased interest from the players.

    Read more →
  • Variable data publishing

    Variable data publishing

    Variable-data publishing (VDP) (also known as database publishing) is a term referring to the output of a variable composition system. While these systems can produce both electronically viewable and hard-copy (print) output, the "variable-data publishing" term today often distinguishes output destined for electronic viewing, rather than that which is destined for hard-copy print (e.g. variable data printing). Essentially the same techniques are employed to perform variable-data publishing, as those utilized with variable data printing. The difference is in the interpretation for output. While variable-data printing may be interpreted to produce various print streams or page-description files (e.g. AFP/IPDS, PostScript, PCL), variable-data publishing produces electronically viewable files, most commonly seen in the forms of PDF, HTML, or XML. Variable-data composition involves the use of data to conditionally: exhibit text (static blocks and/or variable content) exhibit images select fonts select colors format page layouts & flows Variable-data may be as simple as an address block or salutation. However, it can be any or all of the document's textual content—including words, sentences, paragraphs, pages, or the entire document. In other words, it can make up as little or as much of the document as the composer desires. Variable data may also be used to exhibit various images, such as logos, products, or membership photos. Further, variable-data can be used to build rule-based design schemes, including fonts, colors, and page formats. The possibilities are vast. The variable-data tools available today, make it possible to perform variable-data composition at nearly every stage of document production. However, the level of control that can be achieved varies, based upon how far into the document production process a variable-data tool is deployed. For example, if variable-data insertion occurs just prior to output...it's not likely that the text flow or layout can be altered with nearly as much control as would be available at the time of initial document composition. Many organizations will produce multiple forms of output (aka: multi-channel output), for the same document. This ensures that the published content is available to recipients via any form of access method they might require. When multi-channel output is utilized, integrity between those output channels often becomes important. Variable-data publishing may be performed on everything from a personal computer to a mainframe system. However, the speed and practical output volumes which can be achieved are directly affected by the computer power utilized. == Origin of the concept == The term variable-data publishing was likely an offshoot of the term "variable-data printing", first introduced to the printing industry by Frank Romano, Professor Emeritus, School of Print Media, at the College of Imaging Arts and Sciences at Rochester Institute of Technology. However, the concept of merging static document elements and variable document elements predates the term and has seen various implementations ranging from simple desktop 'mail merge', to complex mainframe applications in the financial and banking industry. In the past, the term VDP has been most closely associated with digital printing machines. However, in the past 3 years the application of this technology has spread to web pages, emails, and mobile messaging.

    Read more →