AI Chatbot Soulmate

AI Chatbot Soulmate — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Rake (software)

    Rake (software)

    Rake is a software task management and build automation tool created by Jim Weirich. It allows the user to specify tasks, describe dependencies, and group tasks into namespaces. It is similar to SCons and Make. Rake is written in Ruby and has been part of the Ruby standard library since version 1.9.

    == Examples ==

    The tasks to be executed are defined in a configuration file called a Rakefile. A Rakefile has no special syntax and contains executable Ruby code.

    === Tasks ===

    The basic unit in Rake is the task. A task has a name and an action block that defines its functionality. For example, a task called greet could output the text "Hello, Rake!" to the console.

    When defining a task, you can optionally add dependencies, so that one task depends on the successful completion of another. For example, if a "seed" task depends on a "migrate" task, calling "seed" first executes "migrate" and only then proceeds with "seed".

    Tasks can also be made more versatile by accepting arguments. For example, a "generate_report" task might take a date as an argument and use the current date if no argument is supplied.

    A special type of task is the file task, which specifies how a file is created. A file task could, for example, be given two object files, "a.o" and "b.o", to create an executable program.

    Another useful tool is the directory convenience method, which creates directories on demand.

    === Rules ===

    When a file is named as a prerequisite but has no file task defined for it, Rake attempts to synthesize a task by looking at a list of rules supplied in the Rakefile. For example, suppose we try to invoke the task "mycode.o" with no task defined for it. If the Rakefile has a rule for targets ending in ".o", that rule will synthesize any task whose name ends in ".o". The rule has as a prerequisite that a source file with the extension ".c" must exist.
    If Rake is able to find a file named "mycode.c", it will automatically create a task that builds "mycode.o" from "mycode.c". If the file "mycode.c" does not exist, Rake will attempt to recursively synthesize a rule for it.

    When a task is synthesized from a rule, the source attribute of the task is set to the matching source file. This allows users to write rules with actions that reference the source file.

    === Advanced rules ===

    Any regular expression may be used as the rule pattern. Additionally, a proc may be used to calculate the name of the source file. This allows for complex patterns and sources, and the ".o" rule described above can be written equivalently in this form.

    NOTE: Because of a quirk in Ruby syntax, parentheses are required around a rule when the first argument is a regular expression.

    Rules of this kind might also be used for Java files.

    === Namespaces ===

    To better organize big Rakefiles, tasks can be grouped into namespaces.

  • Materials informatics

    Materials informatics

    Materials informatics is a field of study that applies the principles of informatics and data science to materials science and engineering to improve the understanding, use, selection, development, and discovery of materials. The term "materials informatics" is frequently used interchangeably with "data science", "machine learning", and "artificial intelligence" by the community. It is an emerging field that aims to achieve high-speed and robust acquisition, management, analysis, and dissemination of diverse materials data, with the goal of greatly reducing the time and risk required to develop, produce, and deploy new materials, a process that generally takes longer than 20 years.

    This field of endeavor is not limited to some traditional understandings of the relationship between materials and information. Narrower interpretations include combinatorial chemistry, process modeling, materials databases, materials data management, and product life cycle management. Materials informatics is at the convergence of these concepts, but also transcends them and has the potential to achieve greater insights and deeper understanding by applying lessons learned from data gathered on one type of material to others. By gathering appropriate metadata, the value of each individual data point can be greatly expanded.

    == Databases ==

    Databases are essential for any informatics research and applications. In materials informatics, many databases exist, containing both empirical data obtained experimentally and theoretical data obtained computationally. Big data that can be used for machine learning is particularly difficult to obtain for experimental data because of the lack of a standard for reporting data and the variability in the experimental environment. This lack of big data has led to a growing effort to develop machine learning techniques that can make use of extremely small data sets.
    On the other hand, large, uniform databases of theoretical density functional theory (DFT) calculations exist. These databases have proven their utility in high-throughput material screening and discovery. Some common DFT databases and high-throughput (HT) tools are listed below:

      • Databases: MaterialsProject.org, MaterialsWeb.org (University of Florida)
      • HT software: Pymatgen, MPInterfaces, Matminer

    == Beyond computational methods? ==

    The concept of materials informatics is addressed by the Materials Research Society. For example, materials informatics was the theme of the December 2006 issue of the MRS Bulletin. The issue was guest-edited by John Rodgers of Innovative Materials, Inc., and David Cebon of Cambridge University, who described the "high payoff for developing methodologies that will accelerate the insertion of materials, thereby saving millions of investment dollars."

    The editors concentrated on the narrower definition of materials informatics, which centers on computational methods to process and interpret data. They stated that "specialized informatics tools for data capture, management, analysis, and dissemination" and "advances in computing power, coupled with computational modeling and simulation and materials properties databases" will enable such accelerated insertion of materials.

    A broader definition of materials informatics goes beyond the use of computational methods to carry out the same experimentation. It views materials informatics as a framework in which a measurement or computation is one step in an information-based learning process that uses the power of a collective to achieve greater efficiency in exploration. When properly organized, this framework crosses materials boundaries to uncover fundamental knowledge of the basis of physical, mechanical, and engineering properties.

    == Challenges ==

    While many believe in the future of informatics in the materials development and scaling process, many challenges remain.
    Hill et al. write that "Today, the materials community faces serious challenges to bringing about this data-accelerated research paradigm, including diversity of research areas within materials, lack of data standards, and missing incentives for sharing, among others. Nonetheless, the landscape is rapidly changing in ways that should benefit the entire materials research enterprise." The remaining tension between traditional materials development methodologies and more computational, machine-learning, and analytics-driven approaches will likely exist for some time as the materials industry overcomes some of the cultural barriers to fully embracing such new ways of thinking.

    == Analogy from Biology ==

    The overarching goals of bioinformatics and systems biology may provide a useful analogy. Andrew Murray of Harvard University expresses the hope that such an approach "will save us from the era of 'one graduate student, one gene, one PhD'". Similarly, the goal of materials informatics is to save us from "one graduate student, one alloy, one PhD". Such goals will require more sophisticated strategies and research paradigms than applying data-science methods to the same set of tasks currently undertaken by students.

  • Enterprise Objects Framework

    Enterprise Objects Framework

    The Enterprise Objects Framework, or simply EOF, was introduced by NeXT in 1994 as a pioneering object-relational mapping product for its NeXTSTEP and OpenStep development platforms. EOF abstracts the process of interacting with a relational database by mapping database rows to Java or Objective-C objects. This largely relieves developers from writing low-level SQL code.

    EOF enjoyed some niche success in the mid-1990s among financial institutions that were attracted to the rapid application development advantages of NeXT's object-oriented platform. Since Apple Inc.'s merger with NeXT in 1996, EOF has evolved into a fully integrated part of WebObjects, an application server also originally from NeXT. Many of the core concepts of EOF re-emerged as part of Core Data, which further abstracts the underlying data formats to allow it to be based on non-SQL stores.

    == History ==

    In the early 1990s NeXT Computer recognized that connecting to databases was essential to most businesses and yet also potentially complex. Every data source has a different data-access language (or API), driving up the costs to learn and use each vendor's product. The NeXT engineers wanted to apply the advantages of object-oriented programming by getting objects to "talk" to relational databases. As the two technologies are very different, the solution was to create an abstraction layer, insulating developers from writing the low-level procedural code (SQL) specific to each data source.

    The first attempt came in 1992 with the release of Database Kit (DBKit), which wrapped an object-oriented framework around any database. Unfortunately, NEXTSTEP at the time was not powerful enough and DBKit had serious design flaws. NeXT's second attempt came in 1994 with the Enterprise Objects Framework (EOF) version 1, a complete rewrite that was far more modular and OpenStep compatible.
    EOF 1.0 was the first product released by NeXT using the Foundation Kit and introduced autoreleased objects to the developer community. The development team at the time was only four people: Jack Greenfield, Rich Williamson, Linus Upson and Dan Willhite. EOF 2.0, released in late 1995, further refined the architecture, introducing the editing context. At that point, the development team consisted of Dan Willhite, Craig Federighi, Eric Noyau and Charly Kleissner.

    EOF achieved a modest level of popularity in the financial programming community in the mid-1990s, but it would come into its own with the emergence of the World Wide Web and the concept of web applications. It was clear that EOF could help companies plug their legacy databases into the Web without any rewriting of that data. With the addition of frameworks to do state management, load balancing and dynamic HTML generation, NeXT was able to launch the first object-oriented Web application server, WebObjects, in 1996, with EOF at its core.

    In 2000, Apple Inc. (which had merged with NeXT) officially dropped EOF as a standalone product, meaning that developers would be unable to use it to create desktop applications for the forthcoming Mac OS X. It would, however, continue to be an integral part of a major new release of WebObjects. WebObjects 5, released in 2001, was significant because its frameworks had been ported from their native Objective-C programming language to the Java language. Critics of this change argue that most of the power of EOF was a side effect of its Objective-C roots, and that EOF lost the beauty or simplicity it once had. Third-party tools, such as EOGenerator, help fill the deficiencies introduced by Java (mainly due to the loss of categories).

    The Objective-C code base was re-introduced with some modifications to desktop application developers as Core Data, part of Apple's Cocoa API, with the release of Mac OS X Tiger in April 2005.

    == How EOF works ==

    Enterprise Objects provides tools and frameworks for object-relational mapping. The technology specializes in providing mechanisms to retrieve data from various data sources, such as relational databases via JDBC and JNDI directories, and mechanisms to commit data back to those data sources. These mechanisms are designed in a layered, abstract approach that allows developers to think about data retrieval and commitment at a higher level than a specific data source or data source vendor.

    Central to this mapping is a model file (an "EOModel") that you build with a visual tool, either EOModeler or the EOModeler plug-in to Xcode. The mapping works as follows:

      • Database tables are mapped to classes.
      • Database columns are mapped to class attributes.
      • Database rows are mapped to objects (or class instances).

    You can build data models based on existing data sources, or you can build data models from scratch and then use them to create data structures (tables, columns, joins) in a data source. The result is that database records can be transposed into Java objects.

    The advantage of using data models is that applications are isolated from the idiosyncrasies of the data sources they access. This separation of an application's business logic from database logic allows developers to change the database an application accesses without needing to change the application.

    EOF provides a level of database transparency not seen in other tools. It allows the same model to be used to access different vendor databases, and even allows relationships across different vendor databases, without changing source code.

    Its power comes from exposing the underlying data sources as managed graphs of persistent objects. In simple terms, this means that it organizes the application's model layer into a set of defined in-memory data objects. It then tracks changes to these objects and can reverse those changes on demand, such as when a user performs an undo command.
    Then, when it is time to save changes to the application's data, it archives the objects to the underlying data sources.

    === Using Inheritance ===

    In designing Enterprise Objects, developers can leverage the object-oriented feature known as inheritance. A Customer object and an Employee object, for example, might both inherit certain characteristics from a more generic Person object, such as name, address, and phone number. While this kind of thinking is inherent in object-oriented design, relational databases have no explicit support for inheritance. However, using Enterprise Objects, you can build data models that reflect object hierarchies. That is, you can design database tables to support inheritance by also designing enterprise objects that map to multiple tables or particular views of a database table.

    == Enterprise Objects (EOs) ==

    An Enterprise Object is analogous to what is often known in object-oriented programming as a business object: a class that models a physical or conceptual object in the business domain (e.g. a customer, an order, an item, etc.). What makes an EO different from other objects is that its instance data maps to a data store. Typically, an enterprise object contains key-value pairs that represent a row in a relational database. The key is the column name, and the value is what was in that row in the database. So it can be said that an EO's properties persist beyond the life of any particular running application.

    More precisely, an Enterprise Object is an instance of a class that implements the com.webobjects.eocontrol.EOEnterpriseObject interface.

    An Enterprise Object has a corresponding model (called an EOModel) that defines the mapping between the class's object model and the database schema. However, an enterprise object doesn't explicitly know about its model. This level of abstraction means that database vendors can be switched without affecting the developer's code.
    This gives Enterprise Objects a high degree of reusability.

    == EOF and Core Data ==

    Despite their common origins, the two technologies diverged, with each technology retaining a subset of the features of the original Objective-C code base, while adding some new features.

    === Features Supported Only by EOF ===

    EOF supports custom SQL; shared editing contexts; nested editing contexts; and pre-fetching and batch faulting of relationships, all features of the original Objective-C implementation not supported by Core Data. Core Data also does not provide the equivalent of an EOModelGroup; instead, the NSManagedObjectModel class provides methods for merging models from existing models, and for retrieving merged models from bundles.

    === Features Supported Only by Core Data ===

    Core Data supports fetched properties; multiple configurations within a managed object model; local stores; store aggregation (the data for a given entity may be spread across multiple stores); customization and localization of property names and validation warnings; and the use of predicates for property validation. These features of the original Objective-C implementation are not supported by the Java implementation.

  • Artificial Intelligence Applications Institute

    Artificial Intelligence Applications Institute

    The Artificial Intelligence Applications Institute (AIAI) at the School of Informatics at the University of Edinburgh is a non-profit technology transfer organisation that promotes research in the field of artificial intelligence.

    == History ==

    The Artificial Intelligence Applications Institute (AIAI) was founded in 1983 at the University of Edinburgh as a specialist research and technology-transfer unit focusing on the practical uses of artificial intelligence (AI). The institute was established by Professor Jim Howe and colleagues from the Science and Engineering Research Council (SERC) Special Interest Group in AI in the Department of Artificial Intelligence, with a mission to apply AI techniques to solve real-world industrial and governmental problems.

    Under the directorship of Austin Tate, who served from 1985 to 2019, AIAI became one of the leading UK research centres devoted to AI programming systems, intelligent planning systems, decision support, and knowledge-based engineering. It collaborated with academic partners and with organisations such as the European Space Agency and the UK Ministry of Defence.

    In 2001, AIAI joined the newly created Centre for Intelligent Systems and their Applications (CISA) within the University's School of Informatics. In December 2019, the institute was renamed the Artificial Intelligence and its Applications Institute to reflect a broader integration of fundamental and applied AI research.

    == Research programmes ==

    AIAI's research spans multiple areas of artificial intelligence, including:

      • AI programming systems: Edinburgh Prolog, Edinburgh Common Lisp, Logo;
      • Knowledge representation and reasoning: development of ontologies, rule-based inference, and semantic modelling;
      • Automated planning and scheduling: intelligent task management systems used in aerospace, manufacturing, and emergency response;
      • Natural language processing and intelligent agents: interaction frameworks for human–computer collaboration;
      • AI ethics and decision-making: research into responsible deployment and evaluation of autonomous systems.

    The institute also contributes to interdisciplinary fields such as computational creativity, explainable AI, and human–AI interaction. AIAI maintains close collaboration with the Bayes Centre and the Alan Turing Institute through joint research programmes and doctoral training initiatives.

    == Technology transfer and impact ==

    From its inception, AIAI has combined academic research with technology-transfer activity, offering professional training, industrial consultancy, and bespoke software systems. It pioneered one of the earliest knowledge-based project-management systems, O-Plan, which later evolved into the I-Plan framework used for autonomous planning and workflow management.

  • Healthy Together

    Healthy Together

    Healthy Together is a health technology company that provides software for health and human services departments. Healthy Together supports a "One Door" approach to eligibility, enrollment, and management for programs like Medicaid, the Supplemental Nutrition Assistance Program, TANF and WIC, as well as behavioral health (988), disease surveillance, vital records, child welfare and more. The platform is intended to increase the reach and efficacy of program initiatives, improve health equity and reduce cost. The software is available in the United States, with current deployments in Florida and Oklahoma. The United States Department of Veterans Affairs also uses Healthy Together's mobile platform.

    == Development ==

    Healthy Together launched in March 2020 and builds software for public health and health and human services departments. The Florida Department of Health began using the platform in September 2020 to deliver real-time test results to residents. Over 50% of households in Florida have adopted the mobile application.

    On December 6, 2022, the Advanced Technology Academic Research Center (ATARC) awarded Healthy Together and the State of Florida's Department of Health a Digital Experience Award at its 2022 GITEC Emerging Technology Award Ceremony in Washington, D.C., to recognize the success of the project. The partnership was also highlighted on the Federal News Network's show Federal Drive.

    The platform is also used at universities in Oklahoma.

    In November 2022, the United States Department of Veterans Affairs and Healthy Together announced a collaboration to expand access to health records for Veterans. The platform provides 18 million Veterans with access to their health information through their smartphones and mobile devices. In December 2022, the integration was recognized as one of Healthcare IT News' Top 10 stories of 2022.

  • XOR swap algorithm

    XOR swap algorithm

    In computer programming, the exclusive or swap (sometimes shortened to XOR swap) is an algorithm that uses the exclusive or bitwise operation to swap the values of two variables without using the temporary variable that is normally required. The algorithm is primarily a novelty and a way of demonstrating properties of the exclusive or operation. It is sometimes discussed as a program optimization, but there are almost no cases where swapping via exclusive or provides a benefit over the standard, obvious technique.

    == The algorithm ==

    Conventional swapping requires the use of a temporary storage variable. Using the XOR swap algorithm, however, no temporary storage is needed. The algorithm consists of three assignments performed in order: X := X XOR Y, then Y := Y XOR X, then X := X XOR Y.

    Since XOR is a commutative operation, either X XOR Y or Y XOR X can be used interchangeably in any of these three assignments. Note that on some architectures the first operand of the XOR instruction specifies the target location at which the result of the operation is stored, preventing this interchangeability. The algorithm typically corresponds to three machine-code instructions, one per assignment.

    In System/370 assembly, R1 and R2 are distinct registers, and each XR operation leaves its result in the register named in the first argument. In x86 assembly, values X and Y are in registers eax and ebx (respectively), and xor places the result of the operation in the first register. (Note: x86 provides an XCHG instruction, so a triple XOR does not make sense on this architecture.) In RISC-V assembly, values X and Y are in registers x10 and x11, and xor places the result of the operation in the first operand.
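
    The three XOR assignments can be sketched in C as follows. This is a minimal illustration under stated assumptions, not a recommended way to swap: the function name xor_swap, the choice of unsigned int, and the NULL-pointer assertion are all made up for the example. The early return handles the case where both pointers refer to the same storage location, where the first XOR would otherwise zero the value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: swaps *x and *y with three XOR steps, no temporary. */
void xor_swap(unsigned int *x, unsigned int *y)
{
    assert(x != NULL && y != NULL);

    /* Guard clause: if x and y alias the same object, x ^= x would zero it. */
    if (x == y) {
        return;
    }

    *x ^= *y; /* *x now holds X XOR Y           */
    *y ^= *x; /* *y now holds the original X    */
    *x ^= *y; /* *x now holds the original Y    */
}
```

    A caller would pass the addresses of two unsigned int variables, for example xor_swap(&a, &b). Passing the same address twice leaves the value unchanged because of the guard.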

    However, in the pseudocode or a high-level language implementation, the algorithm fails if x and y use the same storage location, since the value stored in that location will be zeroed out by the first XOR instruction and then remain zero; it will not be "swapped with itself". This is not the same as x and y having the same values. The trouble only comes when x and y use the same storage location, in which case their values must already be equal. That is, if x and y use the same storage location, then the assignment X := X XOR Y sets x to zero (because x = y, so X XOR Y is zero) and sets y to zero (since it uses the same storage location), causing x and y to lose their original values.

    == Proof of correctness ==

    The binary operation XOR over bit strings of length $N$ exhibits the following properties (where $\oplus$ denotes XOR):

      • L1. Commutativity: $A \oplus B = B \oplus A$
      • L2. Associativity: $(A \oplus B) \oplus C = A \oplus (B \oplus C)$
      • L3. Identity exists: there is a bit string, 0 (of length $N$), such that $A \oplus 0 = A$ for any $A$
      • L4. Each element is its own inverse: for each $A$, $A \oplus A = 0$.

    Suppose that we have two distinct registers R1 and R2, with initial values $A$ and $B$ respectively. We perform the three assignments in sequence and reduce the results using the properties listed above. After R1 := R1 XOR R2, R1 holds $A \oplus B$ and R2 still holds $B$. After R2 := R1 XOR R2, R2 holds $(A \oplus B) \oplus B = A \oplus (B \oplus B) = A \oplus 0 = A$ (by L2, L4 and L3). After R1 := R1 XOR R2, R1 holds $(A \oplus B) \oplus A = (B \oplus A) \oplus A = B \oplus (A \oplus A) = B \oplus 0 = B$ (by L1, L2, L4 and L3).

    === Linear algebra interpretation ===

    As XOR can be interpreted as binary addition and a pair of bits can be interpreted as a vector in a two-dimensional vector space over the field with two elements, the steps in the algorithm can be interpreted as multiplication by 2×2 matrices over the field with two elements. For simplicity, assume initially that x and y are each single bits, not bit vectors.
    For example, the step X := X XOR Y, which also has the implicit Y := Y, corresponds to the matrix $\left(\begin{smallmatrix}1&1\\0&1\end{smallmatrix}\right)$, as

    \[ \begin{pmatrix}1&1\\0&1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x+y\\y\end{pmatrix}. \]

    The sequence of operations is then expressed as

    \[ \begin{pmatrix}1&1\\0&1\end{pmatrix}\begin{pmatrix}1&0\\1&1\end{pmatrix}\begin{pmatrix}1&1\\0&1\end{pmatrix}=\begin{pmatrix}0&1\\1&0\end{pmatrix} \]

    (working with binary values, so $1+1=0$), which expresses the elementary matrix of switching two rows (or columns) in terms of the transvections (shears) of adding one element to the other.

    To generalize to the case where X and Y are not single bits but bit vectors of length $n$, these 2×2 matrices are replaced by $2n \times 2n$ block matrices such as $\left(\begin{smallmatrix}I_{n}&I_{n}\\0&I_{n}\end{smallmatrix}\right)$.

    These matrices operate on values, not on variables (with storage locations), hence this interpretation abstracts away from issues of storage location and the problem of both variables sharing the same storage location.

    == Code example ==

    A C function that implements the XOR swap algorithm should first check whether the two addresses are distinct, using a guard clause to exit the function early if they are equal. Without that check, if they were equal, the algorithm would fold to a triple x ^= x, resulting in zero.

    == Reasons for avoidance in practice ==

    On modern CPU architectures, the XOR technique can be slower than using a temporary variable to do swapping. At least on recent x86 CPUs, both by AMD and Intel, moving between registers regularly incurs zero latency. (This is called MOV-elimination.)
    Even if no architectural register is available to use, the XCHG instruction will be at least as fast as the three XORs taken together. Another reason is that modern CPUs strive to execute instructions in parallel via instruction pipelines. In the XOR technique, the inputs to each operation depend on the results of the previous operation, so they must be executed in strictly sequential order, negating any benefits of instruction-level parallelism.

    === Aliasing ===

    The XOR swap is also complicated in practice by aliasing. If an attempt is made to XOR-swap the contents of some location with itself, the result is that the location is zeroed out and its value lost. Therefore, XOR swapping must not be used blindly in a high-level language if aliasing is possible. This issue does not apply if the technique is used in assembly to swap the contents of two registers.

    Similar problems occur with call by name, as in Jensen's Device, where swapping i and A[i] via a temporary variable yields incorrect results because the arguments are related: swapping via temp = i; i = A[i]; A[i] = temp changes the value of i in the second statement, which then results in the incorrect i value for A[i] in the third statement.

    == Variations ==

    The underlying principle of the XOR swap algorithm can be applied to any operation meeting criteria L1 through L4 above. Replacing XOR by addition and subtraction gives various slightly different, but largely equivalent, formulations. For example, a function AddSwap can compute X := X + Y, then Y := X - Y, then X := X - Y.

    Unlike the XOR swap, this variation requires that the underlying processor or programming language use a method such as modular arithmetic or bignums to guarantee that the computation of X + Y cannot cause an error due to integer overflow. Therefore, it is seen even more rarely in practice than the XOR swap.
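
    The add-and-subtract variation can be sketched in C as follows. This is an illustrative sketch rather than a definitive implementation: the name AddSwap follows the text, while the NULL-pointer assertion and the exact layout are assumptions made for the example. The parameters are deliberately unsigned int, so that an overflowing sum wraps around instead of being an error.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of the addition/subtraction swap on unsigned integers. */
void AddSwap(unsigned int *x, unsigned int *y)
{
    assert(x != NULL && y != NULL);

    /* As with the XOR swap, aliased arguments would destroy the value. */
    if (x != y) {
        *x = *x + *y; /* *x holds X + Y (wraps around on overflow)   */
        *y = *x - *y; /* *y holds (X + Y) - Y, the original X        */
        *x = *x - *y; /* *x holds (X + Y) - X, the original Y        */
    }
}
```

    A caller would use it as AddSwap(&a, &b). The swap still yields the exchanged values when the intermediate sum exceeds the largest unsigned int, because each step is reduced modulo a power of two.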

    However, the implementation of AddSwap above in the C programming language always works even in case of integer overflow, since, according to the C standard, addition and subtraction of unsigned integers follow the rules of modular arithmetic, i.e. are done in the cyclic group $\mathbb{Z}/2^{s}\mathbb{Z}$, where $s$ is the number of bits of unsigned int. Indeed, the correctness of the algorithm follows from the fact that the formulas $(x+y)-y=x$ and $(x+y)-((x+y)-y)=y$ hold in any abelian group. This generalizes the proof for the XOR swap algorithm: XOR is both the addition and the subtraction in the abelian group $(\mathbb{Z}/2\mathbb{Z})^{s}$ (which is the direct sum of $s$ copies of $\mathbb{Z}/2\mathbb{Z}$).

    This doesn't hold when dealing with the signed int type (the default for int). Signed integer overflow is undefined behavior in C, and thus modular arithmetic is not guaranteed by the standard, which may lead to incorrect results.

    The sequence of operations in AddSwap can be expressed via matrix multiplication as:

    \[ \begin{pmatrix}1&-1\\0&1\end{pmatrix}\begin{pmatrix}1&0\\1&-1\end{pmatrix}\begin{pmatrix}1&1\\0&1\end{pmatrix}=\begin{pmatrix}0&1\\1&0\end{pmatrix} \]

    == Application to register allocation ==

    On architectures lacking a dedicated swap instruction, because it avoids the extra temporary register, the XOR swap algorithm is required for optimal register allocatio

  • BioCreative

    BioCreative

    BioCreAtIvE (A critical assessment of text mining methods in molecular biology) is a community-wide effort for evaluating information extraction and text mining developments in the biological domain. It was preceded by the Knowledge Discovery and Data Mining (KDD) Challenge Cup for detection of gene mentions.

    == Community Challenges ==

    === First edition (2004-2005) ===

    Three main tasks were posed at the first BioCreAtIvE challenge: the entity extraction task, the gene name normalization task, and the functional annotation of gene products task. The data sets produced by this contest serve as a Gold Standard training and test set to evaluate and train Bio-NER tools and annotation extraction tools.

    === Second edition (2006-2007) ===

    The second BioCreAtIvE challenge (2006-2007) also had three tasks: detection of gene mentions, extraction of unique identifiers for genes, and extraction of information related to physical protein-protein interactions. Forty-four teams from 13 countries participated.

    === Third edition (2011-2012) ===

    The third edition of BioCreative included for the first time the InterActive Task (IAT), designed to evaluate the practical usability of text mining tools in real-world biocuration tasks.

    === Fifth edition (2016) ===

    BioCreative V had five different tracks, including an interactive task (IAT) for usability of text mining systems and a track using the BioC format for curating information for BioGRID.

    Read more →
  • Applications of artificial intelligence

    Applications of artificial intelligence

    Artificial intelligence is the capability of computational systems to perform tasks that are typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. Artificial intelligence has been used in applications throughout industry and academia. Within the field of artificial intelligence, there are multiple subfields. The subfield of machine learning has been used for various scientific and commercial purposes, including language translation, image recognition, decision-making, credit scoring, and e-commerce. In recent years, massive advancements have been made in the field of generative artificial intelligence, which uses generative models to generate text, images, videos, and other forms of data. This article describes applications of AI in different sectors. == Agriculture == In agriculture, AI has been proposed as a way for farmers to identify areas that need irrigation, fertilization, or pesticide treatments to increase yields, thereby improving efficiency. AI has been used to attempt to classify livestock pig call emotions, automate greenhouses, detect diseases and pests, and optimize irrigation. == AI-assisted software development == == Architecture and design == == Business == A 2023 study found that generative AI increased productivity by 15% in contact centers. Another 2023 study found it increased productivity by up to 40% in writing tasks. An August 2025 review by MIT found that of surveyed companies, 95% did not report any improvement in revenue from the use of AI. A September 2025 article by the Harvard Business Review describes how increased use of AI does not automatically lead to increases in revenue or actual productivity. Referring to "AI generated work content that masquerades as good work, but lacks the substance to meaningfully advance a given task" the article coins the term workslop. 
Per studies done in collaboration with the Stanford Social Media Lab, workslop does not improve productivity and undermines trust and collaboration among colleagues. In telehealth, agentic AI is reportedly facilitating the creation of large business models (millions in annual profit) with 1-2 employees, such as MEDVi, which as of August 2025 only had 2 employees and ~$75M in annual profit for GLP-1 weight-loss telehealth services. == Chatbots == == Computer science == === Programming assistance === ==== AI-assisted software development ==== AI can be used for real-time code completion, chat, and automated test generation. These tools are typically integrated with editors and IDEs as plugins. AI-assisted software development systems differ in functionality, quality, speed, and approach to privacy. Creating software primarily via AI is known as "vibe coding". Code created or suggested by AI can be incorrect or inefficient. The use of AI-assisted coding can potentially speed up software development, but can also slow down the process by creating more work when debugging and testing. The rush to prematurely adopt AI technology can also incur additional technical debt. AI also requires additional consideration and careful review for cybersecurity, since AI coding software is trained on a wide range of code of inconsistent quality and often replicates poor practices. ==== Neural network design ==== AI can be used to create other AIs. For example, around November 2017, Google's AutoML project to evolve new neural net topologies created NASNet, a system optimized for ImageNet and POCO F1. NASNet's performance exceeded all previously published performance on ImageNet. ==== Quantum computing ==== Research and development of quantum computers has been performed with machine learning algorithms. 
For example, there is a prototype, photonic, quantum memristive device for neuromorphic computers (NC)/artificial neural networks and NC-using quantum materials with some variety of potential neuromorphic computing-related applications. The use of quantum machine learning for quantum simulators has been proposed for solving physics and chemistry problems. === Historical contributions === AI researchers have created many tools to solve the most difficult problems in computer science. Many of their inventions have been adopted by mainstream computer science and are no longer considered AI. All of the following were originally developed in AI laboratories: time sharing; interactive interpreters; graphical user interfaces and the computer mouse; rapid application development environments; the linked list data structure; automatic storage management; symbolic programming; functional programming; dynamic programming; object-oriented programming; optical character recognition; constraint satisfaction. == Customer service == === Human resources === AI programs have been used in hiring processes to screen resumes and rank candidates based on their qualifications, predict a candidate's likelihood of success in a given role, and automate repetitive communication tasks using chatbots. Studies on these programs have identified tendencies for gender bias, favoring male names and male-coded characteristics, as well as bias against disabled candidates and racial minorities. === Online and telephone customer service === AI underlies avatars (automated online assistants) on web pages. It can reduce operation and training costs. Pypestream automated customer service for its mobile application to streamline communication with customers. A Google app analyzes language and converts speech into text. The platform can identify angry customers through their language and respond appropriately. 
Amazon uses a chatbot for customer service that can perform tasks like checking the status of an order, cancelling orders, offering refunds and connecting the customer with a human representative. Generative AI (GenAI), such as ChatGPT, is increasingly used in business to automate tasks and enhance decision-making. === Hospitality === In the hospitality industry, AI is used to reduce repetitive tasks, analyze trends, interact with guests, and predict customer needs. AI hotel services come in the form of chatbots, applications, virtual voice assistants and service robots. == Education == In educational institutions, AI has been used to automate routine tasks such as attendance tracking, grading, and marking. AI tools have also been used to monitor student progress and analyze learning behaviors, with the goal of facilitating timely interventions for students facing academic challenges. == Energy and environment == === Energy system === The U.S. Department of Energy wrote in an April 2024 report that AI may have applications in modeling power grids, reviewing federal permits with large language models, predicting levels of renewable energy production, and improving the planning process for electric vehicle charging networks. Other studies have suggested that machine learning can be used for energy consumption prediction and scheduling, e.g. to help with renewable energy intermittency management (see also: smart grid and climate change mitigation in the power grid). === Environmental monitoring === Autonomous ships that monitor the ocean, AI-driven satellite data analysis, passive acoustics or remote sensing and other applications of environmental monitoring make use of machine learning. For example, "Global Plastic Watch" is an AI-based satellite monitoring platform for analyzing and tracking plastic waste sites. It is meant to help prevent plastic pollution (primarily ocean pollution) by helping identify who mismanages plastic waste, and where, when it is dumped into oceans. 
=== Early-warning systems === Machine learning can be used to spot early-warning signs of disasters and environmental issues, possibly including natural pandemics, earthquakes, landslides, heavy rainfall, long-term water supply vulnerability, tipping-points of ecosystem collapse, cyanobacterial bloom outbreaks, and droughts. === Economic and social challenges === The University of Southern California launched the Center for Artificial Intelligence in Society, with the goal of using AI to address problems such as homelessness. Stanford researchers use AI to analyze satellite images to identify high poverty areas. == Entertainment and media == === Media === AI applications analyze media content such as movies, TV programs, advertisement videos or user-generated content. The solutions often involve computer vision. Typical scenarios include the analysis of images using object recognition or face recognition techniques, or the analysis of video for recognizing scenes, objects or faces. AI-based media analysis can facilitate media search, the creation of descriptive keywords for content, content policy monitoring (such as verifying the suitability of content for a particular TV viewing time), speech to text for archival or other purposes, and the detection of logos, products or celebrity faces for ad placement. Motion interpolation; pixel-art scaling algorithms; image scaling; Imag

    Read more →
  • Texture artist

    Texture artist

    A texture artist is an individual who develops textures for digital media, usually for video games, movies, web sites and television shows or things like 3D posters. These textures can be in the form of 2D or (rarely) 3D art that may be overlaid onto a polygon mesh to create a realistic 3D model. Texture artists often take advantage of web sites for the purposes of marketing their art and self-promotion of their skills with the goal of gaining employment from a professional game studio or to join a team working on a "mod" (modification) of an existing game in hopes of establishing industry or trade credentials.

    Read more →
  • Operational historian

    Operational historian

    In manufacturing, an operational historian is a time-series database application that is developed for operational process data. Historian software is often embedded or used in conjunction with standard DCS and PLC control systems to provide enhanced data capture, validation, compression, and aggregation capabilities. Historians have been deployed in almost every industry and contribute to functions such as supervisory control, performance monitoring, quality assurance, and, more recently, machine learning applications which can learn from vast quantities of historical data. These systems were originally developed to capture instrumentation and control data, which led many to use the term "tag" for a stream of process data, referring to the physical "tags" which had been placed on instrumentation for manually capturing data. Raw data may be accessed via OPC HDA, SQL, or REST API interfaces. == Operational Support == Operational historians are typically used within the manufacturing facility by engineers and operators for supervisory functions and analysis. An operational historian will typically capture all instrumentation and control data, whereas an enterprise historian that is deployed to support business functions will capture only a subset of the plant data. Typically, these applications offer data access through dedicated APIs (Application Programming Interfaces) and SDKs (Software Development Kits) which offer high-performance read and write operations. These operate through vendor-specific or custom applications. Front-end tools for trending process data over time are the most common interfaces to these databases. Because these applications are typically deployed next to or near the source of their process data, they are often marketed and sold as 'real-time database systems.' This distinction varies among vendors, who often have to make tradeoffs in performance between data capture and presentation, and application and analysis functionality. 
The following is a list of typical challenges for operational historians: data collection from instrumentation and controls; storage and archiving of very large volumes of data; organization of data in the form of "tags" or "points"; limiting of monitoring (alarms) and validation; aggregation and interpolation; manual data entry (MDE). == Data access == As opposed to enterprise historians, the data access layer in the operational historian is designed to offer sophisticated data fetching modes without complex information analysis facilities. The following settings are typically available for data access operations: data scope (single point or tag, history based on time range, history based on sample count); request modes (raw data, last-known value, aggregation, interpolation); sampling (single point, all points without sampling, all points with interval sampling); data omission (based on the sample quality, based on the sample value, based on the count). Even though operational historians are rarely relational database management systems, they often offer SQL-based interfaces to query the database. In most such implementations, the dialect departs from the SQL standard in order to provide syntax for specifying the parameters of data access operations.
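
The request modes listed above can be illustrated with a minimal Python sketch. It is not taken from any historian product: the `samples` list, the function names and the choice of a mean as the aggregation are all assumptions made for illustration, and a real historian would expose these modes through its own API or SQL dialect.

```python
from bisect import bisect_right

# Hypothetical samples for one tag: (timestamp in seconds, value), sorted by time.
samples = [(0.0, 10.0), (10.0, 20.0), (30.0, 40.0)]


def last_known(samples, t):
    """Last-known value mode: most recent value at or before time t, else None."""
    times = [ts for ts, _ in samples]
    i = bisect_right(times, t)
    return samples[i - 1][1] if i > 0 else None


def interpolate(samples, t):
    """Interpolation mode: linear interpolation between the bracketing samples."""
    times = [ts for ts, _ in samples]
    i = bisect_right(times, t)
    if i == 0:
        return None  # t is before the first recorded sample
    t0, v0 = samples[i - 1]
    if i == len(samples):
        # No extrapolation past the last sample; exact hits are still returned.
        return v0 if t == t0 else None
    t1, v1 = samples[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def aggregate_mean(samples, start, end):
    """Aggregation mode: mean of the raw values with start <= timestamp < end."""
    window = [v for ts, v in samples if start <= ts < end]
    return sum(window) / len(window) if window else None
```

Raw-data mode would simply return the stored samples in the requested scope. The sketch assumes strictly increasing timestamps and ignores sample quality, which a production system would typically use for data omission.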

    Read more →
  • Spatial computing

    Spatial computing

    Spatial computing refers to 3D human–computer interaction techniques that are perceived by users as taking place in the real world, in and around their bodies and physical environments, instead of constrained to and perceptually behind computer screens or in purely virtual worlds. This concept inverts the long-standing practice of teaching people to interact with computers in digital environments, and instead teaches computers to better understand and interact with people more naturally in the human world. This concept overlaps with and encompasses others including extended reality, augmented reality, mixed reality, natural user interface, contextual computing, affective computing, and ubiquitous computing. The usage for labeling and discussing these adjacent technologies is imprecise. Spatial computing devices include sensors—such as RGB cameras, depth cameras, 3D trackers, inertial measurement units, or other tools—to sense and track nearby human bodies (including hands, arms, eyes, legs, mouths) during ordinary interactions with people and computers in a 3D space. They further use computer vision to attempt to understand real world scenes, such as rooms, streets or stores, to read labels, to recognize objects, create 3D maps, and more. Quite often they also use extended reality and mixed reality to superimpose virtual 3D graphics and virtual 3D audio onto the human visual and auditory system as a way of providing information more naturally and contextually than traditional 2D screens. Spatial computing often refers to personal computing devices like headsets and headphones, but other human-computer interactions that leverage real-time spatial positioning for displays, like projection mapping or cave automatic virtual environment displays, can also be considered spatial computing if they leverage human-computer input for the participants. 
== History == The term "spatial computing" apparently originated in the field of GIS around 1985 or earlier to describe computations on large-scale geospatial information. Early examples of spatial computing in GIS include ArcInfo and its iterations, initially released in 1981, a part of ArcGIS along with ArcEditor, which together provide mapping, analysis, editing, and geoprocessing for geodatabases. This is somewhat related to the modern use, but on the scale of continents, cities, and neighborhoods. Modern spatial computing is more centered on the human scale of interaction, around the size of a living room or smaller. But it is not limited to that scale in the aggregate. In the early 1990s, as the field of virtual reality was beginning to be commercialized beyond academic and military labs, a startup called Worldesign in Seattle used the term Spatial Computing to describe the interaction between individual people and 3D spaces, operating more at the human end of the scale than previous GIS examples may have contemplated. The company built a CAVE-like environment it called the Virtual Environment Theater, whose 3D experience was of a virtual flyover of the Giza Plateau, circa 3000 BC. Robert Jacobson, CEO of Worldesign, attributes the origins of the term to experiments at the Human Interface Technology Lab, at the University of Washington, under the direction of Thomas A. Furness III. Jacobson was a co-founder of that lab before spinning off this early VR startup. In 1997, an academic publication by T. Caelli, Peng Lam, and H. Bunke called "Spatial Computing: Issues in Vision, Multimedia and Visualization Technologies" introduced the term more broadly for academic audiences, focusing on a variety of topics such as image processing, dead reckoning navigation, object recognition, and visualizing spatial data. 
The specific term "spatial computing" was later referenced again in 2003 by Simon Greenwold, as "human interaction with a machine in which the machine retains and manipulates referents to real objects and spaces". MIT Media Lab alumnus John Underkoffler gave a TED talk in 2010 giving a live demo of the multi-screen, multi-user spatial computing systems being developed by Oblong Industries, which sought to bring to life the futuristic interfaces conceptualized by Underkoffler in the films Minority Report and Iron Man. Google Earth, initially released by Keyhole Inc. in 2001 and re-released by Google in 2005 can be considered a capable GIS and includes advanced geospatial tools and capabilities. == Notable instances of the use of spatial computing == In 2019, Microsoft HoloLens released a video outlining Airbus' partnership with Microsoft Azure to utilize the latter's mixed reality services for streamlining and improving the aircraft design process, as well as reducing the error in development. Airbus utilized the HoloLens 2 to this end, and the executive vice president of engineering claimed that their design process' validation phases were "hugely accelerated by 80 percent", as well as "strongly believe[d]" that up to 30% improvements in their industrial tasks could be attained with the HoloLens 2. During the presentational video, Airbus cited the maturity of Microsoft Azure services as "key" for their usage of the HoloLens 2. Also in 2019, the U.S. army partnered with Microsoft to produce a HoloLens based Integrated Visual Augmentation System (IVAS) to enhance infantry members by giving troops various abilities, including but not limited to using holographs to train, projecting 3D maps into their vision, and seeing through smoke and corners. Microsoft received tens of thousands of hours of feedback for their systems by 2021. 
Sergeant Marc Krugh at the time claimed that Microsoft's partnership has already caused the army to rethink some of its troops' operation strategy. == Products == === Apple Vision Pro === Apple announced Apple Vision Pro, a device it markets as a "spatial computer", on June 5, 2023. It includes several features such as Spatial Audio, two 4K micro-OLED displays, the Apple R1 chip and eye tracking, and released in the United States on February 2, 2024. In announcing the platform, Apple invoked its history of popularizing 2D graphical user interfaces that supplanted prior human-computer interface mechanisms such as the command line. Apple suggests the introduction of spatial computing as a new category of interactive device, on the same level of importance as the introduction of the 2D GUI. Apple Vision Pro runs on a new operating system called visionOS, which combines eye tracking, gesture recognition, and voice input to enable immersive interaction without physical controllers. The platform is aimed at productivity, entertainment, collaboration, and enterprise use cases. === Magic Leap === Magic Leap had also previously used the term “spatial computing” to describe its own devices. Its first headset, the Magic Leap 1, was released on August 8, 2018. Magic Leap’s technology enables the display of content into the real world using an optical see-through head-mounted display, which projects an overlay of a virtual world into the user’s field of view. This allows for an experience where the physical and digital worlds are perceived simultaneously. === Microsoft Hololens === On February 24, 2019, Microsoft released the HoloLens 2, which includes mixed reality tools and can generate interactable, manipulatable holograms in 3D space. The holograms in question can be related to a physical object or completely independent and free-floating. 
The Azure Spatial Anchors cloud service was released simultaneously, which gives the holograms the capability to persist across time and across many individuals' devices. === Meta Quest === The Meta Quest 3, a mixed reality gaming headset that includes spatial audio and two color cameras and lets users interact with virtual characters, was released on October 9, 2023, at a notably cheaper price than the Apple Vision Pro, but with reduced capabilities. === Snap Spectacles === Spectacles are augmented reality glasses developed by Snap Inc. The latest generation includes a 46-degree stereoscopic display, adjustable tint, and Snapdragon processors. Spectacles allow users to interact with a collection of augmented reality experiences designed for education, entertainment, and utility. Currently, the device is in the hands of selected developers and creators, as part of an experimental AR ecosystem focused on creativity, use case exploration and expression.

    Read more →
  • Least-squares spectral analysis

    Least-squares spectral analysis

    Least-squares spectral analysis (LSSA) is a class of methods for estimating a frequency spectrum by fitting sinusoids to data using a least-squares fit. Unlike Fourier analysis, the most widely used spectral method in science, data need not be equally spaced to use LSSA. Furthermore, while Fourier analysis generally amplifies long-period noise in long or gapped records, LSSA mitigates such problems. The first strictly least-squares LSSA method was developed in 1969 and 1971, and is known as the Vaníček method or the Gauss–Vaniček method, after its inventor Petr Vaníček and Carl Friedrich Gauss, the inventor of the least-squares method for error minimization. A widely known LSSA variant is the Lomb method or the Lomb–Scargle periodogram, based on dated computational simplifications of the Vaníček method introduced in the 1970s and 1980s, first by Nicholas R. Lomb and later by Jeffrey D. Scargle. Other LSSA variants have been subsequently developed. == Historical background == The close connections between Fourier analysis, the periodogram, and the least-squares fitting of sinusoids have been known for a long time. However, most developments are restricted to complete data sets of equally spaced samples. In 1963, Freek J. M. Barning of Mathematisch Centrum, Amsterdam, handled unequally spaced data by similar techniques, including both a periodogram analysis equivalent to what nowadays is called the Lomb method and least-squares fitting of selected frequencies of sinusoids determined from such periodograms — and connected by a procedure known today as the matching pursuit with post-back fitting or the orthogonal matching pursuit. Petr Vaníček, a Canadian geophysicist and geodesist of the University of New Brunswick, proposed in 1969 also the matching-pursuit approach for equally and unequally spaced data, which he called "successive spectral analysis" and the result a "least-squares periodogram". 
He generalized this method to account for any systematic components beyond a simple mean, such as a "predicted linear (quadratic, exponential, ...) secular trend of unknown magnitude", and applied it to a variety of samples, in 1971. Vaníček's strictly least-squares method was then simplified in 1976 by Nicholas R. Lomb of the University of Sydney, who pointed out its close connection to periodogram analysis. Subsequently, the definition of a periodogram of unequally spaced data was modified and analyzed by Jeffrey D. Scargle of NASA Ames Research Center, who showed that, with minor changes, it becomes identical to Lomb's least-squares formula for fitting individual sinusoid frequencies. Scargle states that his paper "does not introduce a new detection technique, but instead studies the reliability and efficiency of detection with the most commonly used technique, the periodogram, in the case where the observation times are unevenly spaced," and further points out regarding least-squares fitting of sinusoids compared to periodogram analysis, that his paper "establishes, apparently for the first time, that (with the proposed modifications) these two methods are exactly equivalent." Press summarizes the development this way: A completely different method of spectral analysis for unevenly sampled data, one that mitigates these difficulties and has some other very desirable properties, was developed by Lomb, based in part on earlier work by Barning and Vanicek, and additionally elaborated by Scargle. In 1989, Michael J. Korenberg of Queen's University in Kingston, Ontario, developed the "fast orthogonal search" method of more quickly finding a near-optimal decomposition of spectra or other problems, similar to the technique that later became known as the orthogonal matching pursuit. 
== Development of LSSA and variants == === The Vaníček method === In the Vaníček method, a discrete data set is approximated by a weighted sum of sinusoids of progressively determined frequencies using a standard linear regression or least-squares fit. The frequencies are chosen using a method similar to Barning's, but going further in optimizing the choice of each successive new frequency by picking the frequency that minimizes the residual after least-squares fitting (equivalent to the fitting technique now known as matching pursuit with pre-backfitting). The number of sinusoids must be less than or equal to the number of data samples (counting sines and cosines of the same frequency as separate sinusoids). The relationship between the DFT and the approximation of trigonometric functions using the least-squares method is well explained in (Strutz, 2017). A data vector $\phi$ is represented as a weighted sum of sinusoidal basis functions, tabulated in a matrix $\mathbf{A}$ by evaluating each function at the sample times, with weight vector $x$: $\phi \approx \mathbf{A}x$, where the weights vector $x$ is chosen to minimize the sum of squared errors in approximating $\phi$. The solution for $x$ is closed-form, using standard linear regression: $x=(\mathbf{A}^{\mathrm{T}}\mathbf{A})^{-1}\mathbf{A}^{\mathrm{T}}\phi$. Here the matrix $\mathbf{A}$ can be based on any set of functions mutually independent (not necessarily orthogonal) when evaluated at the sample times; functions used for spectral analysis are typically sines and cosines evenly distributed over the frequency range of interest. If we choose too many frequencies in a too-narrow frequency range, the functions will be insufficiently independent, the matrix ill-conditioned, and the resulting spectrum meaningless. 
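
The closed-form fit described above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the Vaníček procedure itself: the sample times, the test signal, the three trial frequencies and all variable names are hypothetical, and the frequencies are fixed in advance rather than chosen successively.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unevenly spaced sample times and a noise-free test signal.
t = np.sort(rng.uniform(0.0, 10.0, size=60))
f_true = 0.7  # cycles per unit time (assumed for the example)
phi = 2.0 * np.sin(2 * np.pi * f_true * t) + 0.5 * np.cos(2 * np.pi * f_true * t)

# Trial frequencies; each contributes one sine and one cosine column to A.
freqs = np.array([0.3, 0.7, 1.1])
A = np.column_stack(
    [np.sin(2 * np.pi * f * t) for f in freqs]
    + [np.cos(2 * np.pi * f * t) for f in freqs]
)

# Normal-equations solution x = (A^T A)^{-1} A^T phi, as in the text.
x_normal = np.linalg.solve(A.T @ A, A.T @ phi)

# The same fit via a least-squares solver, which copes better with
# ill-conditioned A (for example, too many closely spaced frequencies).
x_lstsq, *_ = np.linalg.lstsq(A, phi, rcond=None)

# Power attributed to each trial frequency: sine weight^2 + cosine weight^2.
power = x_lstsq[:3] ** 2 + x_lstsq[3:] ** 2
```

Because the test signal lies exactly in the span of the columns of `A`, both solutions should recover weights close to 2.0 and 0.5 at the 0.7 frequency and near zero elsewhere. Checking `np.linalg.cond(A)` before trusting a spectrum is one way to detect the ill-conditioning the text warns about.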
When the basis functions in $\mathbf{A}$ are orthogonal (that is, not correlated, meaning the columns have zero pair-wise dot products), the matrix $\mathbf{A}^{\mathrm{T}}\mathbf{A}$ is diagonal; when the columns all have the same power (sum of squares of elements), then that matrix is an identity matrix times a constant, so the inversion is trivial. The latter is the case when the sample times are equally spaced and sinusoids chosen as sines and cosines equally spaced in pairs on the frequency interval 0 to a half cycle per sample (spaced by 1/N cycles per sample, omitting the sine phases at 0 and maximum frequency where they are identically zero). This case is known as the discrete Fourier transform, slightly rewritten in terms of measurements and coefficients: $x=\mathbf{A}^{\mathrm{T}}\phi$ (the DFT case for N equally spaced samples and frequencies, within a scalar factor). === The Lomb method === Trying to lower the computational burden of the Vaníček method in 1976 (no longer an issue), Lomb proposed using the above simplification in general, except for pair-wise correlations between sine and cosine bases of the same frequency, since the correlations between pairs of sinusoids are often small, at least when they are not tightly spaced. This formulation is essentially that of the traditional periodogram but adapted for use with unevenly spaced samples. The vector $x$ is a reasonably good estimate of an underlying spectrum, but since any correlations are ignored, $\mathbf{A}x$ is no longer a good approximation to the signal, and the method is no longer a least-squares method, yet it continues to be referred to as such in the literature. 
Rather than just taking dot products of the data with sine and cosine waveforms directly, Scargle modified the standard periodogram formula so as to first find a time delay $\tau$ such that this pair of sinusoids would be mutually orthogonal at sample times $t_{j}$, and also adjusted for the potentially unequal powers of these two basis functions, to obtain a better estimate of the power at a frequency. This procedure made his modified periodogram method exactly equivalent to Lomb's method. The time delay $\tau$ is defined by $\tan 2\omega\tau=\frac{\sum_{j}\sin 2\omega t_{j}}{\sum_{j}\cos 2\omega t_{j}}$. Then the periodogram at frequency $\omega$ is estimated as: $P_{x}(\omega)=\frac{1}{2}\left[\frac{\left[\sum_{j}X_{j}\cos\omega(t_{j}-\tau)\right]^{2}}{\sum_{j}\cos^{2}\omega(t_{j}-\tau)}+\frac{\left[\sum_{j}X_{j}\sin\omega(t_{j}-\tau)\right]^{2}}{\sum_{j}\sin^{2}\omega(t_{j}-\tau)}\right]$, which, as Scargle reports, has the same statistical distribution as the periodogram in the evenly sampled case. At any individual frequency $\omega$, this method gives the same power as does a least-squares fit to sinusoids of that frequency and of the form: $\phi(t)=A\sin\omega t+B\cos\omega t$. In practice, it is always difficult to judge if a given Lomb peak is significant or not, especially when the nature of the noise is unknown, so for example a false-alarm spectr

    Read more →
  • Visual analytics

    Visual analytics

    Visual analytics is a multidisciplinary science and technology field that emerged from information visualization and scientific visualization. It focuses on how analytical reasoning can be facilitated by interactive visual interfaces. == Overview == Visual analytics is "the science of analytical reasoning facilitated by interactive visual interfaces." It can address problems whose size, complexity, and need for closely coupled human and machine analysis may make them otherwise intractable. Visual analytics advances scientific and technological development across multiple domains, including analytical reasoning, human–computer interaction, data transformations, visual representation for computation and analysis, analytic reporting, and the transition of new technologies into practice. As a research agenda, visual analytics brings together several scientific and technical communities from computer science, information visualization, cognitive and perceptual sciences, interactive design, graphic design, and social sciences. Visual analytics integrates new computational and theory-based tools with innovative interactive techniques and visual representations to enable human-information discourse. The design of the tools and techniques is based on cognitive, design, and perceptual principles. This science of analytical reasoning provides the reasoning framework upon which one can build both strategic and tactical visual analytics technologies for threat analysis, prevention, and response. Analytical reasoning is central to the analyst's task of applying human judgments to reach conclusions from a combination of evidence and assumptions. Visual analytics has some overlapping goals and techniques with information visualization and scientific visualization. 
There is currently no clear consensus on the boundaries between these fields, but broadly speaking the three areas can be distinguished as follows:

      • Scientific visualization deals with data that has a natural geometric structure (e.g., MRI data, wind flows).
      • Information visualization handles abstract data structures such as trees or graphs.
      • Visual analytics is especially concerned with coupling interactive visual representations with underlying analytical processes (e.g., statistical procedures, data mining techniques) such that high-level, complex activities can be effectively performed (e.g., sense making, reasoning, decision making).

    Visual analytics seeks to marry techniques from information visualization with techniques from computational transformation and analysis of data. Information visualization forms part of the direct interface between user and machine, amplifying human cognitive capabilities in six basic ways:

      • by increasing cognitive resources, such as by using a visual resource to expand human working memory,
      • by reducing search, such as by representing a large amount of data in a small space,
      • by enhancing the recognition of patterns, such as when information is organized in space by its time relationships,
      • by supporting the easy perceptual inference of relationships that are otherwise more difficult to induce,
      • by perceptual monitoring of a large number of potential events, and
      • by providing a manipulable medium that, unlike static diagrams, enables the exploration of a space of parameter values.

    These capabilities of information visualization, combined with computational data analysis, can be applied to analytic reasoning to support the sense-making process.

    == History ==

    As an interdisciplinary approach, visual analytics has its roots in information visualization, cognitive sciences, and computer science. The term and scope of the field were defined in the early 2000s by researchers such as Jim Thomas, Kristin A. Cook, John Stasko, Pak Chung Wong, Daniel A. Keim and David S. Ebert.

    As a reaction to the September 11, 2001 attacks, the United States Department of Homeland Security was established in late 2002, combining dozens of previously separate government agencies. Building upon earlier work on visual data mining by Daniel A. Keim starting in the late 1990s, this simultaneously led to the development of a research agenda for visual analytics. As part of these efforts, the National Visualization and Analytics Center (NVAC) at Pacific Northwest National Laboratory was established in 2004; its charter was to develop systems to mitigate information overload in the intelligence community after the September 11, 2001 attacks. Their research work determined core challenges, posed open research questions, and positioned visual analytics as a new research domain, in particular through the 2005 research agenda Illuminating the Path.

    In 2006, the IEEE VIS community led by Pak Chung Wong and Daniel A. Keim launched the annual IEEE Conference on Visual Analytics Science and Technology (VAST), providing a dedicated venue for research into visual analytics, which in 2020 merged to form the IEEE Visualization conference. In 2008, the scope and challenges of visual analytics were conceptually defined by Daniel A. Keim and Jim Thomas in their influential book about visual data mining. The domain was further refined as part of the European Commission's FP7 VisMaster program in the late 2000s.
    == Topics ==

    === Scope ===

    Visual analytics is a multidisciplinary field that includes the following focus areas:

      • Analytical reasoning techniques that enable users to obtain deep insights that directly support assessment, planning, and decision making.
      • Data representations and transformations that convert all types of conflicting and dynamic data in ways that support visualization and analysis.
      • Techniques to support production, presentation, and dissemination of the results of an analysis to communicate information in the appropriate context to a variety of audiences.
      • Visual representations and interaction techniques that take advantage of the human eye's broad bandwidth pathway into the mind to allow users to see, explore, and understand large amounts of information at once.

    === Analytical reasoning techniques ===

    Analytical reasoning techniques are the methods by which users obtain deep insights that directly support situation assessment, planning, and decision making. Visual analytics must facilitate high-quality human judgment with a limited investment of the analysts’ time. Visual analytics tools must enable diverse analytical tasks such as:

      • Understanding past and present situations quickly, as well as the trends and events that have produced current conditions.
      • Identifying possible alternative futures and their warning signs.
      • Monitoring current events for emergence of warning signs as well as unexpected events.
      • Determining indicators of the intent of an action or an individual.
      • Supporting the decision maker in times of crisis.

    These tasks will be conducted through a combination of individual and collaborative analysis, often under extreme time pressure. Visual analytics must enable hypothesis-based and scenario-based analytical techniques, providing support for the analyst to reason based on the available evidence.

    === Data representations ===

    Data representations are structured forms suitable for computer-based transformations.
These structures must exist in the original data or be derivable from the data themselves. They must retain the information and knowledge content and the related context within the original data to the greatest degree possible. The structures of underlying data representations are generally neither accessible nor intuitive to the user of the visual analytics tool. They are frequently more complex in nature than the original data and are not necessarily smaller in size than the original data. The structures of the data representations may contain hundreds or thousands of dimensions and be unintelligible to a person, but they must be transformable into lower-dimensional representations for visualization and analysis.

    === Theories of visualization ===

    Theories of visualization include:

      • Jacques Bertin's Semiology of Graphics (1967)
      • Nelson Goodman's Languages of Art (1977)
      • Jock D. Mackinlay's Automated design of optimal visualization (APT) (1986)
      • Leland Wilkinson's Grammar of Graphics (1998)
      • Hadley Wickham's Layered Grammar of Graphics (2010)

    === Visual representations ===

    Visual representations translate data into a visible form that highlights important features, including commonalities and anomalies. These visual representations make it easy for users to perceive salient aspects of their data quickly. Augmenting the cognitive reasoning process with perceptual reasoning through visual representations permits the analytical reasoning process to become faster and more focused.

    == Process ==

    The inputs for the data sets used in the visual analytics process are heterogeneous data sources (e.g., the internet, newspapers, books, scientific experiments, expert systems). From these rich sources, the data sets S = S1, ..., Sm are chosen, where each Si, i ∈ (1, ..., m), consists of attrib

  • Applied Information Science in Economics

    Applied Information Science in Economics

    Applied Information Science in Economics (Russian: Прикладная информатика в Экономике), or Applied Computer Science in Economics, is a professional qualification generally awarded in the Russian Federation. The degree is inherited from the U.S.S.R. education system and is also known as a Specialist degree. It is awarded after five years of full-time study and includes several internships, coursework, and the writing and defense of a thesis. The degree has similarities with the German Magister Artium or Diplom degrees. However, due to the Bologna Process, the number of such degrees is declining. The degree focuses on applying mathematical methods in economics with extensive use of information technology. It is very close to applied mathematics, but also includes a major component of computer science.

    == List of specialty codes in the education system ==

      • 080801 - Applied computer science in economics
      • 351400 - Applied computer science

    == Fields of activity ==

      • Organization and management
      • Project design
      • Experimental research
      • Marketing
      • Consulting
      • Operation and maintenance

    == Major ==

      • Information Science and Programming
      • High Level Methods of Information Science and Programming
      • Information Technologies in Economics
      • Computer Systems, Networks and Telecommunications Services
      • Operational Environments, Systems and Shells
      • Architecture and Design of Information Systems for Companies
      • Databases
      • Information Security
      • Information Management
      • Imitative Simulation

  • Source criticism

    Source criticism

    Source criticism (or information evaluation) is the process of evaluating an information source, i.e., a document, a person, a speech, a fingerprint, a photo, an observation, or anything used in order to obtain knowledge. In relation to a given purpose, a given information source may be more or less valid, reliable or relevant. Broadly, "source criticism" is the interdisciplinary study of how information sources are evaluated for given tasks.

    == Meaning ==

    Problems in translation: The Danish word kildekritik, like the Norwegian word kildekritikk and the Swedish word källkritik, is derived from the German Quellenkritik and is closely associated with the German historian Leopold von Ranke (1795–1886). Historian Wolfgang Hardtwig wrote:

    "His [Ranke's] first work Geschichte der romanischen und germanischen Völker von 1494–1514 (History of the Latin and Teutonic Nations from 1494 to 1514) (1824) was a great success. It already showed some of the basic characteristics of his conception of Europe, and was of historiographical importance particularly because Ranke made an exemplary critical analysis of his sources in a separate volume, Zur Kritik neuerer Geschichtsschreiber (On the Critical Methods of Recent Historians). In this work he raised the method of textual criticism used in the late eighteenth century, particularly in classical philology to the standard method of scientific historical writing." (Hardtwig, 2001, p. 12739)

    Historical theorist Chris Lorenz wrote:

    "The larger part of the nineteenth and twentieth centuries would be dominated by the research-oriented conception of historical method of the so-called Historical School in Germany, led by historians as Leopold Ranke and Berthold Niebuhr. Their conception of history, long been regarded as the beginning of modern, 'scientific' history, harked back to the 'narrow' conception of historical method, limiting the methodical character of history to source criticism."
(Lorenz, 2001)

    In the early 21st century, source criticism is a growing field in, among other fields, library and information science. In this context source criticism is studied from a broader perspective than just, for example, history, classical philology, or biblical studies (but there, too, it has more recently received new attention).

    == Principles ==

    The following principles are from two Scandinavian textbooks on source criticism, written by the historians Olden-Jørgensen (1998) and Thurén (1997):

      • Human sources may be relics (e.g. a fingerprint) or narratives (e.g. a statement or a letter). Relics are more credible sources than narratives.
      • A given source may be forged or corrupted; strong indications of the originality of the source increase its reliability.
      • The closer a source is to the event which it purports to describe, the more one can trust it to give an accurate description of what really happened.
      • A primary source is more reliable than a secondary source, which in turn is more reliable than a tertiary source, and so on.
      • If a number of independent sources contain the same message, the credibility of the message is strongly increased.
      • The tendency of a source is its motivation for providing some kind of bias. Tendencies should be minimized or supplemented with opposite motivations.
      • If it can be demonstrated that the witness (or source) has no direct interest in creating bias, the credibility of the message is increased.

    Two other principles are:

      • Knowledge of source criticism cannot substitute for subject knowledge: "Because each source teaches you more and more about your subject, you will be able to judge with ever-increasing precision the usefulness and value of any prospective source. In other words, the more you know about the subject, the more precisely you can identify what you must still find out" (Bazerman, 1995, p. 304).
      • The reliability of a given source is relative to the questions put to it. "The empirical case study showed that most people find it difficult to assess questions of cognitive authority and media credibility in a general sense, for example, by comparing the overall credibility of newspapers and the Internet. Thus these assessments tend to be situationally sensitive. Newspapers, television and the Internet were frequently used as sources of orienting information, but their credibility varied depending on the actual topic at hand" (Savolainen, 2007).

    The following questions are often good ones to ask about any source, according to the American Library Association (1994) and Engeldinger (1988):

      • How was the source located?
      • What type of source is it?
      • Who is the author and what are the qualifications of the author in regard to the topic that is discussed?
      • When was the information published?
      • In which country was it published?
      • What is the reputation of the publisher?
      • Does the source show a particular cultural or political bias?

    For literary sources, complementary criteria are:

      • Does the source contain a bibliography?
      • Has the material been reviewed by a group of peers, or has it been edited?
      • How does the article/book compare with similar articles/books?

    == Levels of generality ==

    Some principles of source criticism are universal; others are specific to certain kinds of information sources. There is today no consensus about the similarities and differences between source criticism in the natural sciences and in the humanities. Logical positivism claimed that all fields of knowledge were based on the same principles. Much of the criticism of logical positivism claimed that positivism is the basis of the sciences, whereas hermeneutics is the basis of the humanities. This was, for example, the position of Jürgen Habermas. A newer position, in accordance with, among others, Hans-Georg Gadamer and Thomas Kuhn, understands both science and humanities as determined by researchers' preunderstanding and paradigms. Hermeneutics is thus a universal theory.
The difference is, however, that the sources of the humanities are themselves products of human interests and preunderstanding, whereas the sources of the natural sciences are not. Humanities are thus "doubly hermeneutic". Natural scientists, however, also use human products (such as scientific papers), which are products of preunderstanding (and can lead to, for example, academic fraud).

    == Contributing fields ==

    === Epistemology ===

    Epistemological theories are the basic theories about how knowledge is obtained and are thus the most general theories about how to evaluate information sources.

      • Empiricism evaluates sources by considering the observations (or sensations) on which they are based. Sources without basis in experience are not seen as valid.
      • Rationalism gives low priority to sources based on observations. In order to be meaningful, observations must be explained by clear ideas or concepts. From the rationalist point of view, the evaluation of information sources focuses on their logical structure and well-definedness.
      • Historicism evaluates information sources on the basis of their reflection of their sociocultural context and their theoretical development.
      • Pragmatism evaluates sources on the basis of their value and usefulness for accomplishing certain outcomes. Pragmatism is skeptical about claimed neutral information sources.

    The evaluation of knowledge or information sources cannot be more certain than is the construction of knowledge. If one accepts the principle of fallibilism, then one also has to accept that source criticism can never fully verify knowledge claims. As discussed in the next section, source criticism is intimately linked to scientific methods.

    The presence of fallacies of argument in sources is another kind of philosophical criterion for evaluating sources. Fallacies are presented by Walton (1998).
Among the fallacies are the ad hominem fallacy (the use of personal attack to try to undermine or refute a person's argument) and the straw man fallacy (when one arguer misrepresents another's position to make it appear less plausible than it really is, in order more easily to criticize or refute it).

    === Research methodology ===

    Research methods are methods used to produce scholarly knowledge. The methods that are relevant for producing knowledge are also relevant for evaluating knowledge. An example of a book that turns methodology upside-down and uses it to evaluate produced knowledge is Katzer, Cook & Crouch (1998).

    === Science studies ===

    Science studies include studies of quality evaluation processes, such as peer review and book reviews, and of the normative criteria used in evaluating scientific and scholarly research. Another field is the study of scientific misconduct. Harris (1979) provides a case study of how a famous experiment in psychology, Little Albert, has been distorted throughout the history of psychology, starting with the author (Watson) himself and continuing with general textbook authors, behavior therapists, and a prominent learning theorist. Harris proposes possible causes for these distortions and analyzes the Albert study as an ex
