AI Detector Similar To Turnitin

AI Detector Similar To Turnitin — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Conduit (company)

    Conduit (company)

    Conduit Ltd. is an international software company. From its founding in 2005 to 2013, its most well-known product was the Conduit toolbar, which was widely-described as malware. In 2013, it spun off its toolbar business; today, its main product is a mobile development platform that allows users to create native and web mobile applications for smartphones. == Products == From 2005 to 2013, the company's most well-known product was the Conduit toolbar, which is flagged by most antivirus software as potentially unwanted and adware. Conduit's toolbar software is often downloaded by malware packages from other publishers. The company spun off the toolbar division that manages the Conduit toolbar in 2013. Today, the company's main product is a mobile development platform that allows users to create native and web mobile applications for smartphones. App creation for its App Gallery is free, but it charges a monthly subscription fee to place apps on the App Store or Google Play. == History == Conduit was founded in 2005 by Shilo, Dror Erez, and Gaby Bilcyzk. Between years 2005 and 2013, it ran a successful but controversial toolbar platform business. Conduit was part of the so-called Download Valley companies monetizing free software and downloads by bundling adware. The toolbars were criticized by some as being very difficult to uninstall. The toolbar software was referred to as a "potentially unwanted program" by some in the computer industry because it could be used to change browser settings. The company had more than 400 employees in 2013. In September same year, Conduit spun off its entire website toolbar business division, which combined with Perion Network. After the deal, Conduit shareholders owned 81% of Perion's existing shares and both Perion and Conduit remained independent companies. The substantial size of the Conduit user base allowed Perion to immediately surpass AOL in U.S. searches. In 2015, Conduit announced it would purchase Keeprz, a mobile customer loyalty platform, for $45 million.

    Read more →
  • How to Solve it by Computer

    How to Solve it by Computer

    How to Solve it by Computer is a computer science book by R. G. Dromey, first published by Prentice-Hall in 1982. It is occasionally used as a textbook, especially in India. It is an introduction to the whys of algorithms and data structures. Features of the book: The design factors associated with problems, The creative process behind coming up with innovative solutions for algorithms and data structures, The line of reasoning behind the constraints, factors and the design choices made. The very fundamental algorithms portrayed by this book are mostly presented in pseudocode and/or Pascal notation.

    Read more →
  • SQL programming tool

    SQL programming tool

    In the field of software, SQL programming tools provide platforms for database administrators (DBAs) and application developers to perform daily tasks efficiently and accurately. Database administrators and application developers often face constantly changing environments which they rarely completely control. Many changes result from new development projects or from modifications to existing code, which, when deployed to production, do not always produce the expected result. For organizations to better manage development projects and the teams that develop code, suppliers of SQL programming tools normally provide more than facility to the database administrator or application developer to aid in database management and in quality code-deployment practices. == Features == SQL programming tools may include the following features: === SQL editing === SQL editors allow users to edit and execute SQL statements. They may support the following features: cut, copy, paste, undo, redo, find (and replace), bookmarks block indent, print, save file, uppercase/lowercase keyword highlighting auto-completion access to frequently used files output of query result editing query-results committing and rolling-back transactions inside cut paper === Object browsing === Tools may display information about database objects relevant to developers or to database administrators. Users may: view object descriptions view object definitions (DDL) create database objects enable and disable triggers and constraints recompile valid or invalid objects query or edit tables and views Some tools also provide features to display dependencies among objects, and allow users to expand these dependent objects recursively (for example: packages may reference views, views generally reference tables, super/subtypes, and so on). === Session browsing === Database administrators and application developers can use session browsing tools to view the current activities of each user in the database. They can check the resource-usage of individual users, statistics information, locked objects and the current running SQL of each individual session. === User-security management === DBAs can create, edit, delete, disable or enable user-accounts in the database using security-management tools. DBAs can also assign roles, system privileges, object privileges, and storage-quotas to users. === Debugging === Some tools offer features for the debugging of stored procedures: step in, step over, step out, run until exception, breakpoints, view & set variables, view call stack, and so on. Users can debug any program-unit without making any modification to it, including triggers and object types. === Performance monitoring === Monitoring tools may show the database resources — usage summary, service time summary, recent activities, top sessions, session history or top SQL — in easy-to-read graphs. Database administrators can easily monitor the health of various components in the monitoring instance. Application developers may also make use of such tools to diagnose and correct application-performance problems as well as improve SQL server performance. === Test data === Test data generation tools can populate the database by realistic test data for server or client side testing purposes. Also, this kind of software can upload sample blob files to database.

    Read more →
  • National Data Repository

    National Data Repository

    A National Data Repository (NDR) is a data bank that seeks to preserve and promote a country's natural resources data, particularly data related to the petroleum exploration and production (E&P) sector. A National Data Repository is normally established by an entity that governs, controls and supports the exchange, capture, transference and distribution of E&P information, with the final target to provide the State with the tools and information to assure the growth, govern-ability, control, independence and sovereignty of the industry. The two fundamental reasons for a country to establish an NDR are to preserve data generated inside the country by the industry, and to promote investments in the country by utilizing data to reduce the exploration, production, and transportation business risks. Countries take different approaches towards preserving and promoting their natural resources data. The approach varies according to a country's natural resources policies, level of openness, and its attitude towards foreign investment. == Data types == NDRs store a vast array of data related to a country's natural resources. This includes wells, well log data, well reports, core samples, seismic surveys, post-stack seismic, field data/tapes, seismic (acquisition/processing) reports, production data, geological maps and reports, license data and geological models. == Funding models == Some NDRs are financed entirely by a country's government. Others are industry-funded. Still some are hybrid systems, funded in part by industry and government. NDRs typically charge fees for data requests and for data loading. The cost differs significantly between countries. In some cases an annual membership is charged to oil companies to store and access the data in the NDR. == Standards body == Energistics is the global energy standards resource center for the upstream oil and gas industry. Energistics National Data Repository Work Group: The standards body is Energistics. === Energistics-standards-directory === Global regulators of upstream oil and natural gas information, including seismic, drilling, production and reservoir data, formed the National Data Repository (NDR) Work Group in 2008 to collaborate on the development of data management standards and to assist emerging nations with hydrocarbon reserves to better collect, maintain and deliver oil and gas data to the public and to the industry. Ten countries, led by the Netherlands, Norway and the United Kingdom, formed NDR to share best practices and to formalize the development and deployment of data management standards for regulatory agencies. The other countries involved in the NDR Work Group's formation are Australia, Canada, India, Kenya, New Zealand, South Africa and the United States. Annual NDR Conference: Approximately every 18 months Energistics organizes a National Data Repository Conference. The purpose is to provide government and regulatory agencies from around the world an opportunity to attend a series of workshops dedicated to developing data exchange standards, improving communications with the oil and gas industry and learning data management techniques for natural resources information. === Society of Exploration Geophysicists and The International Oil and Gas Producers Association === The SEG is the custodian of the SEG standards which are used for the exchange, retention and release of seismic data. They are commonly used by National Data Repositories with the SEGD and SEGY being the field and processed exchange standards respectively. == NDRs around the world == Click here to see a map of the NDRs around the world

    Read more →
  • Gitter

    Gitter

    Gitter is an open-source instant messaging and chat room system for developers and users of GitLab and GitHub repositories. Gitter is provided as software as a service, with a free option providing all basic features and the ability to create a single private chat room, and paid subscription options for individuals and organisations, which allows them to create arbitrary numbers of private chat rooms. Individual chat rooms can be created for individual Git repositories on GitHub. Chatroom privacy follows the privacy settings of the associated GitHub repository: thus, a chatroom for a private (i.e. members-only) GitHub repository is also private to those with access to the repository. A graphical badge linking to the chat room can then be placed in the git repository's README file, bringing it to the attention of all users and developers of the project. Users can chat in the chat rooms, or access private chat rooms for repositories they have access to, by logging into Gitter via GitHub. Gitter is similar to Slack. Like Slack, it automatically logs all messages in the cloud. In late 2020, New Vector Limited acquired Gitter from GitLab, and announced Gitter's features would eventually be moved to New Vector's flagship product, Element, thereby replacing Gitter entirely. On February 13, 2023, Gitter migrated their service to a custom-branded Matrix instance that uses Element for its web interface. == Features prior to Migration to Matrix == Gitter supports: Notifications, which are batched up on mobile devices to avoid annoyance Inline media files Viewing and subscribing to ("starring") multiple chat rooms in one web browser tab Linking to individual files in the linked git repository Linking to GitHub issues (by typing # and then the issue number) in the linked Git repository, with hovercards showing the details of the issue GitHub-flavored Markdown in chat messages Online status for users User hovercards, based on their GitHub profiles and statistics (number of GitHub followers, etc.) Browsable and searchable message archives, grouped by month Connection from IRC clients Gitter on iOS support authentication using GitHub or Twitter === Integrations with non-GitHub sites and applications === Gitter integrates with Trello, Jenkins, Travis CI, Drone (software), Heroku, and Bitbucket, among others. === Apps === Official Gitter apps for Windows, Mac, Linux, iOS and Android are available. === Account registration === Like other chat technologies, Gitter allows clients to instant message each other. It allows people to authenticate using a GitHub account and join a chatroom from a web browser, thus not requiring one to install any software, or create additional online accounts. == History == Gitter was created by some developers who were initially trying to create a generic web-based chat product, but then wrote extra code to hook their chat application up to GitHub to meet their own needs, and realised that they could turn the combined product into a viable specialist product in its own right. Gitter came out of beta in 2014. During the beta period, Gitter delivered 1.8 million chat messages. On March 15, 2017, GitLab announced the acquisition of Gitter. Included in the announcement was the stated intent that Gitter would continue as a standalone project. It was published as open source under an MIT License as of June 2017. On September 30, 2020, New Vector Limited acquired Gitter from GitLab, and announced upcoming support for the Matrix protocol in Gitter, which went live by the end of the year. Gitter's features would eventually be moved to New Vector's flagship product, Element, thereby replacing Gitter entirely. On February 13, 2023, Gitter migrated their service to a custom-branded Matrix instance that uses Element for its web interface. == Implementation prior to Migration to Matrix == The Gitter web application is implemented entirely in JavaScript, with the back end being implemented on Node.js. The source code to the web application was formerly proprietary (it was open-sourced in June 2017), although Gitter had made numerous auxiliary projects available as open-source software, such as an IRC bridge for IRC users who prefer using IRC client applications (and their extra features) to converse in the Gitter chat rooms.

    Read more →
  • The Algorithm Auction

    The Algorithm Auction

    The Algorithm Auction is the world's first auction of computer algorithms. Created by Ruse Laboratories, the initial auction featured seven lots and was held at the Cooper Hewitt, Smithsonian Design Museum on March 27, 2015. Five lots were physical representations of famous code or algorithms, including a signed, handwritten copy of the original Hello, World! C program by its creator Brian Kernighan on dot-matrix printer paper, a printed copy of 5,000 lines of Assembly code comprising the earliest known version of Turtle Graphics, signed by its creator Hal Abelson, a necktie containing the six-line qrpff algorithm capable of decrypting content on a commercially produced DVD video disc, and a pair of drawings representing OkCupid's original Compatibility Calculation algorithm, signed by the company founders. The qrpff lot sold for $2,500. Two other lots were “living algorithms,” including a set of JavaScript tools for building applications that are accessible to the visually impaired and the other is for a program that converts lines of software code into music. Winning bidders received, along with artifacts related to the algorithms, a full intellectual property license to use, modify, or open-source the code. All lots were sold, with Hello World receiving the most bids. Exhibited alongside the auction lots were a facsimile of the Plimpton 322 tablet on loan from Columbia University, and Nigella, an art-world facing computer virus named after Nigella Lawson and created by cypherpunk and hacktivist Richard Jones. Sebastian Chan, Director of Digital & Emerging Media at the Cooper–Hewitt, attended the event remotely from Milan, Italy via a Beam Pro telepresence robot. == Effects == Following the auction, the Museum of Modern Art held a salon titled The Way of the Algorithm highlighting algorithms as "a ubiquitous and indispensable component of our lives."

    Read more →
  • Artificial intelligence industry in China

    Artificial intelligence industry in China

    The roots of the development of artificial intelligence in the People's Republic of China started in the late 1970s following Deng Xiaoping's reform and opening up emphasizing science and technology as the country's primary productive force. The initial stages of China's AI development were slow and encountered significant challenges due to lack of resources and talent. At the beginning China was behind most Western countries in terms of AI development. A majority of the research was led by scientists who had received higher education abroad. Since 2006, the Chinese government has steadily developed a national agenda for artificial intelligence development and emerged as one of the leading nations in artificial intelligence research and development. In 2016, the Chinese Communist Party (CCP) released its 13th Five-Year Plan in which it aimed to become a global AI leader by 2030. As of 2025, China is considered to be a world leader in AI technology along with the United States. The State Council has a list of "national AI teams" including fifteen China-based companies, including Baidu, Tencent, Alibaba, SenseTime, and iFlytek. Each company should lead the development of a designated specialized AI sector in China, such as facial recognition, software/hardware, and speech recognition. China's rapid AI development has significantly impacted Chinese society in many areas, including the socio-economic, military, intelligence, and political spheres. Agriculture, transportation, accommodation and food services, and manufacturing are the top industries that would be the most impacted by further AI deployment. The private sector, university laboratories, and the military are working collaboratively in many aspects as there are few current existing boundaries. In 2021, China published the Data Security Law of the People's Republic of China, its first national law addressing AI-related ethical concerns. In October 2022, the United States federal government announced a series of export controls and trade restrictions intended to restrict China's access to advanced computer chips for AI applications. In 2023, the Cyberspace Administration of China issued guidelines requiring that AI content upholds the ideology of the CCP including Core Socialist Values, avoids discrimination, respects intellectual property rights, and safeguards user data. In 2025, the Chinese government issued a document regarding training data, requiring companies to use as little as data deemed "unsafe" as possible, as well as requiring companies to test models regularly. Concerns have been raised about the effects of the Chinese government's censorship regime on the development of generative artificial intelligence and long-term talent acquisition with state of the country's demographics. Others have noted that official notions of AI safety require following the priorities of the CCP and are antithetical to standards in democratic societies and raised concerns about the extension of China's system of mass surveillance and censorship abroad. == History == The Chinese term for artificial intelligence (réngōngzhìnéng 人工智能) connotes "humanmade" intelligence. The term developed as mid-20th century localisation of the Japanese term jinko chino. The research and development of artificial intelligence in China started in the 1980s, with the announcement by Deng Xiaoping of the importance of science and technology for China's economic growth. === Late 1970s to early 2010s === Chinese artificial intelligence research and development began in late 1970s after Deng Xiaoping's reform and opening up. China's first national conference on AI occurred in 1979. Academic journals in the late 1970s began publishing literature reviews of Western research on AI topics. In the 1980s, a group of Chinese scientists launched AI research led by Qian Xuesen and Wu Wenjun. However, during the time, China's society still had a generally conservative view towards AI. In the early 1980s, Science Press published translated versions of Western textbooks such as Patrick Winston's Artificial Intelligence and Nils John Nilsson's Principles of Artificial Intelligence. In 1980, a journal of the Chinese Academy of Sciences convened its first annual National Symposium on Artificial Intelligence, which included national and international scholars like Herbert A. Simon. The Chinese Association for Artificial Intelligence (CAAI) was founded in September 1981 and was authorized by the Ministry of Civil Affairs. CAAI has continued to be the largest AI association in China as of 2025. In 1982, CAAI began publishing the Artificial Intelligence Journal, which published early AI research by Chinese academics. In the 1980s, Chinese research on AI was influenced by the field of cybernetics, particularly the work of Norbert Weiner and his text Cybernetics: Or Control and Communication in the Animal and the Machine. Chinese researchers at the time sought to situate AI as part of a broader "Intelligence Science" field which would include disciplines like mathematics, computer science, cognitive science, social sciences, and philosophy. In 1987, Tsinghua University began a research publication on AI. Beginning in 1993, smart automation and intelligence have been part of China's national technology plan. Since the 2000s, the Chinese government has further expanded its research and development funds for AI and the number of government-sponsored research projects has dramatically increased. In 2006, China announced a policy priority for the development of artificial intelligence, which was included in the National Medium and Long Term Plan for the Development of Science and Technology (2006–2020), released by the State Council. In the same year, artificial intelligence was also mentioned in the 11th Five-Year Plan. In 2011, the Association for the Advancement of Artificial Intelligence (AAAI) established a branch in Beijing, China. At same year, the Wu Wenjun Artificial Intelligence Science and Technology Award was founded in honor of Chinese mathematician Wu Wenjun, and it became the highest award for Chinese achievements in the field of artificial intelligence. The first award ceremony was held on May 14, 2012. In 2013, the International Joint Conferences on Artificial Intelligence (IJCAI) was held in Beijing, marking the first time the conference was held in China. This event coincided with the Chinese government's announcement of the "Chinese Intelligence Year," a significant milestone in China's development of artificial intelligence. === Late 2010s to early 2020s === AI became a major issue of commercial, public, and political focus in China in the latter half of the 2010s. Various interpretations of the primary cause for this increased focus exist, with some analyses focusing on the 2016 Go match between Google's AlphaGo and Lee Sedol, others emphasising the U.S. increasing trade restrictions on China's technology industries and the desire to achieve national technological self-sufficiency. The State Council of China issued "A Next Generation Artificial Intelligence Development Plan" (State Council Document [2017] No. 35) on 20 July 2017. In the document, the CCP Central Committee and the State Council urged governing bodies in China to promote the development of artificial intelligence. Specifically, the plan described AI as a strategic technology that has become a "focus of international competition".:2 The document urged significant investment in a number of strategic areas related to AI and called for close cooperation between the state and private sectors. It set the goal of China becoming the preeminent country for AI research and application by 2030. During the general secretaryship of Xi Jinping, artificial intelligence has been a focus of the CCP's military-civil fusion efforts. On the occasion of Xi's speech at the first plenary meeting of the Central Military-Civil Fusion Development Committee (CMCFDC), scholars from the National Defense University wrote in the PLA Daily that the "transferability of social resources" between economic and military ends is an essential component to being a great power. During the Two Sessions 2017,"artificial intelligence plus" was proposed to be elevated to a strategic level. The same year witnessed the emergence of multiple application-level usages in the medical field according to reports. In 2018, Xinhua News Agency, in partnership with Tencent's subsidiary Sogou, launched its first artificial intelligence-generated news anchor. In 2018, the State Council budgeted $2.1 billion for an AI industrial park in Mentougou district. In order to achieve this the State Council stated the need for massive talent acquisition, theoretical and practical developments, as well as public and private investments. Some of the stated motivations that the State Council gave for pursuing its AI strategy include the potential of artificial intelligence for industrial transformation, better social

    Read more →
  • Collaborative diffusion

    Collaborative diffusion

    Collaborative Diffusion is a type of pathfinding algorithm which uses the concept of antiobjects, objects within a computer program that function opposite to what would be conventionally expected. Collaborative Diffusion is typically used in video games, when multiple agents must path towards a single target agent. For example, the ghosts in Pac-Man. In this case, the background tiles serve as antiobjects, carrying out the necessary calculations for creating a path and having the foreground objects react accordingly, whereas having foreground objects be responsible for their own pathing would be conventionally expected. Collaborative Diffusion is favored for its efficiency over other pathfinding algorithms, such as A, when handling multiple agents. Also, this method allows elements of competition and teamwork to easily be incorporated between tracking agents. Notably, the time taken to calculate paths remains constant as the number of agents increases.

    Read more →
  • Meta-learning (computer science)

    Meta-learning (computer science)

    Meta-learning is a subfield of machine learning where automatic learning algorithms are applied to metadata about machine learning experiments. As of 2017, the term had not found a standard interpretation, however the main goal is to use such metadata to understand how automatic learning can become flexible in solving learning problems, hence to improve the performance of existing learning algorithms or to learn (induce) the learning algorithm itself, hence the alternative term learning to learn. Flexibility is important because each learning algorithm is based on a set of assumptions about the data, its inductive bias. This means that it will only learn well if the bias matches the learning problem. A learning algorithm may perform very well in one domain, but not on the next. This poses strong restrictions on the use of machine learning or data mining techniques, since the relationship between the learning problem (often some kind of database) and the effectiveness of different learning algorithms is not yet understood. By using different kinds of metadata, like properties of the learning problem, algorithm properties (like performance measures), or patterns previously derived from the data, it is possible to learn, select, alter or combine different learning algorithms to effectively solve a given learning problem. Critiques of meta-learning approaches bear a strong resemblance to the critique of metaheuristic, a possibly related problem. A good analogy to meta-learning, and the inspiration for Jürgen Schmidhuber's early work (1987) and Yoshua Bengio et al.'s work (1991), considers that genetic evolution learns the learning procedure encoded in genes and executed in each individual's brain. In an open-ended hierarchical meta-learning system using genetic programming, better evolutionary methods can be learned by meta evolution, which itself can be improved by meta meta evolution, etc. == Definition == A proposed definition for a meta-learning system combines three requirements: The system must include a learning subsystem. Experience is gained by exploiting meta knowledge extracted in a previous learning episode on a single dataset, or from different domains. Learning bias must be chosen dynamically. Bias refers to the assumptions that influence the choice of explanatory hypotheses and not the notion of bias represented in the bias-variance dilemma. Meta-learning is concerned with two aspects of learning bias. Declarative bias specifies the representation of the space of hypotheses, and affects the size of the search space (e.g., represent hypotheses using linear functions only). Procedural bias imposes constraints on the ordering of the inductive hypotheses (e.g., preferring smaller hypotheses). == Common approaches == There are three common approaches: using (cyclic) networks with external or internal memory (model-based) learning effective distance metrics (metrics-based) explicitly optimizing model parameters for fast learning (optimization-based). === Model-Based === Model-based meta-learning models updates its parameters rapidly with a few training steps, which can be achieved by its internal architecture or controlled by another meta-learner model. ==== Memory-Augmented Neural Networks ==== A Memory-Augmented Neural Network, or MANN for short, is claimed to be able to encode new information quickly and thus to adapt to new tasks after only a few examples. ==== Meta Networks ==== Meta Networks (MetaNet) learns a meta-level knowledge across tasks and shifts its inductive biases via fast parameterization for rapid generalization. === Metric-Based === The core idea in metric-based meta-learning is similar to nearest neighbors algorithms, which weight is generated by a kernel function. It aims to learn a metric or distance function over objects. The notion of a good metric is problem-dependent. It should represent the relationship between inputs in the task space and facilitate problem solving. ==== Convolutional Siamese Neural Network ==== Siamese neural network is composed of two twin networks whose output is jointly trained. There is a function above to learn the relationship between input data sample pairs. The two networks are the same, sharing the same weight and network parameters. ==== Matching Networks ==== Matching Networks learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types. ==== Relation Network ==== The Relation Network (RN), is trained end-to-end from scratch. During meta-learning, it learns to learn a deep distance metric to compare a small number of images within episodes, each of which is designed to simulate the few-shot setting. ==== Prototypical Networks ==== Prototypical Networks learn a metric space in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for few-shot learning, they reflect a simpler inductive bias that is beneficial in this limited-data regime, and achieve satisfied results. === Optimization-Based === What optimization-based meta-learning algorithms intend for is to adjust the optimization algorithm so that the model can be good at learning with a few examples. ==== LSTM Meta-Learner ==== LSTM-based meta-learner is to learn the exact optimization algorithm used to train another learner neural network classifier in the few-shot regime. The parametrization allows it to learn appropriate parameter updates specifically for the scenario where a set amount of updates will be made, while also learning a general initialization of the learner (classifier) network that allows for quick convergence of training. ==== Temporal Discreteness ==== Model-Agnostic Meta-Learning (MAML) is a fairly general optimization algorithm, compatible with any model that learns through gradient descent. ==== Reptile ==== Reptile is a remarkably simple meta-learning optimization algorithm, given that both of its components rely on meta-optimization through gradient descent and both are model-agnostic. == Examples == Some approaches which have been viewed as instances of meta-learning: Recurrent neural networks (RNNs) are universal computers. In 1993, Jürgen Schmidhuber showed how "self-referential" RNNs can in principle learn by backpropagation to run their own weight change algorithm, which may be quite different from backpropagation. In 2001, Sepp Hochreiter & A.S. Younger & P.R. Conwell built a successful supervised meta-learner based on Long short-term memory RNNs. It learned through backpropagation a learning algorithm for quadratic functions that is much faster than backpropagation. Researchers at Deepmind (Marcin Andrychowicz et al.) extended this approach to optimization in 2017. In the 1990s, Meta Reinforcement Learning or Meta RL was achieved in Schmidhuber's research group through self-modifying policies written in a universal programming language that contains special instructions for changing the policy itself. There is a single lifelong trial. The goal of the RL agent is to maximize reward. It learns to accelerate reward intake by continually improving its own learning algorithm which is part of the "self-referential" policy. An extreme type of Meta Reinforcement Learning is embodied by the Gödel machine, a theoretical construct which can inspect and modify any part of its own software which also contains a general theorem prover. It can achieve recursive self-improvement in a provably optimal way. Model-Agnostic Meta-Learning (MAML) was introduced in 2017 by Chelsea Finn et al. Given a sequence of tasks, the parameters of a given model are trained such that few iterations of gradient descent with few training data from a new task will lead to good generalization performance on that task. MAML "trains the model to be easy to fine-tune." MAML was successfully applied to few-shot image classification benchmarks and to policy-gradient-based reinforcement learning. Variational Bayes-Adaptive Deep RL (VariBAD) was introduced in 2019. While MAML is optimization-based, VariBAD is a model-based method for meta reinforcement learning, and leverages a variational autoencoder to capture the task information in an internal memory, thus conditioning its decision making on the task. When addressing a set of tasks, most meta learning approaches optimize the average score across all tasks. Hence, certain tasks may be sacrificed in favor of the average score, which is often unacceptable in real-world applications. By contrast, Robust Meta Reinforcement Learning (RoML) focuses on improving low-score tasks, increasing robustness to the selection of task. RoML works as a meta-algorithm, as it can be applied on top of other meta learning algorithms (such as MAML and VariBAD) to increase their robustness. It is applicable to both supervised meta learning and meta reinforcement learning. Discovering meta-knowledge works by inducing knowledge

    Read more →
  • Exploratory search

    Exploratory search

    Exploratory search is a specialization of information exploration which represents the activities carried out by searchers who are: unfamiliar with the domain of their goal (i.e. need to learn about the topic in order to understand how to achieve their goal) or unsure about the ways to achieve their goals (either the technology or the process) or unsure about their goals in the first place. Exploratory search is distinguished from known-item search, for which the searcher has a particular target in mind. Consequently, exploratory search covers a broader class of activities than typical information retrieval, such as investigating, evaluating, comparing, and synthesizing, where new information is sought in a defined conceptual area; exploratory data analysis is another example of an information exploration activity. Typically, therefore, such users generally combine querying and browsing strategies to foster learning and investigation. == History == Exploratory search is a topic that has grown from the fields of information retrieval and information seeking but has become more concerned with alternatives to the kind of search that has received the majority of focus (returning the most relevant documents to a Google-like keyword search). The research is motivated by questions like "What if the user doesn't know which keywords to use?" or "What if the user isn't looking for a single answer?" Consequently, research has begun to focus on defining the broader set of information behaviors in order to learn about the situations when a user is, or feels, limited by only having the ability to perform a keyword search. In the last few years, a series of workshops has been held at various related and key events. In 2005, the Exploratory Search Interfaces workshop focused on beginning to define some of the key challenges in the field. Since then a series of other workshops has been held at related conferences: Evaluating Exploratory Search at SIGIR06 and Exploratory Search and HCI at CHI07 (in order to meet with the experts in human–computer interaction). In March 2008, an Information Processing and Management special issue focused particularly on the challenges of evaluating exploratory search, given the reduced assumptions that can be made about scenarios of use. In June 2008, the National Science Foundation sponsored an invitational workshop to identify a research agenda for exploratory search and similar fields for the coming years. == Research challenges == === Important scenarios === With the majority of research in the information retrieval community focusing on typical keyword search scenarios, one challenge for exploratory search is to further understand the scenarios of use for when keyword search is not sufficient. An example scenario, often used to motivate the research by mSpace, states: if a user does not know much about classical music, how should they even begin to find a piece that they might like. Similarly, for patients or their carers, if they don't know the right keywords for their health problems, how can they effectively find useful health information for themselves? === Designing new interfaces === With one of the motivations being to support users when keyword search is not enough, some research has focused on identifying alternative user interfaces and interaction models that support the user in different ways. An example is faceted search which presents diverse category-style options to the users, so that they can choose from a list instead of guess a possible keyword query. Many of the interactive forms of search, including faceted browsers, are being considered for their support of exploratory search conditions. Computational cognitive models of exploratory search have been developed to capture the cognitive complexities involved in exploratory search. Model-based dynamic presentation of information cues are proposed to facilitate exploratory search performance. === Evaluating interfaces === As the tasks and goals involved with exploratory search are largely undefined or unpredictable, it is very hard to evaluate systems with the measures often used in information retrieval. Accuracy was typically used to show that a user had found a correct answer, but when the user is trying to summarize a domain of information, the correct answer is near impossible to identify, if not entirely subjective (for example: possible hotels to stay in Paris). In exploration, it is also arguable that spending more time (where time efficiency is typically desirable) researching a topic shows that a system provides increased support for investigation. Finally, and perhaps most importantly, giving study participants a well specified task could immediately prevent them from exhibiting exploratory behavior. === Models of exploratory search behavior === There have been recent attempts to develop a process model of exploratory search behavior, especially in social information system (e.g., see models of collaborative tagging. The process model assumes that user-generated information cues, such as social tags, can act as navigational cues that facilitate exploration of information that others have found and shared with other users on a social information system (such as social bookmarking system). These models provided extension to existing process model of information search that characterizes information-seeking behavior in traditional fact-retrievals using search engines. Recent development in exploratory search is often concentrated in predicting users' search intents in interaction with the user. Such predictive user modeling, also referred as intent modeling, can help users to get accustomed to a body of domain knowledge and help users to make sense of the potential directions to be explored around their initial, often vague, expression of information needs. == Major figures == Key figures, including experts from both information seeking and human–computer interaction, are: Marcia Bates Nicholas Belkin Gary Marchionini m.c. schraefel Ryen W. White

    Read more →
  • Organizational information theory

    Organizational information theory

    Organizational Information Theory (OIT) is a communication theory, developed by Karl Weick, offering systemic insight into the processing and exchange of information within organizations and among its members. Unlike the past structure-centered theory, OIT focuses on the process of organizing in dynamic, information-rich environments. Given that, it contends that the main activity of organizations is the process of making sense of equivocal information. Organizational members are instrumental to reduce equivocality and achieve sensemaking through some strategies — enactment, selection, and retention of information. With a framework that is interdisciplinary in nature, organizational information theory's desire to eliminate both ambiguity and complexity from workplace messaging builds upon earlier findings from general systems theory and phenomenology. == Inspiration and influence of pre-existing theories == 1. General Systems Theory The General Systems Theory, on its most basic premise, describes the phenomenon of a cohesive group of interrelated parts. When one part of the system is changed or affected, it will affect the system as a whole. Weick uses this theoretical framework from 1950 to influence his organizational information theory. Likewise, organizations can be viewed as a system of related parts that work together towards a common goal or vision. Applying this to Weick's organizational information theory, organizations must work to reduce ambiguity and complexity in the workplace to maximize cohesiveness and efficiency. Weick uses the term, coupling, to describe how organizations, like a system, can be composed of interrelated and dependent parts. Coupling looks at the relationship between people and work. There are two types of coupling: 1. Loose coupling Loose coupling describes that while people within the organization or system are connected and often work together, they do not depend on one another to continue or fully complete individual work. The dependencies are weak and workflow is flexible. For example, "if the whole Science department completely shuts down because all of teachers are sick or for whatsoever reason, the school can still continue to operate because other departments are still present." 2. Tight coupling Tight coupling describes when connections within an organization are strong and dependent. If one part of the organization is not operating correctly, the organization as a whole cannot continue to their fullest potential. " For instance, the format and ink section completely shuts down hence the succeeding steps cannot be continued, so the whole process of the organization will be dropped. Thus, components of a system are directly dependent on one another." 2. Theory of evolution The theory of evolution, by Charles Darwin, is a framework for survival of the fittest. According to Darwin, organisms attempt to adapt and live in an unforgiving environment. Those that are unsuccessful in adaptation do not survive, while the strong organisms continue to thrive and reproduce. Weick invokes inspiration from Darwin, to incorporate a biological perspective to his theory. It is natural for organizations to have to adapt to incoming information that often interfere with the preexisting environment. Organizations that are able to plan and alter strategies in accordance with their constant need of organizing and sense making, will survive and be the most successful. However, there is a notable difference between animal evolution and survival of the fittest in organizations, "A given animal is what it is; variation comes through mutation. But the nature of an organization can change when its members alter their behavior." == Assumptions == 1. Human organizations exist in an information environment Unlike senders and receivers models, OIT stands on the situational perspective. Karl Weick views a human organization as an open social system. People in that system develop a mechanism to establish goals, obtain and process information, or perceive the environment. In this process, people and the environment come to conclusions on "what's going on here?". Colville believes that this attributional process is retrospective. Take an education institution as an example. A university can obtain information regarding students' needs in numerous ways. It might create feedback section in its website. It could organize alumni panels or academic affairs to attract prospective students and collect concrete questions they are interested in. It may also conduct the survey or host focus group to get the information. After that, the staff of the university have to decide how to deal with these information, based on which, it has to set and accomplish its goals for current and prospective students. 2. The information an organization receives differs in terms of equivocality Weick posits that numerous feasible interpretations of reality exist when organizations process information. Their varying levels of understandability lead to different outcomes of information inputs. In other academic works, scholars tend to say that messages are uncertain or ambiguous. While according to OIT, messages are described to be equivocal. believes that people proactively exclude a number of possibilities to perceive what is going on in the environment. Due to OIT's situational perspective, the meanings of messages consist of the messages, the interpretations of receivers, and the interactional context. However, ambiguity and uncertainty can mean that a standard answer - the only one true objective interpretation - exists. Also, Weick emphasizes that "the equivocality is the engine that motivates people to organize". Maitlis and Christianson states that the equivocality trigger sensemaking for three reasons: environment jolts and organizational crises, threats to identity, and planned change interventions. 3. Human organizations engage in information processing to reduce equivocality of information Based upon the first two assumption, OIT proposes that information processing within organizations is a social activity. Sharing is the key feature of organizational information processing. In that particular context, members jointly make sense the reality by reducing equivocality. It other words, the sensemaking is a joint responsibility which includes numerous interdependent people to accomplish. In this process, organizations and its members combine actions and attributions together in order to find the balance between the complexity of thoughts and the simplicity of actions. Weick also proposes that people create their own environment though enactment, which is the action of making sense. This is because people have different perceptual schemas and selective perception, so people create different information environments. In creating different information environments, people can arrive at the same or close to the same understanding or solution through different thought processes and overall understanding. == Key concepts == === The organization === In order to place Weick's vision regarding Organizational Information Theory into proper working context, exploring his view regarding what constitutes the organization and how its individuals embody that construct might yield significant insights. From a fundamental standpoint, he shared a belief that organizational validation is derived---not through bricks and mortar, or locale—but from a series of events which enable entities to "collect, manage and use the information they receive." In elaborating further on what constitutes an organization during early writings outlining OIT, Weick said, "The word organization is a noun and it is also a myth. if one looks for an organization, one will not find it. What will be found is that there are events linked together, that transpire within concrete walls and these sequences, their pathways, their timing, are the forms we erroneously make into substances when we talk about an organization". When viewed in this modular fashion, the organization meets Weick's theoretical vision by encompassing parameters that are less bound by concrete, wood, and structural restraints and more by an ability to serve as a repository where information can be consistently and effectively channeled. Taking these defining characteristics into account, proper channel execution relies on maximization of messaging clarity, context, delivery and evolution through any system. One example as to how these interactions might unfold on a more granular level within these confines can be gleaned through Weick's double interact loop, which he considers the "building blocks of every organization". Simply put, double interacts describe interpersonal exchanges that, inherently, occur across the organizational chain of command and in life, itself. Thus: "An act occurs when you say something (Can I have a Popsicle?). An interact occurs when you say something and I respond ("No, it will spoil your dinner

    Read more →
  • Leiden algorithm

    Leiden algorithm

    The Leiden algorithm is a community detection algorithm developed by Traag et al at Leiden University. It was developed as a modification of the Louvain method. Like the Louvain method, the Leiden algorithm attempts to optimize modularity in extracting communities from networks; however, it addresses key issues present in the Louvain method, namely poorly connected communities and the resolution limit of modularity. == Improvement over Louvain method == Broadly, the Leiden algorithm uses the same two primary phases as the Louvain algorithm: a local node moving step (though, the method by which nodes are considered in Leiden is more efficient) and a graph aggregation step. However, to address the issues with poorly-connected communities and the merging of smaller communities into larger communities (the resolution limit of modularity), the Leiden algorithm employs an intermediate refinement phase in which communities may be split to guarantee that all communities are well-connected. Consider, for example, the following graph: Three communities are present in this graph (each color represents a community). Additionally, the center "bridge" node (represented with an extra circle) is a member of the community represented by blue nodes. Now consider the result of a node-moving step which merges the communities denoted by red and green nodes into a single community (as the two communities are highly connected): Notably, the center "bridge" node is now a member of the larger red community after node moving occurs (due to the greedy nature of the local node moving algorithm). In the Louvain method, such a merging would be followed immediately by the graph aggregation phase. However, this causes a disconnection between two different sections of the community represented by blue nodes. In the Leiden algorithm, the graph is instead refined: The Leiden algorithm's refinement step ensures that the center "bridge" node is kept in the blue community to ensure that it remains intact and connected, despite the potential improvement in modularity from adding the center "bridge" node to the red community. == Graph components == Before defining the Leiden algorithm, it will be helpful to define some of the components of a graph. === Vertices and edges === A graph is composed of vertices (nodes) and edges. Each edge is connected to two vertices, and each vertex may be connected to zero or more edges. Edges are typically represented by straight lines, while nodes are represented by circles or points. In set notation, let V {\displaystyle V} be the set of vertices, and E {\displaystyle E} be the set of edges: V := { v 1 , v 2 , … , v n } E := { e i j , e i k , … , e k l } {\displaystyle {\begin{aligned}V&:=\{v_{1},v_{2},\dots ,v_{n}\}\\E&:=\{e_{ij},e_{ik},\dots ,e_{kl}\}\end{aligned}}} where e i j {\displaystyle e_{ij}} is the directed edge from vertex v i {\displaystyle v_{i}} to vertex v j {\displaystyle v_{j}} . We can also write this as an ordered pair: e i j := ( v i , v j ) {\displaystyle {\begin{aligned}e_{ij}&:=(v_{i},v_{j})\end{aligned}}} === Community === A community is a unique set of nodes: C i ⊆ V C i ⋂ C j = ∅ ∀ i ≠ j {\displaystyle {\begin{aligned}C_{i}&\subseteq V\\C_{i}&\bigcap C_{j}=\emptyset ~\forall ~i\neq j\end{aligned}}} and the union of all communities must be the total set of vertices: V = ⋃ i = 1 C i {\displaystyle {\begin{aligned}V&=\bigcup _{i=1}C_{i}\end{aligned}}} === Partition === A partition is the set of all communities: P = { C 1 , C 2 , … , C n } {\displaystyle {\begin{aligned}{\mathcal {P}}&=\{C_{1},C_{2},\dots ,C_{n}\}\end{aligned}}} == Partition quality == How communities are partitioned is an integral part on the Leiden algorithm. How partitions are decided can depend on how their quality is measured. Additionally, many of these metrics contain parameters of their own that can change the outcome of their communities. === Modularity === Modularity is a highly used quality metric for assessing how well a set of communities partition a graph. The equation for this metric is defined for an adjacency matrix, A, as: Q = 1 2 m ∑ i j ( A i j − k i k j 2 m ) δ ( c i , c j ) {\displaystyle Q={\frac {1}{2m}}\sum _{ij}(A_{ij}-{\frac {k_{i}k_{j}}{2m}})\delta (c_{i},c_{j})} where: A i j {\displaystyle A_{ij}} represents the edge weight between nodes i {\displaystyle i} and j {\displaystyle j} ; see Adjacency matrix; k i {\displaystyle k_{i}} and k j {\displaystyle k_{j}} are the sum of the weights of the edges attached to nodes i {\displaystyle i} and j {\displaystyle j} , respectively; m {\displaystyle m} is the sum of all of the edge weights in the graph; c i {\displaystyle c_{i}} and c j {\displaystyle c_{j}} are the communities to which the nodes i {\displaystyle i} and j {\displaystyle j} belong; and δ {\displaystyle \delta } is Kronecker delta function: δ ( c i , c j ) = { 1 if c i and c j are the same community 0 otherwise {\displaystyle {\begin{aligned}\delta (c_{i},c_{j})&={\begin{cases}1&{\text{if }}c_{i}{\text{ and }}c_{j}{\text{ are the same community}}\\0&{\text{otherwise}}\end{cases}}\end{aligned}}} === Reichardt Bornholdt Potts Model (RB) === One of the most well used metrics for the Leiden algorithm is the Reichardt Bornholdt Potts Model (RB). This model is used by default in most mainstream Leiden algorithm libraries under the name RBConfigurationVertexPartition. This model introduces a resolution parameter γ {\displaystyle \gamma } and is highly similar to the equation for modularity. This model is defined by the following quality function for an adjacency matrix, A, as: Q = ∑ i j ( A i j − γ k i k j 2 m ) δ ( c i , c j ) {\displaystyle Q=\sum _{ij}(A_{ij}-\gamma {\frac {k_{i}k_{j}}{2m}})\delta (c_{i},c_{j})} where: γ {\displaystyle \gamma } represents a linear resolution parameter === Constant Potts Model (CPM) === Another metric similar to RB is the Constant Potts Model (CPM). This metric also relies on a resolution parameter γ {\displaystyle \gamma } The quality function is defined as: H = − ∑ i j ( A i j w i j − γ ) δ ( c i , c j ) {\displaystyle H=-\sum _{ij}(A_{ij}w_{ij}-\gamma )\delta (c_{i},c_{j})} === Understanding Potts Model resolution parameters/Resolution limit === Typically Potts models such as RB or CPM include a resolution parameter in their calculation. Potts models are introduced as a response to the resolution limit problem that is present in modularity maximization based community detection. The resolution limit problem is that, for some graphs, maximizing modularity may cause substructures of a graph to merge and become a single community and thus smaller structures are lost. These resolution parameters allow modularity adjacent methods to be modified to suit the requirements of the user applying the Leiden algorithm to account for small substructures at a certain granularity. The figure on the right illustrates why resolution can be a helpful parameter when using modularity based quality metrics. In the first graph, modularity only captures the large scale structures of the graph; however, in the second example, a more granular quality metric could potentially detect all substructures in a graph. == Algorithm == The Leiden algorithm starts with a graph of disorganized nodes (a) and sorts it by partitioning them to maximize modularity (the difference in quality between the generated partition and a hypothetical randomized partition of communities). The method it uses is similar to the Louvain algorithm, except that after moving each node it also considers that node's neighbors that are not already in the community it was placed in. This process results in our first partition (b), also referred to as P {\displaystyle {\mathcal {P}}} . Then the algorithm refines this partition by first placing each node into its own individual community and then moving them from one community to another to maximize modularity. It does this iteratively until each node has been visited and moved, and each community has been refined - this creates partition (c), which is the initial partition of P refined {\displaystyle {\mathcal {P}}_{\text{refined}}} . Then an aggregate network (d) is created by turning each community into a node. P refined {\displaystyle {\mathcal {P}}_{\text{refined}}} is used as the basis for the aggregate network while P {\displaystyle {\mathcal {P}}} is used to create its initial partition. Because we use the original partition P {\displaystyle {\mathcal {P}}} in this step, we must retain it so that it can be used in future iterations. These steps together form the first iteration of the algorithm. In subsequent iterations, the nodes of the aggregate network (which each represent a community) are once again placed into their own individual communities and then sorted according to modularity to form a new P refined {\displaystyle {\mathcal {P}}_{\text{refined}}} , forming (e) in the above graphic. In the case depicted by the graph, the nodes were already sorted optimally, so no change too

    Read more →
  • XLNet

    XLNet

    The XLNet was an autoregressive Transformer designed as an improvement over BERT, with 340M parameters and trained on 33 billion words. It was released on 19 June 2019, under the Apache 2.0 license. It achieved state-of-the-art results on a variety of natural language processing tasks, including language modeling, question answering, and natural language inference. == Architecture == The main idea of XLNet is to model language autoregressively like the GPT models, but allow for all possible permutations of a sentence. Concretely, consider the following sentence:My dog is cute.In standard autoregressive language modeling, the model would be tasked with predicting the probability of each word, conditioned on the previous words as its context: We factorize the joint probability of a sequence of words x 1 , … , x T {\displaystyle x_{1},\ldots ,x_{T}} using the chain rule: Pr ( x 1 , … , x T ) = Pr ( x 1 ) Pr ( x 2 | x 1 ) Pr ( x 3 | x 1 , x 2 ) … Pr ( x T | x 1 , … , x T − 1 ) . {\displaystyle \Pr(x_{1},\ldots ,x_{T})=\Pr(x_{1})\Pr(x_{2}|x_{1})\Pr(x_{3}|x_{1},x_{2})\ldots \Pr(x_{T}|x_{1},\ldots ,x_{T-1}).} For example, the sentence "My dog is cute" is factorized as: Pr ( My , dog , is , cute ) = Pr ( My ) Pr ( dog | My ) Pr ( is | My , dog ) Pr ( cute | My , dog , is ) . {\displaystyle \Pr({\text{My}},{\text{dog}},{\text{is}},{\text{cute}})=\Pr({\text{My}})\Pr({\text{dog}}|{\text{My}})\Pr({\text{is}}|{\text{My}},{\text{dog}})\Pr({\text{cute}}|{\text{My}},{\text{dog}},{\text{is}}).} Schematically, we can write it as → My → My dog → My dog is → My dog is cute . {\displaystyle {\texttt {}}{\texttt {}}{\texttt {}}{\texttt {}}\to {\text{My }}{\texttt {}}{\texttt {}}{\texttt {}}\to {\text{My dog }}{\texttt {}}{\texttt {}}\to {\text{My dog is }}{\texttt {}}\to {\text{My dog is cute}}.} However, for XLNet, the model is required to predict the words in a randomly generated order. Suppose we have sampled a randomly generated order 3241, then schematically, the model is required to perform the following prediction task: is dog is dog is cute → My dog is cute {\displaystyle {\texttt {}}{\texttt {}}{\texttt {}}{\texttt {}}\to {\texttt {}}{\texttt {}}{\text{is }}{\texttt {}}\to {\texttt {}}{\text{dog is }}{\texttt {}}\to {\texttt {}}{\text{dog is cute}}\to {\text{My dog is cute}}} By considering all permutations, XLNet is able to capture longer-range dependencies and better model the bidirectional context of words. === Two-Stream Self-Attention === To implement permutation language modeling, XLNet uses a two-stream self-attention mechanism. The two streams are: Content stream: This stream encodes the content of each word, as in standard causally masked self-attention. Query stream: This stream encodes the content of each word in the context of what has gone before. In more detail, it is a masked cross-attention mechanism, where the queries are from the query stream, and the key-value pairs are from the content stream. The content stream uses the causal mask M causal = [ 0 − ∞ − ∞ … − ∞ 0 0 − ∞ … − ∞ 0 0 0 … − ∞ ⋮ ⋮ ⋮ ⋱ ⋮ 0 0 0 … 0 ] {\displaystyle M_{\text{causal}}={\begin{bmatrix}0&-\infty &-\infty &\dots &-\infty \\0&0&-\infty &\dots &-\infty \\0&0&0&\dots &-\infty \\\vdots &\vdots &\vdots &\ddots &\vdots \\0&0&0&\dots &0\end{bmatrix}}} permuted by a random permutation matrix to P M causal P − 1 {\displaystyle PM_{\text{causal}}P^{-1}} . The query stream uses the cross-attention mask P ( M causal − ∞ I ) P − 1 {\displaystyle P(M_{\text{causal}}-\infty I)P^{-1}} , where the diagonal is subtracted away specifically to avoid the model "cheating" by looking at the content stream for what the current masked token is. Like the causal masking for GPT models, this two-stream masked architecture allows the model to train on all tokens in one forward pass. == Training == Two models were released: XLNet-Large, cased: 110M parameters, 24-layer, 1024-hidden, 16-heads XLNet-Base, cased: 340M parameters, 12-layer, 768-hidden, 12-heads. It was trained on a dataset that amounted to 32.89 billion tokens after tokenization with SentencePiece. The dataset was composed of BooksCorpus, and English Wikipedia, Giga5, ClueWeb 2012-B, and Common Crawl. It was trained on 512 TPU v3 chips, for 5.5 days. At the end of training, it still under-fitted the data, meaning it could have achieved lower loss with more training. It took 0.5 million steps with an Adam optimizer, linear learning rate decay, and a batch size of 8192.

    Read more →
  • Documentation

    Documentation

    Documentation is any communicable material that is used to describe, explain, or instruct regarding some attributes of an object, system, or procedure, such as its parts, assembly, installation, maintenance, and use. As a form of knowledge management and knowledge organization, documentation can be provided on paper, online, or on digital or analog media, such as audio tape or CDs. Examples of such resources include user guides, white papers, online help, and quick-reference guides. Paper or hard-copy documentation has become less common. Contemporary documentation is often distributed through websites, software products, and other online applications. Documentation, understood as a set of instructional materials, should not be confused with documentation science, which is the study of the recording and retrieval of information. == Principles for producing documentation == While associated International Organization for Standardization (ISO) standards are not easily available publicly, a guide from other sources for this topic may serve the purpose. Documentation development may involve document drafting, formatting, submitting, reviewing, approving, distributing, reposting and tracking, etc., and are convened by associated standard operating procedure in a regulatory industry. It could also involve creating content from scratch. Documentation should be easy to read and understand. If it is too long and too wordy, it may be misunderstood or ignored. Clear, concise words should be used, and sentences should be limited to a maximum of 15 words. Documentation intended for a general audience should avoid gender-specific terms and cultural biases. In a series of procedures, steps should be clearly numbered. == Producing documentation == Technical writers and corporate communicators are professionals whose field and work is documentation. Ideally, technical writers have a background in both the subject matter and also in writing, managing content, and information architecture. Technical writers more commonly collaborate with subject-matter experts, such as engineers, technical experts, medical professionals, etc. to define and then create documentation to meet the user's needs. Corporate communications includes other types of written documentation, for example: Market communications (MarCom): MarCom writers endeavor to convey the company's value proposition through a variety of print, electronic, and social media. This area of corporate writing is often engaged in responding to proposals. Technical communication (TechCom): Technical writers document a company's product or service. Technical publications can include user guides, installation and configuration manuals, and troubleshooting and repair procedures. Legal writing: This type of documentation is often prepared by attorneys or paralegals. Compliance documentation: This type of documentation codifies standard operating procedures, for any regulatory compliance needs, as for safety approval, taxation, financing, and technical approval. Healthcare documentation: This field of documentation encompasses the timely recording and validation of events that have occurred during the course of providing health care. == Documentation in computer science == === Types === The following are typical software documentation types: Request for proposal Requirements/statement of work/scope of work Software design and functional specification System design and functional specifications Change management, error and enhancement tracking User acceptance testing Manpages The following are typical hardware and service documentation types: Network diagrams Network maps Datasheet for IT systems (server, switch, e.g.) Service catalog and service portfolio (Information Technology Infrastructure Library) === Software Documentation Folder (SDF) tool === A common type of software document written in the simulation industry is the SDF. When developing software for a simulator, which can range from embedded avionics devices to 3D terrain databases by way of full motion control systems, the engineer keeps a notebook detailing the development "the build" of the project or module. The document can be a wiki page, Microsoft Word document or other environment. They should contain a requirements section, an interface section to detail the communication interface of the software. Often a notes section is used to detail the proof of concept, and then track errors and enhancements. Finally, a testing section to document how the software was tested. This documents conformance to the client's requirements. The result is a detailed description of how the software is designed, how to build and install the software on the target device, and any known defects and workarounds. This build document enables future developers and maintainers to come up to speed on the software in a timely manner, and also provides a roadmap to modifying code or searching for bugs. === Software tools for network inventory and configuration === These software tools can automatically collect data of your network equipment. The data could be for inventory and for configuration information. The Information Technology Infrastructure Library requests to create such a database as a basis for all information for the IT responsible. It is also the basis for IT documentation. Examples include XIA Configuration. == Documentation in criminal justice == "Documentation" is the preferred term for the process of populating criminal databases. Examples include the National Counterterrorism Center's Terrorist Identities Datamart Environment, sex offender registries, and gang databases. == Documentation in early childhood education == Documentation, as it pertains to the early childhood education field, is "when we notice and value children's ideas, thinking, questions, and theories about the world and then collect traces of their work (drawings, photographs of the children in action, and transcripts of their words) to share with a wider community". Thus, documentation is a process, used to link the educator's knowledge and learning of the child/children with the families, other collaborators, and even to the children themselves. Documentation is an integral part of the cycle of inquiry - observing, reflecting, documenting, sharing and responding. Pedagogical documentation, in terms of the teacher documentation, is the "teacher's story of the movement in children's understanding". According to Stephanie Cox Suarez in "Documentation - Transforming our Perspectives", "teachers are considered researchers, and documentation is a research tool to support knowledge building among children and adults". Documentation can take many different styles in the classroom. The following exemplifies ways in which documentation can make the research, or learning, visible: Documentation panels (bulletin-board-like presentation with multiple pictures and descriptions about the project or event). Daily log (a log kept every day that records the play and learning in the classroom) Documentation developed by or with the children (when observing children during documentation, the child's lens of the observation is used in the actual documentation) Individual portfolios (documentation used to track and highlight the development of each child) Electronic documentation (using apps and devices to share documentation with families and collaborators) Transcripts or recordings of conversations (using recording in documentation can bring about deeper reflections for both the educator and the child) Learning stories (a narrative used to "describe learning and help children see themselves as powerful learners") The classroom as documentation (reflections and documentation of the physical environment of a classroom). Documentation is certainly a process in and of itself, and it is also a process within the educator. The following is the development of documentation as it progresses for and in the educator themselves: Develop(s) habits of documentation Become(s) comfortable with going public with recounting of activities Develop(s) visual literacy skills Conceptualize(s) the purpose of documentation as making learning styles visible, and Share(s) visible theories for interpretation purposes and further design of curriculum.

    Read more →
  • Explore-then-commit algorithm

    Explore-then-commit algorithm

    Explore Then Commit (ETC) is an algorithm for the multi-armed bandit problem foc,used on finding the best trade-off between exploration and exploitation. == Multi-armed bandit problem == The multi-armed bandit problem is a sequential game where one player has to choose at each turn between K {\displaystyle K} actions (arms). Behind every arm a {\displaystyle a} is an unknown distribution ν a {\displaystyle \nu _{a}} that lies in a set D {\displaystyle {\mathcal {D}}} known by the player (for example, D {\displaystyle {\mathcal {D}}} can be the set of Gaussian distributions or Bernoulli distributions). At each turn t {\displaystyle t} the player chooses (pulls) an arm a t {\displaystyle a_{t}} , they then get an observation X t {\displaystyle X_{t}} of the distribution ν a t {\displaystyle \nu _{a_{t}}} . === Regret minimization === The goal is to minimize the regret at time T {\displaystyle T} that is defined as R T := ∑ a = 1 K Δ a E [ N a ( T ) ] {\displaystyle R_{T}:=\sum _{a=1}^{K}\Delta _{a}\mathbb {E} [N_{a}(T)]} where μ a := E [ ν a ] {\displaystyle \mu _{a}:=\mathbb {E} [\nu _{a}]} is the mean of arm a {\displaystyle a} μ ∗ := max a μ a {\displaystyle \mu ^{}:=\max _{a}\mu _{a}} is the highest mean Δ a := μ ∗ − μ a {\displaystyle \Delta _{a}:=\mu ^{}-\mu _{a}} N a ( t ) {\displaystyle N_{a}(t)} is the number of pulls of arm a {\displaystyle a} up to turn t {\displaystyle t} The player has to find an algorithm that chooses at each turn t {\displaystyle t} which arm to pull based on the previous actions and observations ( a s , X s ) s < t {\displaystyle (a_{s},X_{s})_{s Read more →