Rule induction

Rule induction

Rule induction is an area of machine learning in which formal rules are extracted from a set of observations. The rules extracted may represent a full scientific model of the data, or merely represent local patterns in the data. Data mining in general and rule induction in detail are trying to create algorithms without human programming but with analyzing existing data structures. In the easiest case, a rule is expressed with “if-then statements” and was created with the ID3 algorithm for decision tree learning. Rule learning algorithm are taking training data as input and creating rules by partitioning the table with cluster analysis. A possible alternative over the ID3 algorithm is genetic programming which evolves a program until it fits to the data. Creating different algorithm and testing them with input data can be realized in the WEKA software. Additional tools are machine learning libraries for Python, like scikit-learn. == Paradigms == Some major rule induction paradigms are: Association rule learning algorithms (e.g., Agrawal) Decision rule algorithms (e.g., Quinlan 1987) Hypothesis testing algorithms (e.g., RULEX) Horn clause induction Version spaces Rough set rules Inductive Logic Programming Boolean decomposition (Feldman) == Algorithms == Some rule induction algorithms are: Charade Rulex Progol CN2

Realization (linguistics)

In linguistics, realization is the process by which some kind of surface representation is derived from its underlying representation; that is, the way in which some abstract object of linguistic analysis comes to be produced in actual language. Phonemes are often said to be realized by speech sounds. The different sounds that can realize a particular phoneme are called its allophones. Realization is also a subtask of natural language generation, which involves creating an actual text in a human language (English, French, etc.) from a syntactic representation. There are a number of software packages available for realization, most of which have been developed by academic research groups in NLG. The remainder of this article concerns realization of this kind. == Example == For example, the following Java code causes the simplenlg system [2] to print out the text The women do not smoke.: In this example, the computer program has specified the linguistic constituents of the sentence (verb, subject), and also linguistic features (plural subject, negated), and from this information the realiser has constructed the actual sentence. == Processing == Realisation involves three kinds of processing: Syntactic realisation: Using grammatical knowledge to choose inflections, add function words and also to decide the order of components. For example, in English the subject usually precedes the verb, and the negated form of smoke is do not smoke. Morphological realisation: Computing inflected forms, for example the plural form of woman is women (not womans). Orthographic realisation: Dealing with casing, punctuation, and formatting. For example, capitalising The because it is the first word of the sentence. The above examples are very basic, most realisers are capable of considerably more complex processing. == Systems == A number of realisers have been developed over the past 20 years. These systems differ in terms of complexity and sophistication of their processing, robustness in dealing with unusual cases, and whether they are accessed programmatically via an API or whether they take a textual representation of a syntactic structure as their input. There are also major differences in pragmatic factors such as documentation, support, licensing terms, speed and memory usage, etc. It is not possible to describe all realisers here, but a few of the emerging areas are: Simplenlg [3]: a document realizing engine with an api which intended to be simple to learn and use, focused on limiting scope to only finding the surface area of a document. KPML [4]: this is the oldest realiser, which has been under development under different guises since the 1980s. It comes with grammars for ten different languages. FUF/SURGE [5]: a realiser which was widely used in the 1990s, and is still used in some projects today OpenCCG [6]: an open-source realiser which has a number of nice features, such as the ability to use statistical language models to make realisation decisions.

Fuse Services Framework

Fuse Services Framework is an open source SOAP and REST web services platform based on Apache CXF for use in enterprise IT organizations. It is productized and supported by the Fuse group at FuseSource Corp. Fuse Services Framework service-enables new and existing systems for use in enterprise SOA infrastructure. Fuse Services Framework is a pluggable, small-footprint engine that creates high performance, secure and robust services in minutes using front-end programming APIs like JAX-WS and JAX-RS. It supports multiple transports and bindings and is extensible so developers can add bindings for additional message formats so all systems can work together without having to communicate through a centralized server. Fuse Services Framework is now a part of Red Hat JBoss Fuse. Fabric8 is a free Apache 2.0 Licensed upstream community for the JBoss Fuse product from Red Hat.

Thai QR Payment

Thai QR Payment or PromptPay (พร้อมเพย์) is a real-time payment system in Thailand that allows money transfers through digital channels using identifiers linked to a bank account, including a mobile phone number, citizen identification number, tax identification number or bank account number. The system was introduced in 2016 as part of Thailand's national e-payment infrastructure and was developed under the National e-Payment Master Plan, a government programme intended to expand digital payment infrastructure and reduce the use of cash in everyday transactions. It is owned by National ITMX ltd and Bank of Thailand and developed by Vocalink, a group by Mastercard == History == PromptPay (originally AnyID) is one of the National e-Payment projects and policies by Thailand, to regulate and standardize electronic payments to follow the technologies with internet and smartphones that is expanding and bringing technology into Finance and Commerce. By 22 December 2015, The First Prayut cabinet have approved the project as a national infastructure PromptPay has also been used in cross-border payment linkages with other real-time payment systems in Southeast Asia. In April 2021, the Monetary Authority of Singapore and the Bank of Thailand launched a linkage between Singapore's PayNow and Thailand's PromptPay, allowing customers of participating banks to send money between the two countries using a mobile phone number. In June 2021, the central banks of Thailand and Malaysia launched a cross-border QR payment linkage between PromptPay and Malaysia's DuitNow system. == Services == PromptPay's Services have included Encrypted Transactions and Payment between Two Individuals (C2C) Government Infrastructure Payment Tax Returns Individual PromptPay e-Wallet Thai QR Payment Pay Alert e-Donation Cross Border QR Payment

Software bot

A software bot is a type of software agent in the service of software project management and software engineering. A software bot has an identity and potentially personified aspects in order to serve their stakeholders. Software bots often compose software services and provide an alternative user interface, which is sometimes, but not necessarily conversational. Software bots are typically used to execute tasks, suggest actions, engage in dialogue, and promote social and cultural aspects of a software project. The term bot is derived from robot. However, robots act in the physical world and software bots act only in digital spaces. Some software bots are designed and behave as chatbots, but not all chatbots are software bots. Discussions about the past and future of software bots show that software bots have been adopted for many years. == Usage == Software bots are used to support development activities, such as communication among software developers and automation of repetitive tasks. Software bots have been adopted by several communities related to software development, such as open-source communities on GitHub and Stack Overflow. GitHub bots have user accounts and can open, close, or comment on pull requests and issues. GitHub bots have been used to assign reviewers, ask contributors to sign the Contributor License Agreement, report continuous integration failures, review code and pull requests, welcome newcomers, run automated tests, merge pull requests, fix bugs and vulnerabilities, etc. The Slack tool includes an API for developing software bots. There are slack bots for keeping track of todo lists, coordinating standup meetings, and managing support tickets. The ChatBot company products further simplify the process of creating a custom Slack bot. On Wikipedia, Wikipedia bots automate a variety of tasks, such as creating stub articles, consistently updating the format of multiple articles, and so on. Bots like ClueBot NG are capable of recognizing vandalism and automatically remove disruptive content. == Taxonomies and Classification Frameworks == Lebeuf et al. provide a faceted taxonomy to characterize bots based on a literature review. It is composed of 3 main facets: (i) properties of the environment that the bot was created in; (ii) intrinsic properties of the bot itself; and (iii) the bot's interactions within its environment. They further detail the facets into sets of sub-facets under each of the main facets. Paikari and van der Hoek defined a set of dimensions to enable comparison of software bots, applied specifically to chatbots. It resulted in six dimensions: Type: the main purpose of the bot (information, collaboration, or automation) Direction of the "conversation" (input, output, or bi-directional) Guidance (human-mediated, or autonomous) Predictability (deterministic, or evolving) Interaction style (dull, alternate vocabulary, relationship-builder, human-like) Communication channel (text, voice, or both) Erlenhov et al. raised the question of the difference between a bot and simple automation, since much research done in the name of software bots uses the term bot to describe various different tools and sometimes things are "just" plain old development tools. After interviewing and surveying over 100 developers the authors found that not one, but three definitions dominated the community. They created three personas based on these definitions and the difference between what the three personas see as being a bot is mainly the association with a different set of human-like traits. The chat bot persona (Charlie) primarily thinks of bots as tools that communicates with the developer through a natural language interface (typically voice or chat), and caring little about what tasks the bot is used for or how it actually implements these tasks. The autonomous bot persona (Alex) thinks of bots as tools that work on their own (without requiring much input from a developer) on a task that would normally be done by a human. The smart bot persona (Sam) separates bots and plain old development tools through how smart (technically sophisticated) a tool is. Sam cares less about how the tool communicates, but more about if it is unusually good or adaptive at executing a task. The authors recommends that people doing research or writing about bots try to put their work in the context of one of the personas since the personas have different expectations and problems with the tools. == Example of notable bots == Dependabot and Renovatebot update software dependencies and detect vulnerabilities. (https://dependabot.com/) Probot is an organization that create and maintain bots for GitHub. The example bots using Probot are the following. Auto Assign (https://probot.github.io/apps/auto-assign/) license bot (https://probot.github.io/) Sentiment bot (https://probot.github.io/apps/sentiment-bot/) Untrivializer bot (https://probot.github.io/apps/untrivializer/) Refactoring-Bot (Refactoring-Bot): provides refactoring based on static code analysis Looks good to me bot (LGTM) is a Semmle product that inspects pull requests on GitHub for code style and unsafe code practices. == Issues and threats == Software bots may not be well accepted by humans. A study from the University of Antwerp has compared how developers active on Stack Overflow perceive answers generated by software bots. They find that developers perceive the quality of software bot-generated answers to be significantly worse if the identity of the software bot is made apparent. By contrast, answers from software bots with human-like identity were better received. In practice, when software bots are used on platforms like GitHub or Wikipedia, their username makes it clear that they are bots, e.g., DependaBot, RenovateBot, DatBot, SineBot. Bots may be subject to special rules. For instance, the GitHub terms of service does not allow 'bots' but accepts 'machine account', where a 'machine account' has two properties: 1) a human takes full responsibility of the bot's actions 2) it cannot create other accounts.

Parkerian Hexad

The Parkerian Hexad is a set of six elements of information security proposed by Donn B. Parker in 1998. The Parkerian Hexad adds three additional attributes to the three classic security attributes of the CIA triad (confidentiality, integrity, availability). The Parkerian Hexad attributes are the following: Confidentiality Possession or Control Integrity Authenticity Availability Utility These attributes of information are atomic in that they are not broken down into further constituents; they are non-overlapping in that they refer to unique aspects of information. Any information security breach can be described as affecting one or more of these fundamental attributes of information. == Attributes from the CIA triad == === Confidentiality === Confidentiality refers to the "quality or state of being private or secret; known only to a limited few", or "the property that information is not made available or disclosed to unauthorized individuals, entities, or processes". For example: If an enterprise's strategic plans are leaked to competitors then this is a breach of confidentiality; If unauthorized persons gain access to an individual's financial records then that individual's confidentiality is breached. === Integrity === Integrity refers to being correct or consistent with the intended state of information. Any unauthorized modification of data, whether deliberate or accidental, is a breach of data integrity. For example: Data stored on disk are expected to be stable. If the data is changed at random by problems with a disk controller then this is a breach of integrity; Data generated by a medical device is transmitted and stored in the healthcare center but neither altered nor tampered with; Application programs are supposed to record information correctly. If the application introduces deviations from the intended values then this is a breach of integrity. "From Donn Parker: My definition of information integrity comes from the dictionaries. Integrity means that the information is whole, sound, and unimpaired (not necessarily correct). It means nothing is missing from the information it is complete and in intended good order". === Availability === Availability means having timely access to information. For example: A disk crash or denial-of-service attacks both cause a breach of availability. Any delay in response of a system that exceeds the expected service levels for that system can be described as a breach of availability. GPS jamming can lead to loss of Availability of the GPS system. == Parker's added attributes == === Authenticity === Authenticity is the "quality of being authentic or of established authority for truth and correctness". Parker defines it thus: "is the information genuine and accurate? Does it conform to reality and have validity?" and "authoritative, valid, true, real, genuine, or worthy of acceptance or belief by reason of conformity to fact and reality". === Possession or control === Possession or control refers to the loss of data by the authorized user (even if the ʺthiefʺ cannot access the data). From a control systems perspective, it is any loss of control (the ability to change settings and functions) or loss of view (the ability to monitor the system’s operation and its response to controls). Suppose a thief were to steal a sealed envelope containing a bank debit card and its personal identification number. Even if the thief did not open that envelope, it's reasonable for the victim to be concerned that the thief could do so at any time. That situation illustrates a loss of control or possession of information but does not involve the breach of confidentiality. === Utility === Utility refers to the data's usefulness. For example: Suppose someone encrypted data on disk to prevent unauthorized access or undetected modifications–and then lost the decryption key: that would be a breach of utility. The data would be confidential, controlled, integral, authentic, and available–they just wouldn't be useful in that form. The conversion of salary data from one currency into an inappropriate currency would be a breach of utility, as would the storage of data in a format inappropriate for a specific computer architecture; e.g., EBCDIC instead of ASCII or 9-track magnetic tape instead of DVD-ROM. A tabular representation of data substituted for a graph could be described as a breach of utility if the substitution made it more difficult to interpret the data. Utility is often confused with availability because breaches such as those described in these examples may also require time to work around the change in data format or presentation. However, the concept of usefulness is distinct from that of availability.

DevOps toolchain

A DevOps toolchain is a set or combination of tools that aid in the delivery, development, and management of software applications throughout the systems development life cycle, as coordinated by an organization that uses DevOps practices. Generally, DevOps tools fit into one or more activities, which supports specific DevOps initiatives: Plan, Create, Verify, Package, Release, Configure, Monitor, and Version Control. == Toolchains == In software, a toolchain is the set of programming tools that is used to perform a complex software development task or to create a software product, which is typically another computer program or a set of related programs. In general, the tools forming a toolchain are executed consecutively so the output or resulting environment state of each tool becomes the input or starting environment for the next one, but the term is also used when referring to a set of related tools that are not necessarily executed consecutively. As DevOps is a set of practices that emphasizes the collaboration and communication of both software developers and other information technology (IT) professionals, while automating the process of software delivery and infrastructure changes, its implementation can include the definition of the series of tools used at various stages of the lifecycle; because DevOps is a cultural shift and collaboration between development and operations, there is no one product that can be considered a single DevOps tool. Instead a collection of tools, potentially from a variety of vendors, are used in one or more stages of the lifecycle. == Stages of DevOps == === Plan === Plan consists of two elements: "define" and "plan". This activity refers to the business value and application requirements. Specifically "Plan" activities include: Production metrics, objects and feedback Requirements Business metrics Update release metrics Release plan, timing and business case Security policy and requirement A combination of the IT personnel will be involved in these activities: business application owners, software development, software architects, continual release management, security officers and the organization responsible for managing the production of IT infrastructure. === Create === Create consists of the building, coding, and configuring of the software development process. The specific activities are: Design of the software and configuration Coding including code quality and performance Software build and build performance Release candidate Tools and vendors in this category often overlap with other categories. Because DevOps is about breaking down silos, this is reflective in the activities and product solutions. === Verify === Verify is directly associated with ensuring the quality of the software release; activities designed to ensure code quality is maintained and the highest quality is deployed to production. The main activities in this are: Acceptance testing Regression testing Security and vulnerability analysis Performance Configuration testing Solutions for verify-related activities generally fall under four main categories: Test automation, Static analysis, Test Lab, and Security. === Package === Package refers to the activities involved once the release is ready for deployment, often also referred to as staging or Preproduction / "preprod". This often includes tasks and activities such as: Approval/preapprovals Package configuration Triggered releases Release staging and holding === Release === Release related activities include schedule, orchestration, provisioning and deploying software into production and targeted environment. The specific Release activities include: Release coordination Deploying and promoting applications Fallbacks and recovery Scheduled/timed releases Solutions that cover this aspect of the toolchain include application release automation, deployment automation and release management. === Configure === Configure activities fall under the operation side of DevOps. Once software is deployed, there may be additional IT infrastructure provisioning and configuration activities required. Specific activities including: Infrastructure storage, database and network provisioning and configuring Application provision and configuration. The main types of solutions that facilitate these activities are continuous configuration automation, configuration management, and infrastructure as code tools. === Monitor === Monitoring is an important link in a DevOps toolchain. It allows IT organization to identify specific issues of specific releases and to understand the impact on end-users. A summary of Monitor related activities are: Performance of IT infrastructure End-user response and experience Production metrics and statistics Information from monitoring activities often impacts Plan activities required for changes and for new release cycles. === Version Control === Version Control is an important link in a DevOps toolchain and a component of software configuration management. Version Control is the management of changes to documents, computer programs, large web sites, and other collections of information. A summary of Version Control related activities are: Non-linear development Distributed development Compatibility with existent systems and protocols Toolkit-based design Information from Version Control often supports Release activities required for changes and for new release cycles.