EuroMatrixPlus

EuroMatrixPlus

The EuroMatrixPlus is a project that ran from March 2009 to February 2012. EuroMatrixPlus succeeded a project called EuroMatrix (September 2006 to February 2009) and continued in further development and improvement of machine translation (MT) systems for languages of the European Union (EU). == Project objectives == EuroMatrixPlus focused on achieving several goals: To continue advance of MT technology (create MT systems for all official EU languages and provide other MT researchers with existing data and infrastructure). To continually expand and investigate different MT approaches and techniques; to stay open to novel combinations of methods of MT. To bring MT to the users. Users post-edit output of statistical models and the system learns from the feedback and improves itself. Two groups of users were aimed at: Professional translators and translation agencies Users who voluntarily translate texts into their native language To contribute to MT research in Europe. To produce sample application for automatic translation of news and web pages and make that application freely accessible. == Outcome == EuroMatrixPlus contributed to MT field in several ways. It continued in development of an open source statistical MT engine Moses. The project worked on research in hybrid approaches to MT (combination of rule-based and statistical techniques). Several “MT Marathons” and annual evaluation campaigns were organized by the project. The project also resulted in releasing of 196 scientific publications. The results of the work were arranged into ten work packages: WP1: Rich Tree-Based Statistical Translation WP2: Hybrid Machine Translation WP3: Advanced Learning Methods for MT WP4: Open Source Tools and Data WP5: "WikiTrans" Translation Environments WP6: Integrated Localisation Workflow WP7: Evaluation Campaign WP8: Project Management and Dissemination WP9: Integrating Slovak Language Resources WP10: HPSG-based Statistical Translation === Software and data === Here is a list of software and data that were released by the project: Appraise – an open source tool for manual evaluation of MT output BURGER – Bulgarian Resource BulTreeBank – Treebank of Bulgarian CSLM toolkit – free tool for training continuous space language models (CSLM) to large tasks Caitra – tool for post-editing MT results Europarl – European Parliament parallel corpus IRSTLM toolkit – tool for training language models Joshua – an open-source statistical machine translation decoder for hierarchical and syntax-based MT MT Server Land – an open-source architecture for MT Moses – statistical MT MultiUN Corpora – parallel corpus extracted from the United Nations Website PCEDT 2.0 – Prague Czech-English Dependency Treebank PEDT 2.0 – English part of the Prague Czech-English Dependency Treebank Slovak corpora – English-Slovak and Czech-Slovak as well as a Slovak-English and a Slovak-Czech parallel corpus Slovak treebank – A dependency treebank TermEx – RBMT-Suited Statistical Terminology Extraction Tool Treex, TectoMT == Funding == The EuroMatrixPlus project was sponsored by EU Information Society Technology program. Total cost of the project was 5 942 121 €, from which the European Union contributed 4 266 896 €. == Project members == To ensure advance in MT, several organizations that are experts in various disciplines (linguistics, computer science, mathematics, translation) were brought together to cooperate on EuroMatrixPlus. The consortium consisted of academic as well as commercial partners. Academic partners were the University of Edinburgh (United Kingdom), DFKI – German Research Centre for Artificial Intelligence (Germany), Charles University (Czech Republic), Johns Hopkins University (United States), University of Le Mans (France), Fondazione Bruno Kessler (Italy), Dublin City University (Ireland). Two institutions joined about one year into the project. These were the L'udovít Štúr Institute of Linguistics (Slovak Republic) and IICT – Institute of Information and Communication Technologies at the Bulgarian Academy of Sciences (Bulgaria). Commercial partners included Lucy Software and Services GmbH (Germany) and CEET s.r.o. (Czech Republic). Coordination of the project was in hands of DFKI with its Language Technology Lab in Saarbrücken. The principal investigator and scientific coordinator was Hans Uszkoreit, a professor of Computational Linguistics at Saarland University.

Knowledge integration

Knowledge integration is the process of synthesizing multiple knowledge models (or representations) into a common model (representation). Compared to information integration, which involves merging information having different schemas and representation models, knowledge integration focuses more on synthesizing the understanding of a given subject from different perspectives. For example, multiple interpretations are possible of a set of student grades, typically each from a certain perspective. An overall, integrated view and understanding of this information can be achieved if these interpretations can be put under a common model, say, a student performance index. The Web-based Inquiry Science Environment (WISE), from the University of California at Berkeley has been developed along the lines of knowledge integration theory. Knowledge integration has also been studied as the process of incorporating new information into a body of existing knowledge with an interdisciplinary approach. This process involves determining how the new information and the existing knowledge interact, how existing knowledge should be modified to accommodate the new information, and how the new information should be modified in light of the existing knowledge. A learning agent that actively investigates the consequences of new information can detect and exploit a variety of learning opportunities; e.g., to resolve knowledge conflicts and to fill knowledge gaps. By exploiting these learning opportunities the learning agent is able to learn beyond the explicit content of the new information. The machine learning program KI, developed by Murray and Porter at the University of Texas at Austin, was created to study the use of automated and semi-automated knowledge integration to assist knowledge engineers constructing a large knowledge base. A possible technique which can be used is semantic matching. More recently, a technique useful to minimize the effort in mapping validation and visualization has been presented which is based on Minimal Mappings. Minimal mappings are high quality mappings such that i) all the other mappings can be computed from them in time linear in the size of the input graphs, and ii) none of them can be dropped without losing property i). The University of Waterloo operates a Bachelor of Knowledge Integration undergraduate degree program as an academic major or minor. The program started in 2008.

New Classification Scheme for Chinese Libraries

The New Classification Scheme for Chinese Libraries is a system of library classification developed by Lai Yung-hsiang since 1956. It is modified from "A System of Book Classification for Chinese Libraries" of Liu Guojun, which is based on the Dewey Decimal System. The scheme is developed for Chinese books and commonly used in Taiwan, Hong Kong and Macau. == Main classes == 000 Generalities 100 Philosophy 200 Religion 300 Sciences 400 Applied sciences 500 Social sciences 600 History of China and Geography of China 700 World history and Geography 800 Linguistics and Literature 900 Arts == Outline of the classification tables == 000 Generalities 000 Special collections 010 Bibliography; Literacy (Documentation) 020 Library and information science; Archive management 030 Sinology 040 General encyclopedia 050 Serial publications; Periodicals 060 General organization; Museology 070 General collected essays 080 General series 090 Collected Chinese classics 100 Philosophy 100 Philosophy: general 110 Thought; Learning 120 Chinese philosophy 130 Oriental philosophy 140 Western philosophy 150 Logic 160 Metaphysics 170 Psychology 180 Esthetics (Aesthetics) 190 Ethics 200 Religion 200 Religion: general 210 Science of religion 220 Buddhism 230 Taoism 240 Christianity 250 Islam (Mohammedanism) 260 Judaism 270 Other religions 280 Mythology 290 Astrology; Superstition 300 Sciences 300 Sciences: general 310 Mathematics 320 Astronomy 330 Physics 340 Chemistry 350 Earth science; Geology 360 Biological science 370 Botany 380 Zoology 390 Anthropology 400 Applied sciences 400 Applied sciences: general 410 Medical sciences 420 Home economics 430 Agriculture 440 Engineering 450 Mining and metallurgy 460 Chemical engineering 470 Manufacture 480 Commerce: various business 490 Commerce: administration and management 500 Social sciences 500 Social sciences: general 510 Statistics 520 Education 530 Rite and custom 540 Sociology 550 Economy 560 Finance 570 Political science 580 Law; Jurisprudence 590 Military science 600-700 History and geography 600 History and geography: General History and geography of China 610 General history of China 620 Chinese history by period 630 History of Chinese civilization 640 Diplomatic history of China 650 Historical sources 660 Geography of China 670 Local history 680 Topical topography 690 Chinese travels World history and geography 710 World: general history and geography 720 Oceans and seas 730 Asia: history and geography 740 Europe: history and geography 750 America: history and geography 760 Africa: history and geography 770 Oceania: history and geography 780 Biography 790 Antiquities and archaeology 800 Linguistics and literature 800 Linguistics: general 810 Literature: general 820 Chinese literature 830 Chinese literature: general collections 840 Chinese literature: individual works 850 Various Chinese literature 860 Oriental literature 870 Western literature 880 Other countries literatures 890 Journalism 900 Arts 900 Arts: general 910 Music 920 Architecture 930 Sculpture 940 Drawing and painting; Calligraphy 950 Photography; Computer art 960 Decorative arts 970 Arts and Crafts movement 980 Theatre 990 Recreation and leisure

MuZero

MuZero is a computer program developed by artificial intelligence research company DeepMind, a subsidiary of Google, to master games without knowing their rules and underlying dynamics. Its release in 2019 included benchmarks of its performance in Go, chess, shogi, and a suite of 57 different Atari games. The algorithm uses an approach similar to AlphaZero, where a combination of a tree-based search and a learned model is deployed. It matched AlphaZero's performance in chess and shogi, improved on its performance in Go, and improved on the state of the art in mastering a suite of 57 Atari games (the Arcade Learning Environment), a visually-complex domain. MuZero was trained via self-play, with no access to rules, opening books, or endgame tablebases. The trained algorithm used the same convolutional and residual architecture as AlphaZero, but with 20 percent fewer computation steps per node in the search tree. == History == MuZero really is discovering for itself how to build a model and understand it just from first principles. On November 19, 2019, the DeepMind team released a preprint introducing MuZero. === Derivation from AlphaZero === MuZero (MZ) is a combination of the high-performance planning of the AlphaZero (AZ) algorithm with approaches to model-free reinforcement learning. The combination allows for more efficient training in classical planning regimes, such as Go, while also handling domains with much more complex inputs at each stage, such as visual video games. MuZero was derived directly from AZ code, sharing its rules for setting hyperparameters. Differences between the approaches include: AZ's planning process uses a simulator. The simulator knows the rules of the game. It has to be explicitly programmed. A neural network then predicts the policy and value of a future position. Perfect knowledge of game rules is used in modeling state transitions in the search tree, actions available at each node, and termination of a branch of the tree. MZ does not have access to the rules, and instead learns one with neural networks. AZ has a single model for the game (from board state to predictions); MZ has separate models for representation of the current state (from board state into its internal embedding), dynamics of states (how actions change representations of board states), and prediction of policy and value of a future position (given a state's representation). MZ's hidden model may be complex, and it may turn out it can host computation; exploring the details of the hidden model in a trained instance of MZ is a topic for future exploration. MZ does not expect a two-player game where winners take all. It works with standard reinforcement-learning scenarios, including single-agent environments with continuous intermediate rewards, possibly of arbitrary magnitude and with time discounting. AZ was designed for two-player games that could be won, drawn, or lost. === Comparison with R2D2 === The previous state of the art technique for learning to play the suite of Atari games was R2D2, the Recurrent Replay Distributed DQN. MuZero surpassed both R2D2's mean and median performance across the suite of games, though it did not do better in every game. == Training and results == MuZero used 16 third-generation tensor processing units (TPUs) for training, and 1000 TPUs for selfplay for board games, with 800 simulations per step and 8 TPUs for training and 32 TPUs for selfplay for Atari games, with 50 simulations per step. AlphaZero used 64 second-generation TPUs for training, and 5000 first-generation TPUs for selfplay. As TPU design has improved (third-generation chips are 2x as powerful individually as second-generation chips, with further advances in bandwidth and networking across chips in a pod), these are comparable training setups. R2D2 was trained for 5 days through 2M training steps. === Initial results === MuZero matched AlphaZero's performance in chess and shogi after roughly 1 million training steps. It matched AZ's performance in Go after 500,000 training steps and surpassed it by 1 million steps. It matched R2D2's mean and median performance across the Atari game suite after 500 thousand training steps and surpassed it by 1 million steps, though it never performed well on 6 games in the suite. == Reactions and related work == MuZero was viewed as a significant advancement over AlphaZero, and a generalizable step forward in unsupervised learning techniques. The work was seen as advancing understanding of how to compose systems from smaller components, a systems-level development more than a pure machine-learning development. While only pseudocode was released by the development team, Werner Duvaud produced an open source implementation based on that. MuZero has been used as a reference implementation in other work, for instance as a way to generate model-based behavior. In late 2021, a more efficient variant of MuZero was proposed, named EfficientZero. It "achieves 194.3 percent mean human performance and 109.0 percent median performance on the Atari 100k benchmark with only two hours of real-time game experience". In early 2022, a variant of MuZero was proposed to play stochastic games (for example 2048, backgammon), called Stochastic MuZero, which uses afterstate dynamics and chance codes to account for the stochastic nature of the environment when training the dynamics network.

Cleverpath AION Business Rules Expert

Cleverpath AION Business Rules Expert (formerly Platinum AIONDS, and before that Trinzic AIONDS, and originally Aion) is an expert system and Business rules engine owned by Computer Associates by 2000. == History == The product was created around 1986 as "Aion" by the Aion company. In its initial release Aion was multi-platform and continues to be deliverable to the PC, Unixs, and Mainframe computer's. In addition it ties in seamlessly with a variety of databases including Oracle, Microsoft SQL Server, and ODBC. Aion was founded by Harry Reinstein, Larry Cohn, Garry Hallee, Scott Grinis, and others. From Scott Grinis's bio: Scott founded Aion, a company that developed expert systems and whose advanced inference engine and object technology were used by financial services and insurance firms to develop risk-scoring and underwriting applications. Harry Reinstein was quoted as saying: “Our biggest competitor was not AICorp, it was COBOL” Trinzic owned AION by 1993. A reference in a 1993 announcement indicates that Trinzic's formation was the result of a merger (paraphased): Trinzic set three development initiatives shortly after its formation from the merger of Aion Corp. and AICorp. The other initiatives -- adding SQL extensions to Aion/DS and evaluating the unbundling of some of that product's object-oriented programming capabilities -- are still active. Writing in 1993 Judith Hodges and Deborah Melewski give the date for the merger: Two rival artificial intelligence software vendors -- AICorp, Inc. and Aion Corp. -- merged in September 1992 to form Trinzic Corp. As part of the merger, redundant jobs were eliminated (20% of the combined work force), leaving a total work force of 245 employees worldwide. The new firm also boasted a combined installed base of more than 1,200 sites representing more than 10,000 software licenses. Although in the merger, technically AICorp bought Aion, as AICorp was a public company and Aion was still private, the reality was that Aion's leadership and technology subsumed AICorp's. Jim Gagnard, the CEO of Aion, became CEO of Trinzic and AICorp's flagship product, KBMS, was discontinued, while the Aion Development System continued to be enhanced and KBMS customers were assisted in converting to AIONDS, under the continued technical leadership of Garry Hallee and Scott Grinis. On August 1, 1994 Trinzic released version 6.4 of AIONDS saying, in part: Trinzic Corp., Palo Alto, Calif., has unveiled The Aion Development System (AionDS) Version 6.4, an upgrade to the company's development environment for building business process automation applications. Version 6.4 provides a visual development environment for Microsoft Windows or OS/2 PM applications using business rules. Trinzic was acquired by PLATINUM Technologies in 1995 which retained at least some of Trinzic's acquisitions Platinum Technologies was acquired by Computer Associates in 1999. CA changed the system's name to CA Aion Business Rules Expert" on or before 2009. It is currently (June 2011) at Release 11 on a wide range of supported platforms. == Applications using Aion == Aion has been used in a variety of industries including Energy, Insurance, Military, Aviation, and Banking. At one point an Aion expert system application written by Covia, LLC existed to do airport gate assignment. Colossus, a computer program, developed by Computer Sciences Corporation is the insurance industry’s leading expert system for assisting adjusters in the evaluation of bodily injury claims (aka "pain and suffering"). Colossus helps adjusters reduce variance in payouts on similar bodily injury claims through objective use of industry standard rules.

DAVI

The Dutch Automated Vehicle Initiative (DAVI) is a research and demonstration initiative developing automated vehicles for use on public roads. The project is unique in that, besides simply making driverless cars, it also focuses on having automated vehicles share information among each other. The aim is to have the cars help to avoid traffic congestion by reducing the safety distance between the cars (from 2 seconds to 0.5 seconds) and avoiding sudden traffic slow-downs due to maneuvers undertaken by drivers.

Production Rule Representation

The Production Rule Representation (PRR) is a proposed standard of the Object Management Group (OMG) that aims to define a vendor-neutral model for representing production rules within the Unified Modeling Language (UML), specifically for use in forward-chaining rule engines. == History == The OMG set up a Business Rules Working Group in 2002 as the first standards body to recognize the importance of the "Business Rules Approach". It issued 2 main RFPs in 2003 – a standard for modeling production rules (PRR), and a standard for modeling business rules as business documentation (BSBR, now SBVR). PRR was mostly defined by and for vendors of Business Rule Engines (BREs) (sometimes termed Business Rules Engine(s), like in Wikipedia). Contributors have included all the major BRE vendors, members of RuleML, and leading UML vendors. == Evolution == The PRR RFP originally suggested that PRR use a combination of UML OCL and Action Semantics for rule conditions and actions. However, expecting modellers to learn 2 relatively obscure UML languages in order to define a production rule proved unpalatable. Therefore, PRR OCL was defined that included OCL extensions for simple rule actions (as well as external functions). PRR OCL is currently considered "non-normative" i.e. is not part of the PRR standard per se. PRR beta applies just to a PRR Core that excludes an explicit expression language. The PRR RFP envisaged covering both forward and backward chaining rule engines. However, the lack of vendor support for / interest in backward chaining caused this to be revise to forward chaining and "sequential" semantics. The latter is simply the scripting mode provided by many BPM tools, where rules are listed and executed sequentially as if programmed. This provides PRR with better compatibility with typical BPM scripting engines (and acknowledges the fact that most BREs today support a "sequential" mode of operation, improving performance in some circumstances). == Status == PRR is currently at version 1.0.