Round-trip engineering

Round-trip engineering (RTE) in the context of model-driven architecture is a functionality of software development tools that synchronizes two or more related software artifacts, such as, source code, models, configuration files, documentation, etc. between each other. The need for round-trip engineering arises when the same information is present in multiple artifacts and when an inconsistency may arise in case some artifacts are updated. For example, some piece of information was added to/changed in only one artifact (source code) and, as a result, it became missing in/inconsistent with the other artifacts (in models). == Overview == Round-trip engineering is closely related to traditional software engineering disciplines: forward engineering (creating software from specifications), reverse engineering (creating specifications from existing software), and reengineering (understanding existing software and modifying it). Round-trip engineering is often wrongly defined as simply supporting both forward and reverse engineering. In fact, the key characteristic of round-trip engineering that distinguishes it from forward and reverse engineering is the ability to synchronize existing artifacts that evolved concurrently by incrementally updating each artifact to reflect changes made to the other artifacts. Furthermore, forward engineering can be seen as a special instance of RTE in which only the specification is present and reverse engineering can be seen as a special instance of RTE in which only the software is present. Many reengineering activities can also be understood as RTE when the software is updated to reflect changes made to the previously reverse engineered specification. === Types === Various books describe two types of RTE: partial or uni-directional RTE: changes made to a higher level representation of a code and model are reflected in lower level, but not otherwise; the latter might be allowed, but with limitations that may not affect higher-level abstractions full or bi-directional RTE: regardless of changes, both higher and lower-level code and model representations are synchronized if any of them altered === Auto synchronization === Another characteristic of round-trip engineering is automatic update of the artifacts in response to automatically detected inconsistencies. In that sense, it is different from forward- and reverse engineering which can be both manual (traditionally) and automatic (via automatic generation or analysis of the artifacts). The automatic update can be either instantaneous or on-demand. In instantaneous RTE, all related artifacts are immediately updated after each change made to one of them. In on-demand RTE, authors of the artifacts may concurrently update the artifacts (even in a distributed setting) and at some point choose to execute matching to identify inconsistencies and choose to propagate some of them and reconcile potential conflicts. === Iterative approach === Round trip engineering may involve an iterative development process. After you have synchronized your model with revised code, you are still free to choose the best way to work – make further modifications to the code or make changes to your model. You can synchronize in either direction at any time and you can repeat the cycle as many times as necessary. == Software == Many commercial tools and research prototypes support this form of RTE; a 2007 book lists Rational Rose, Together, ESS-Model, BlueJ, and Fujaba among those capable, with Fujaba said to be capable to also identify design patterns. == Limitations == A 2005 book on Visual Studio notes for instance that a common problem in RTE tools is that the model reversed is not the same as the original one, unless the tools are aided by leaving laborious annotations in the source code. The behavioral parts of UML impose even more challenges for RTE. Usually, UML class diagrams are supported to some degree; however, certain UML concepts, such as associations and containment do not have straightforward representations in many programming languages which limits the usability of the created code and accuracy of code analysis/reverse engineering (e.g., containment is hard to recognize in the code). A more tractable form of round-trip engineering is implemented in the context of framework application programming interfaces (APIs), whereby a model describing the usage of a framework API by an application is synchronized with that application's code. In this setting, the API prescribes all correct ways the framework can be used in applications, which allows precise and complete detection of API usages in the code as well as creation of useful code implementing correct API usages. Two prominent RTE implementations in this category are framework-specific modeling languages and Spring Roo (Java). Round-trip engineering is critical for maintaining consistency among multiple models and between the models and the code in Object Management Group's (OMG) Model-driven architecture. OMG proposed the QVT (query/view/transformation) standard to handle model transformations required for MDA. To date, a few implementations of the standard have been created. (Need to present practical experiences with MDA in relation to RTE). == Controversies == === Code generation controversy === Code generation (forward-engineering) from models means that the user abstractly models solutions, which are connoted by some model data, and then an automated tool derives from the models parts or all of the source code for the software system. In some tools, the user can provide a skeleton of the program source code, in the form of a source code template where predefined tokens are then replaced with program source code parts during the code generation process. UML (if used for MDA) diagrams specification was criticized for lack the detail which is needed to contain the same information as is covered with the program source. Some developers even claim that "the Code is the design". == Disadvantages == There is a serious risk that the generated code will rapidly differ from the model or that the reverse-engineered model will lose its reflection on the code or a mix of these two problems as result of cycled reengineering efforts. Regarding behavioral/dynamic part of UML for features like statechart diagram there is no equivalents in programming languages. Their translation during code-generation will result in common programming statement (.e.g if,switch,enum) being either missing or misinterpreted. If edited and imported back may result in different or incomplete model. The same goes for code snippets used for code generation stage for the pattern-implementation and user-specific logic: intermixed they may not be easily reverse-engineered back. There is also general lack of advanced tooling for modelling that are comparable to that of modern IDEs (for testing, debugging, navigation, etc.) for general-purpose programming languages and domain-specific languages. == Examples in software engineering == Perhaps the most common form of round-trip engineering is synchronization between UML (Unified Modeling Language) models and the corresponding source code and entity–relationship diagrams in data modelling and database modelling. Round-trip engineering based on Unified Modeling Language (UML) needs three basic tools for software development: Source Code Editor; UML Editor for the Attributes and Methods; Visualisation of UML structure

LMArena

Arena (formerly LMArena and Chatbot Arena) is a public, web-based platform that evaluates large language models (LLMs). Users enter prompts for two anonymous models to respond to and vote on the model that gave the better response, after which the models' identities are revealed. Users can also choose models to test themselves via the "Direct" selection. Companies which have supplied the company with their large language models include OpenAI, Google DeepMind, and Anthropic. The website has been used for preview releases of upcoming models. Chinese company DeepSeek tested its prototype models in the Arena months before its R1 model gained attention in Western media. Other notable pre-release models include OpenAI's GPT-5 under the codename "summit" and Google DeepMind's Gemini 2.5 Flash Image (an image-generation and editing model) under the codename "Nano Banana". Research has identified specific limitations in Arena's methodology. == History == Chatbot Arena was released on April 24, 2023. In June 2024, Chatbot Arena added image support. In September 2024, Chatbot Arena moved to its own dedicated domain name, lmarena.ai (or LMArena). In April 2025, Meta released Llama 4. Llama 4 Maverick beat GPT-4o and Gemini 2.0 Flash on LMArena, but the version of Maverick on LMArena unfairly differed from the publicly available version. LMArena updated their policies in response. In April 2025, LMArena incorporated as an independent company. That May, LMArena raised $100 million in a seed funding round, valuing the company at $600 million. Participants in the seed funding round included Andreessen Horowitz, UC Investments, Lightspeed Venture Partners, Felicis Ventures, and Kleiner Perkins. On January 6, 2026, LMArena announced the closing of a $150 million Series A funding round, bringing the company’s post-money valuation to approximately $1.7 billion. The round was led by Felicis and UC Investments (University of California), with participation from Andreessen Horowitz, The House Fund, LDVP, Kleiner Perkins, Lightspeed Venture Partners, and Laude Ventures. In January 2026, LMArena added video support. On January 28, 2026, LMArena rebranded to "Arena".

Agent mining

Agent mining is a research field that combines two areas of computer science: multiagent systems and data mining. It explores how intelligent computer agents can work together to discover, analyze, and learn from large amounts of data more effectively than traditional methods. == Historical context == The interaction and the integration between multiagent systems and data mining have a long history. The very early work on agent mining focused on agent-based knowledge discovery, agent-based distributed data mining, and agent-based distributed machine learning, and using data mining to enhance agent intelligence. The International Workshop on Agents and Data Mining Interaction has been held for more than 10 times, co-located with the International Conference on Autonomous Agents and Multi-Agent Systems. Several proceedings are available from Springer Lecture Notes in Computer Science.

Tensor network

Tensor networks or tensor network states are a class of variational wave functions used in the study of many-body quantum systems and fluids. Tensor networks extend one-dimensional matrix product states to higher dimensions while preserving some of their useful mathematical properties. The wave function is encoded as a tensor contraction of a network of individual tensors. The structure of the individual tensors can impose global symmetries on the wave function (such as antisymmetry under exchange of fermions) or restrict the wave function to specific quantum numbers, like total charge, angular momentum, or spin. It is also possible to derive strict bounds on quantities like entanglement and correlation length using the mathematical structure of the tensor network. This has made tensor networks useful in theoretical studies of quantum information in many-body systems. They have also proved useful in variational studies of ground states, excited states, and dynamics of strongly correlated many-body systems. == Diagrammatic notation == In general, a tensor network diagram (Penrose diagram) can be viewed as a graph where nodes (or vertices) represent individual tensors, while edges represent summation over an index. Free indices are depicted as edges (or legs) attached to a single vertex only. Sometimes, there is also additional meaning to a node's shape. For instance, one can use trapezoids for unitary matrices or tensors with similar behaviour. This way, flipped trapezoids would be interpreted as complex conjugates to them. == History == Foundational research on tensor networks began in 1971 with a paper by Roger Penrose. In "Applications of negative dimensional tensors" Penrose developed tensor diagram notation, describing how the diagrammatic language of tensor networks could be used in applications in physics. In 1992, Steven R. White developed the density matrix renormalization group (DMRG) for quantum lattice systems. The DMRG was the first successful tensor network and associated algorithm. In 2002, Guifré Vidal and Reinhard Werner attempted to quantify entanglement, laying the groundwork for quantum resource theories. This was also the first description of the use of tensor networks as mathematical tools for describing quantum systems. In 2004, Frank Verstraete and Ignacio Cirac developed the theory of matrix product states, projected entangled pair states, and variational renormalization group methods for quantum spin systems. In 2006, Vidal developed the multi-scale entanglement renormalization ansatz (MERA). In 2007 he developed entanglement renormalization for quantum lattice systems. In 2010, Ulrich Schollwock developed the density-matrix renormalization group for the simulation of one-dimensional strongly correlated quantum lattice systems. In 2014, Román Orús introduced tensor networks for complex quantum systems and machine learning, as well as tensor network theories of symmetries, fermions, entanglement and holography. == Connection to machine learning == Tensor networks have been adapted for supervised learning, taking advantage of similar mathematical structure in variational studies in quantum mechanics and large-scale machine learning. This crossover has spurred collaboration between researchers in artificial intelligence and quantum information science. In June 2019, Google, the Perimeter Institute for Theoretical Physics, and X (company), released TensorNetwork, an open-source library for efficient tensor calculations. The main interest in tensor networks and their study from the perspective of machine learning is to reduce the number of trainable parameters (in a layer) by approximating a high-order tensor with a network of lower-order ones. Using the so-called tensor train technique (TT), one can reduce an N-order tensor (containing exponentially many trainable parameters) to a chain of N tensors of order 2 or 3, which gives us a polynomial number of parameters.

Bixonimania

Bixonimania is a fake disease invented by researchers to examine artificial intelligence and its ability to utilize information in medical and healthcare applications. The fake enabled researchers to show that some AI chatbots would report as fact fake research that to an expert would be obviously implausible. == Characteristics == The disorder, with symptoms of sore eyes and darkening around them ("periorbital hyperpigmentation"), is supposedly caused by blue light from screens. The experiment was conducted by a team from the University of Gothenburg led by Almira Osmanovic Thunström. Many steps were taken to ensure that any person who read the actual paper could tell it was not a real condition. The team chose an obviously inappropriate name ending in -mania, a description used only in psychiatry. The lead author was noted as belonging to Asteria Horizon University located in Nova City, California, neither of which exist. An acknowledgement was made to "Professor Maria Bohm at The Starfleet Academy for her kindness and generosity in contributing with her knowledge and her lab onboard the USS Enterprise". == Distribution == The name was first used in a blog posted on Medium titled "How many people suffer from Bixonimania?" A more scholarly-looking paper describing it was posted later in April 2024 on a preprint server with several fake authors. A second paper was posted in May. By 2026, AI chatbots suggested bixonimania based on the list of symptoms provided. Thunström and her team discovered that many LLMs processed the information and gave it as health advice. Microsoft Copilot declared that "Bixonimania is indeed an intriguing and relatively rare condition" while Gemini gave the information that "Bixonimania is a condition caused by excessive exposure to blue light". Three Indian researchers published a research paper that cited the preprint on the fake disease in Cureus, a peer-reviewed journal published by Springer-Nature. It was subsequently retracted. Following the revelations and a news article in Nature describing the experiment, several AI systems began to generate corrected output.

Apache ORC

Apache ORC (Optimized Row Columnar) is a free and open-source column-oriented data storage format. It is similar to the other columnar-storage file formats available in the Hadoop ecosystem such as RCFile and Parquet. It is used by most of the data processing frameworks Apache Spark, Apache Hive, Apache Flink, and Apache Hadoop. In February 2013, the Optimized Row Columnar (ORC) file format was announced by Hortonworks in collaboration with Facebook. A calendar month later, the Apache Parquet format was announced, developed by Cloudera and Twitter. Apache ORC format is widely supported including Amazon Web Services' Glue,Google Cloud Platform's BigQuery, and Pandas (software). == History ==

Blocks world

The blocks world is a planning domain in artificial intelligence. It consists of a set of wooden blocks of various shapes and colors sitting on a table. The goal is to build one or more vertical stacks of blocks. Only one block may be moved at a time: it may either be placed on the table or placed atop another block. Because of this, any blocks that are, at a given time, under another block cannot be moved. Moreover, some kinds of blocks cannot have other blocks stacked on top of them. The simplicity of this toy world lends itself readily to classical symbolic artificial intelligence approaches, in which the world is modeled as a set of abstract symbols which may be reasoned about. == Motivation == Artificial Intelligence can be researched in theory and with practical applications. The problem with most practical applications is that the engineers don't know how to program an AI system. Instead of rejecting the challenge at all the idea is to invent an easy to solve domain which is called a toy problem. Toy problems were invented with the aim to program an AI which can solve it. The blocks world domain is an example of a toy problem. Its major advantage over more realistic AI applications is that many algorithms and software programs are available which can handle the situation. This allows comparing different theories against each other. In its basic form, the blocks world problem consists of cubes of the same size which have all the color black. A mechanical robot arm has to pick and place the cubes. More complicated derivatives of the problem consist of cubes of different sizes, shapes and colors. From an algorithmic perspective, blocks world is an NP-hard search and planning problem. The task is to bring the system from an initial state into a goal state. Automated planning and scheduling problems are usually described in the Planning Domain Definition Language (PDDL) notation which is an AI planning language for symbolic manipulation tasks. If something was formulated in the PDDL notation, it is called a domain. Therefore, the task of stacking blocks is a blocks world domain which stands in contrast to other planning problems like the dock worker robot domain and the monkey and banana problem. == Theses/projects which took place in a blocks world == Terry Winograd's SHRDLU Patrick Winston's Learning Structural Descriptions from Examples and Copy Demo Gerald Jay Sussman's Sussman anomaly Decision problem (Gupta and Nau, 1992): Given a starting Blocks World, an ending Blocks World, and an integer L > 0, is there a way to move the blocks to change the starting position to the ending position with L or less steps? This decision problem is NP-hard.