ARIS Express

ARIS Express

ARIS Express is a free-of-charge modeling tool for business process analysis and management. It supports different modeling notations such as BPMN 2, Event-driven Process Chains (EPC), Organizational charts, process landscapes, whiteboards, etc. ARIS Express was initially developed by IDS Scheer, which was bought by Software AG in December 2010. The tool is provided as freeware on the ARIS Community webpage. ARIS Express is notable - having been mentioned in research published by Schumm, Garcia, Krumnow and Greenwood amongst others. == History == ARIS Express was first announced on April 28, 2009 in a press release by IDS Scheer. The first release was on July 28, 2009 in a public beta test on ARIS Community. Only people, who registered before for the beta test were allowed to download and test this beta version. This closed beta test was followed with another public beta test. The official release of ARIS Express 1.0 was on September 9, 2009. In this first stable version, features such as Microsoft Visio import were added, which were not present in the version for the public beta test. On February 26, 2010, ARIS Express 2.0 was released. Major changes compared to version 1.0 include BPMN 2 support, integrated spellchecking and ARISalign integration. On May 25, 2010, version 2.1 of ARIS Express was released. This update improves BPMN 2 support, provides a new online help system for instant feedback, better ARISalign integration and some new symbols in different diagrams. Along with the release, a poster showing the most important modeling concepts supported by ARIS Express was released. In addition, an executable setup is provided for Microsoft Windows-based systems. Beginning of July, an update was released as ARIS Express 2.2, providing bug fixes only. ARIS Express version 2.2 is the current stable release. An official press release published mid of August 2010 said there are more than 50,000 downloads of ARIS Express. On February 2, 2011, version 2.3 of ARIS Express was released. This new version changes the file format of ARIS Express so that models can be shown in an interactive model viewer in ARIS Community. The release announcement contained no details about additional features or changes. == Functionality == === Overview === ARIS Express is a standalone single-user application. It is divided in a home screen and a modeling environment. The home screen is used to create new models or open recently edited ones. The modeling environment is used to edit diagrams. === Supported notations === The following notations are supported by ARIS Express. Users can create diagrams containing an unlimited number of modeling objects. BPMN 2 Collaboration Diagrams Event-driven Process Chains (EPC) Organizational charts Process landscape (value-added chain diagram) Data model in ERM notation IT infrastructure (network diagram) System landscape (component diagram) Whiteboard General diagram === Noteworthy features === Besides common features such as creating new diagrams, saving them as files or adding objects to the modeling canvas, ARIS Express also provides some noteworthy features, which can't be found in most comparable modeling tools. fragments - Often used modeling constructs such as an exclusive decision in a process model can be stored as fragments so that they are available for direct reuse in another model. smart designs - The flow of a process model or hierarchies of other models can be captured in a spreadsheet-like interface. While entering the data in the spreadsheet, the model is generated and laid out in the background while typing. mini toolbar - While moving the mouse pointer over an object in a diagram, a small toolbar is shown allowing quick access to the most important modeling actions. Microsoft Visio import - Diagrams created with Microsoft Visio 2007 or above can be imported to and edited in ARIS Express. A Microsoft Visio export is not provided. ARISalign import - Models created on the online collaboration platform ARISalign can be opened and edited in ARIS Express. === Exports === ARIS Express can export diagrams to different formats such as: PDF JPEG PNG EMF ADF ADF is the file format of ARIS Express. The professional tools of ARIS Platform are able to import diagrams stored in the ADF format. Yet, there are major limitations during import - namely, each object in diagram will be treated as unique object, despite having same type and name, forcing redrawing large sections of diagrams after import. Besides export formats, it is also possible to use the clipboard to copy and paste an ARIS Express diagram into typical office suites such as Microsoft PowerPoint. == Technology == ARIS Express is a Java-based application, which shares some of the features of ARIS Platform products such as ARIS Business Architect and ARIS Business Designer. In contrast to ARIS Platform products, ARIS Express doesn't use a central database for model storage. Instead, each diagram is stored in an ADF file. ARIS Express uses Java Web Start. After download, the application can be started immediately without installation procedure. For Microsoft Windows based systems, an ordinary setup is provided, too. ARIS Express requires Java 1.6.10 or above. On first startup, the user must enter a valid ARIS Community account to register the application. Creating an ARIS Community account is free-of-charge. After installation, no Internet connection is needed to use ARIS Express. ARIS Express uses a mechanism provided by Java Web Start to automatically update the application as soon as a new version becomes available and the user is connected to the Internet during startup. There are reports that this automated update failed while upgrading from version 1.0 to version 2.0. As ARIS Express is based on Java Web Start, it can be installed on any platform supported by Java. The ARIS Community and other Internet sources have reports of successful deployment of ARIS Express on other operating systems than Microsoft Windows. However, ARIS Express is officially supported only on Microsoft Windows. == Miscellaneous == A quick reference sheet is available for ARIS Express. The poster shows all supported diagrams plus the most important modelling concepts for each supported modelling language. ARIS Express contains a hidden game, a so-called Easter Egg. The game can be started by clicking several times on the product logo in the about dialog. Highscores achieved in the game can be submitted to a special page in ARIS Community. A Firefox Personas is available for ARIS Express.

Tensor (machine learning)

In machine learning, the term tensor informally refers to two different concepts: (i) a way of organizing data and (ii) a multilinear (tensor) transformation. Data may be organized in a multidimensional array (M-way array), informally referred to as a "data tensor"; however, in the strict mathematical sense, a tensor is a multilinear mapping over a set of domain vector spaces to a range vector space. Observations, such as images, movies, volumes, sounds, and relationships among words and concepts, stored in an M-way array ("data tensor"), may be analyzed either by artificial neural networks or tensor methods. Tensor decomposition factors data tensors into smaller tensors. Operations on data tensors can be expressed in terms of matrix multiplication and the Kronecker product. The computation of gradients, a crucial aspect of backpropagation, can be performed using software libraries such as PyTorch and TensorFlow. Computations are often performed on graphics processing units (GPUs) using CUDA, and on dedicated hardware such as Google's Tensor Processing Unit or Nvidia's Tensor core. These developments have greatly accelerated neural network architectures, and increased the size and complexity of models that can be trained. == History == A tensor is by definition a multilinear map. In mathematics, this may express a multilinear relationship between sets of algebraic objects. In physics, tensor fields, considered as tensors at each point in space, are useful in expressing mechanics such as stress or elasticity. In machine learning, the exact use of tensors depends on the statistical approach being used. In 2001, the field of signal processing and statistics were making use of tensor methods. Pierre Comon surveys the early adoption of tensor methods in the fields of telecommunications, radio surveillance, chemometrics and sensor processing. Linear tensor rank methods (such as, Parafac/CANDECOMP) analyzed M-way arrays ("data tensors") composed of higher order statistics that were employed in blind source separation problems to compute a linear model of the data. He noted several early limitations in determining the tensor rank and efficient tensor rank decomposition. In the early 2000s, multilinear tensor methods crossed over into computer vision, computer graphics and machine learning with papers by Vasilescu or in collaboration with Terzopoulos, such as Human Motion Signatures, TensorFaces TensorTextures and Multilinear Projection. Multilinear algebra, the algebra of higher-order tensors, is a suitable and transparent framework for analyzing the multifactor structure of an ensemble of observations and for addressing the difficult problem of disentangling the causal factors based on second order or higher order statistics associated with each causal factor. Tensor (multilinear) factor analysis disentangles and reduces the influence of different causal factors with multilinear subspace learning. When treating an image or a video as a 2- or 3-way array, i.e., "data matrix/tensor", tensor methods reduce spatial or time redundancies as demonstrated by Wang and Ahuja. Yoshua Bengio, Geoff Hinton and their collaborators briefly discuss the relationship between deep neural networks and tensor factor analysis beyond the use of M-way arrays ("data tensors") as inputs. One of the early uses of tensors for neural networks appeared in natural language processing. A single word can be expressed as a vector via Word2vec. Thus a relationship between two words can be encoded in a matrix. However, for more complex relationships such as subject-object-verb, it is necessary to build higher-dimensional networks. In 2009, the work of Sutskever introduced Bayesian Clustered Tensor Factorization to model relational concepts while reducing the parameter space. From 2014 to 2015, tensor methods become more common in convolutional neural networks (CNNs). Tensor methods organize neural network weights in a "data tensor", analyze and reduce the number of neural network weights. Lebedev et al. accelerated CNN networks for character classification (the recognition of letters and digits in images) by using 4D kernel tensors. == Definition == Let F {\displaystyle \mathbb {F} } be a field (such as the real numbers R {\displaystyle \mathbb {R} } or the complex numbers C {\displaystyle \mathbb {C} } ). A tensor T ∈ F I 1 × I 2 × … × I C {\displaystyle {\mathcal {T}}\in {\mathbb {F} }^{I_{1}\times I_{2}\times \ldots \times I_{C}}} is a multilinear transformation from a set of domain vector spaces to a range vector space: T : { F I 1 × F I 2 × … F I C } ↦ F I 0 {\displaystyle {\mathcal {T}}:\{{\mathbb {F} }^{I_{1}}\times {\mathbb {F} }^{I_{2}}\times \ldots {\mathbb {F} }^{I_{C}}\}\mapsto {\mathbb {F} }^{I_{0}}} Here, C {\displaystyle C} and I 0 , I 1 , … , I C {\displaystyle I_{0},I_{1},\ldots ,I_{C}} are positive integers, and ( C + 1 ) {\displaystyle (C+1)} is the number of modes of a tensor (also known as the number of ways of a multi-way array). The dimensionality of mode c {\displaystyle c} is I c {\displaystyle I_{c}} , for 0 ≤ c ≤ C {\displaystyle 0\leq c\leq C} . In statistics and machine learning, an image is vectorized when viewed as a single observation, and a collection of vectorized images is organized as a "data tensor". For example, a set of facial images { d i p , i e , i l , i v ∈ R I X } {\displaystyle \{{\mathbb {d} }_{i_{p},i_{e},i_{l},i_{v}}\in {\mathbb {R} }^{I_{X}}\}} with I X {\displaystyle I_{X}} pixels that are the consequences of multiple causal factors, such as a facial geometry i p ( 1 ≤ i p ≤ I P ) {\displaystyle i_{p}(1\leq i_{p}\leq I_{P})} , an expression i e ( 1 ≤ i e ≤ I E ) {\displaystyle i_{e}(1\leq i_{e}\leq I_{E})} , an illumination condition i l ( 1 ≤ i l ≤ I L ) {\displaystyle i_{l}(1\leq i_{l}\leq I_{L})} , and a viewing condition i v ( 1 ≤ i v ≤ I V ) {\displaystyle i_{v}(1\leq i_{v}\leq I_{V})} may be organized into a data tensor (ie. multiway array) D ∈ R I X × I P × I E × I L × V {\displaystyle {\mathcal {D}}\in {\mathbb {R} }^{I_{X}\times I_{P}\times I_{E}\times I_{L}\times V}} where I P {\displaystyle I_{P}} are the total number of facial geometries, I E {\displaystyle I_{E}} are the total number of expressions, I L {\displaystyle I_{L}} are the total number of illumination conditions, and I V {\displaystyle I_{V}} are the total number of viewing conditions. Tensor factorizations methods such as TensorFaces and multilinear (tensor) independent component analysis factorizes the data tensor into a set of vector spaces that span the causal factor representations, where an image is the result of tensor transformation T {\displaystyle {\mathcal {T}}} that maps a set of causal factor representations to the pixel space. Another approach to using tensors in machine learning is to embed various data types directly. For example, a grayscale image, commonly represented as a discrete 2-way array D ∈ R I R X × I C X {\displaystyle {\mathbf {D} }\in {\mathbb {R} }^{I_{RX}\times I_{CX}}} with dimensionality I R X × I C X {\displaystyle I_{RX}\times I_{CX}} where I R X {\displaystyle I_{RX}} are the number of rows and I C X {\displaystyle I_{CX}} are the number of columns. When an image is treated as 2-way array or 2nd order tensor (i.e. as a collection of column/row observations), tensor factorization methods compute the image column space, the image row space and the normalized PCA coefficients or the ICA coefficients. Similarly, a color image with RGB channels, D ∈ R N × M × 3 . {\displaystyle {\mathcal {D}}\in \mathbb {R} ^{N\times M\times 3}.} may be viewed as a 3rd order data tensor or 3-way array.-------- In natural language processing, a word might be expressed as a vector v {\displaystyle v} via the Word2vec algorithm. Thus v {\displaystyle v} becomes a mode-1 tensor v ↦ A ∈ R N . {\displaystyle v\mapsto {\mathcal {A}}\in \mathbb {R} ^{N}.} The embedding of subject-object-verb semantics requires embedding relationships among three words. Because a word is itself a vector, subject-object-verb semantics could be expressed using mode-3 tensors v a × v b × v c ↦ A ∈ R N × N × N . {\displaystyle v_{a}\times v_{b}\times v_{c}\mapsto {\mathcal {A}}\in \mathbb {R} ^{N\times N\times N}.} In practice the neural network designer is primarily concerned with the specification of embeddings, the connection of tensor layers, and the operations performed on them in a network. Modern machine learning frameworks manage the optimization, tensor factorization and backpropagation automatically. === As unit values === Tensors may be used as the unit values of neural networks which extend the concept of scalar, vector and matrix values to multiple dimensions. The output value of single layer unit y m {\displaystyle y_{m}} is the sum-product of its input units and the connection weights filtered through the activation function f {\displaystyle f} : y m = f ( ∑ n x n u m , n ) , {\displaystyle y_{m}=f\left(\sum _{n}x_{n}u_{m,n}\right),} where y m ∈ R .

Torus interconnect

A torus interconnect is a switch-less network topology for connecting processing nodes in a parallel computer system. == Introduction == In geometry, a torus is created by revolving a circle about an axis coplanar to the circle. While this is a general definition in geometry, the topological properties of this type of shape describes the network topology in its essence. === Geometry illustration === In the representations below, the first is a one dimension torus, a simple circle. The second is a two dimension torus, in the shape of a 'doughnut'. The animation illustrates how a two dimension torus is generated from a rectangle by connecting its two pairs of opposite edges. At one dimension, a torus topology is equivalent to a ring interconnect network, in the shape of a circle. At two dimensions, it becomes equivalent to a two dimension mesh, but with extra connection at the edge nodes. === Torus network topology === A torus interconnect is a switch-less topology that can be seen as a mesh interconnect with nodes arranged in a rectilinear array of N = 2, 3, or more dimensions, with processors connected to their nearest neighbors, and corresponding processors on opposite edges of the array connected.[1] In this lattice, each node has 2N connections. This topology is named for the lattice formed in this way, which is topologically homogeneous to an N-dimensional torus. == Visualization == The first 3 dimensions of torus network topology are easier to visualize and are described below: 1D Torus: one dimension, n nodes are connected in closed loop with each node connected to its two nearest neighbors. Communication can take place in two directions, +x and −x. A 1D Torus is the same as ring interconnection. 2D Torus: two dimensions with degree of four, the nodes are imagined laid out in a two-dimensional rectangular lattice of n rows and n columns, with each node connected to its four nearest neighbors, and corresponding nodes on opposite edges connected. Communication can take place in four directions, +x, −x, +y, and −y. The total nodes of a 2D Torus is n2. 3D Torus: three dimensions, the nodes are imagined in a three-dimensional lattice in the shape of a rectangular prism, with each node connected with its six neighbors, with corresponding nodes on opposing faces of the array connected. Each edge consists of n nodes. communication can take place in six directions, +x, −x, +y, −y, +z, −z. Each edge of a 3D Torus consist of n nodes. The total nodes of 3D Torus is n3. ND Torus: N dimensions, each node of an N dimension torus has 2N neighbors, Communication can take place in 2N directions. Each edge consists of n nodes. Total nodes of this torus is nN. The main motivation of having higher dimension of torus is to achieve higher bandwidth, lower latency, and higher scalability. Higher-dimensional arrays are difficult to visualize. The above ruleset shows that each higher dimension adds another pair of nearest neighbor connections to each node. == Performance == A number of supercomputers on the TOP500 list use three-dimensional torus networks, e.g. IBM's Blue Gene/L and Blue Gene/P, and the Cray XT3. IBM's Blue Gene/Q uses a five-dimensional torus network. Fujitsu's K computer and the PRIMEHPC FX10 use a proprietary three-dimensional torus 3D mesh interconnect called Tofu. === 3D Torus performance simulation === Sandeep Palur and Dr. Ioan Raicu from Illinois Institute of Technology conducted experiments to simulate 3D torus performance. Their experiments ran on a computer with 250GB RAM, 48 cores and x86_64 architecture. The simulator they used was ROSS (Rensselaer’s Optimistic Simulation System). They mainly focused on three aspects: Varying network size Varying number of servers Varying message size They concluded that throughput decreases with the increase of servers and network size. Otherwise, throughput increases with the increase of message size. === 6D Torus product performance === Fujitsu Limited developed a 6D torus computer model called "Tofu". In their model, a 6D torus can achieve 100 GB/s off-chip bandwidth, 12 times higher scalability than a 3D torus, and high fault tolerance. The model is used in the K computer and Fugaku. === Cost === While long wrap-around links may be the easiest way to visualize the connection topology, in practice, restrictions on cable lengths often make long wrap-around links impractical. Instead, directly connected nodes—including nodes that the above visualization places on opposite edges of a grid, connected by a long wrap-around link—are physically placed nearly adjacent to each other in a folded torus network. Every link in the folded torus network is very short—almost as short as the nearest-neighbor links in a simple grid interconnect—and therefore low-latency.

HTTP Strict Transport Security

HTTP Strict Transport Security (HSTS) is a policy mechanism that helps to protect websites against man-in-the-middle attacks such as protocol downgrade attacks and cookie hijacking. It allows web servers to declare that web browsers (or other complying user agents) should automatically interact with it using only HTTPS connections, which provide Transport Layer Security (TLS/SSL), unlike the insecure HTTP used alone. HSTS is an IETF standards track protocol and is specified in RFC 6797. The HSTS Policy is communicated by the server to the user agent via an HTTP response header field named Strict-Transport-Security. HSTS Policy specifies a period of time during which the user agent should only access the server in a secure fashion. Websites using HSTS often do not accept clear text HTTP, either by rejecting connections over HTTP or systematically redirecting users to HTTPS (though this is not required by the specification). The consequence of this is that a user-agent not capable of doing TLS will not be able to connect to the site. The protection normally only applies after a user has visited the site at least once, relying on the principle of "trust on first use". The way this protection works is that when a user entering or selecting an HTTP (not HTTPS) URL to the site, the client, such as a Web browser, will automatically upgrade to HTTPS without making an HTTP request, thereby preventing any HTTP man-in-the-middle attack from occurring. To counteract this problem, an HSTS preload list maintained by Google Chrome and used by other major web browsers is maintained. If a domain is on this list, the browser skips the initial request and encrypts all communication immediately. Additional domains can be registered at no cost. == Specification history == The HSTS specification was published as RFC 6797 on 19 November 2012 after being approved on 2 October 2012 by the IESG for publication as a Proposed Standard RFC. The authors originally submitted it as an Internet Draft on 17 June 2010. With the conversion to an Internet Draft, the specification name was altered from "Strict Transport Security" (STS) to "HTTP Strict Transport Security", because the specification applies only to HTTP. The HTTP response header field defined in the HSTS specification however remains named "Strict-Transport-Security". The last so-called "community version" of the then-named "STS" specification was published on 18 December 2009, with revisions based on community feedback. The original draft specification by Jeff Hodges from PayPal, Collin Jackson, and Adam Barth was published on 18 September 2009. The HSTS specification is based on original work by Jackson and Barth as described in their paper "ForceHTTPS: Protecting High-Security Web Sites from Network Attacks". Additionally, HSTS is the realization of one facet of an overall vision for improving web security, put forward by Jeff Hodges and Andy Steingruebl in their 2010 paper The Need for Coherent Web Security Policy Framework(s). == HSTS mechanism overview == A server implements an HSTS policy by supplying a header over an HTTPS connection (HSTS headers over HTTP are ignored). For example, a server could send a header such that future requests to the domain for the next year (max-age is specified in seconds; 31,536,000 is equal to one non-leap year) use only HTTPS: Strict-Transport-Security: max-age=31536000. When a web application issues HSTS Policy to user agents, conformant user agents behave as follows: Automatically turn any insecure links referencing the web application into secure links (e.g. http://example.com/some/page/ will be modified to https://example.com/some/page/ before accessing the server). If the security of the connection cannot be ensured (e.g. the server's TLS certificate is not trusted), the user agent must terminate the connection and should not allow the user to access the web application. This helps protect web application users against some passive (eavesdropping) and active network attacks. A man-in-the-middle attacker has a greatly reduced ability to intercept requests and responses between a user and a web application server while the user's browser has HSTS Policy in effect for that web application. == Applicability == The most important security vulnerability that HSTS can fix is SSL-stripping man-in-the-middle attacks, first publicly introduced by Moxie Marlinspike in his 2009 BlackHat Federal talk "New Tricks For Defeating SSL In Practice". The SSL (and TLS) stripping attack works by transparently converting a secure HTTPS connection into a plain HTTP connection. The user can see that the connection is insecure, but crucially there is no way of knowing whether the connection should be secure. At the time of Marlinspike's talk, many websites did not use TLS/SSL, therefore there was no way of knowing (without prior knowledge) whether the use of plain HTTP was due to an attack, or simply because the website had not implemented TLS/SSL. Additionally, no warnings are presented to the user during the downgrade process, making the attack fairly subtle to all but the most vigilant. Marlinspike's sslstrip tool, presented at Black Hat DC 2009, fully automates the attack. HSTS addresses this problem by informing the browser that connections to the site should always use TLS/SSL. The HSTS header can be stripped by the attacker if this is the user's first visit. Google Chrome, Mozilla Firefox, Internet Explorer, and Microsoft Edge attempt to limit this problem by including a "pre-loaded" list of HSTS sites. Unfortunately this solution cannot scale to include all websites on the internet. See limitations, below. HSTS can also help to prevent having one's cookie-based website login credentials stolen by widely available tools such as Firesheep. Because HSTS is time limited, it is sensitive to attacks involving shifting the victim's computer time e.g. using false NTP packets. == Limitations == The initial request remains unprotected from active attacks if it uses an insecure protocol such as plain HTTP or if the URI for the initial request was obtained over an insecure channel. The same applies to the first request after the activity period specified in the advertised HSTS Policy max-age (sites should set a period of several days or months depending on user activity and behavior). === Solutions with preload list === Google Chrome, Mozilla Firefox, and Internet Explorer/Microsoft Edge address this limitation by implementing a "HSTS preloaded list", which is a list that contains known sites supporting HSTS. This list is distributed with the browser so that it uses HTTPS for the initial request to the listed sites as well. As previously mentioned, these pre-loaded lists cannot scale to cover the entire Web. A potential solution might be achieved by using DNS records to declare HSTS Policy, and accessing them securely via DNSSEC, optionally with certificate fingerprints to ensure validity (which requires running a validating resolver to avoid last mile issues). Junade Ali has noted that HSTS is ineffective against the use of false domains; by using DNS-based attacks, it is possible for a man-in-the-middle interceptor to serve traffic from an artificial domain which is not on the HSTS Preload list, this can be made possible by DNS Spoofing Attacks, or simply a domain name that misleadingly resembles the real domain name such as www.example.org instead of www.example.com. Even with an HSTS preloaded list, HSTS cannot prevent advanced attacks against TLS itself, such as the BEAST or CRIME attacks introduced by Juliano Rizzo and Thai Duong. Attacks against TLS itself are orthogonal to HSTS policy enforcement. Neither can it protect against attacks on the server - if someone compromises it, it will happily serve any content over TLS. === Privacy issues === HSTS can be used to near-indelibly tag visiting browsers with recoverable identifying data (supercookies) which can persist in and out of browser "incognito" privacy modes. By creating a web page that makes multiple HTTP requests to selected domains, for example, if twenty browser requests to twenty different domains are used, theoretically over one million visitors can be distinguished (220) due to the resulting requests arriving via HTTP vs. HTTPS; the latter being the previously recorded binary "bits" established earlier via HSTS headers. == Browser support == Chromium and Google Chrome since version 4.0.211.0 Firefox since version 4; with Firefox 17, Mozilla integrates a list of websites supporting HSTS. Opera since version 12 Safari since OS X Mavericks (version 10.9, late 2013) Internet Explorer 11 on Windows 8.1 and Windows 7 with KB3058515 installed (Released as a Windows Update in June 2015) Microsoft Edge and Internet Explorer 11 on Windows 10 BlackBerry 10 Browser and WebView since BlackBerry OS 10.3.3. == Deployment best practices == Depending on the actual deployment there are certain threats (e.g. cookie injection attacks) t

Yao's test

In cryptography and the theory of computation, Yao's test is a test defined by Andrew Chi-Chih Yao in 1982, against pseudo-random sequences. A sequence of words passes Yao's test if an attacker with reasonable computational power cannot distinguish it from a sequence generated uniformly at random. == Formal statement == === Boolean circuits === Let P {\displaystyle P} be a polynomial, and S = { S k } k {\displaystyle S=\{S_{k}\}_{k}} be a collection of sets S k {\displaystyle S_{k}} of P ( k ) {\displaystyle P(k)} -bit long sequences, and for each k {\displaystyle k} , let μ k {\displaystyle \mu _{k}} be a probability distribution on S k {\displaystyle S_{k}} , and P C {\displaystyle P_{C}} be a polynomial. A predicting collection C = { C k } {\displaystyle C=\{C_{k}\}} is a collection of boolean circuits of size less than P C ( k ) {\displaystyle P_{C}(k)} . Let p k , S C {\displaystyle p_{k,S}^{C}} be the probability that on input s {\displaystyle s} , a string randomly selected in S k {\displaystyle S_{k}} with probability μ ( s ) {\displaystyle \mu (s)} , C k ( s ) = 1 {\displaystyle C_{k}(s)=1} , i.e. Moreover, let p k , U C {\displaystyle p_{k,U}^{C}} be the probability that C k ( s ) = 1 {\displaystyle C_{k}(s)=1} on input s {\displaystyle s} a P ( k ) {\displaystyle P(k)} -bit long sequence selected uniformly at random in { 0 , 1 } P ( k ) {\displaystyle \{0,1\}^{P(k)}} . We say that S {\displaystyle S} passes Yao's test if for all predicting collection C {\displaystyle C} , for all but finitely many k {\displaystyle k} , for all polynomial Q {\displaystyle Q} : === Probabilistic formulation === As in the case of the next-bit test, the predicting collection used in the above definition can be replaced by a probabilistic Turing machine, working in polynomial time. This also yields a strictly stronger definition of Yao's test (see Adleman's theorem). Indeed, one could decide undecidable properties of the pseudo-random sequence with the non-uniform circuits described above, whereas BPP machines can always be simulated by exponential-time deterministic Turing machines.

Protocol engineering

Protocol engineering is the application of systematic methods to the development of communication protocols. It uses many of the principles of software engineering, but it is specific to the development of distributed systems. == History == When the first experimental and commercial computer networks were developed in the 1970s, the concept of protocols was not yet well developed. These were the first distributed systems. In the context of the newly adopted layered protocol architecture (see OSI model), the definition of the protocol of a specific layer should be such that any entity implementing that specification in one computer would be compatible with any other computer containing an entity implementing the same specification, and their interactions should be such that the desired communication service would be obtained. On the other hand, the protocol specification should be abstract enough to allow different choices for the implementation on different computers. It was recognized that a precise specification of the expected service provided by the given layer was important. It is important for the verification of the protocol, which should demonstrate that the communication service is provided if both protocol entities implement the protocol specification correctly. This principle was later followed during the standardization of the OSI protocol stack, in particular for the transport layer. It was also recognized that some kind of formalized protocol specification would be useful for the verification of the protocol and for developing implementations, as well as test cases for checking the conformance of an implementation against the specification. While initially mainly finite-state machine were used as (simplified) models of a protocol entity, in the 1980s three formal specification languages were standardized, two by ISO and one by ITU. The latter, called SDL, was later used in industry and has been merged with UML state machines. == Principles == The following are the most important principles for the development of protocols: Layered architecture: A protocol layer at the level n consists of two (or more) entities that have a service interface through which the service of the layer is provided to the users of the protocol, and which uses the service provided by a local entity of level (n-1). The service specification of a layer describes, in an abstract and global view, the behavior of the layer as visible at the service interfaces of the layer. The protocol specification defines the requirements that should be satisfied by each entity implementation. Protocol verification consists of showing that two (or more) entities satisfying the protocol specification will provide at their service interfaces the specified service of that layer. The (verified) protocol specification is used mainly for the following two activities: The development of an entity implementation. Note that the abstract properties of the service interface are defined by the service specification (and also used by the protocol specification), but the detailed nature of the interface can be chosen during the implementation process, separately for each entity. Test suite development for conformance testing. Protocol conformance testing checks that a given entity implementation conforms to the protocol specification. The conformance test cases are developed based on the protocol specification and are applicable to all entity implementations. Therefore standard conformance test suites have been developed for certain protocol standards. == Methods and tools == Tools for the activities of protocol verification, entity implementation and test suite development can be developed when the protocol specification is written in a formalized language which can be understood by the tool. As mentioned, formal specification languages have been proposed for protocol specification, and the first methods and tools where based on finite-state machine models. Reachability analysis was proposed to understand all possible behaviors of a distributed system, which is essential for protocol verification. This was later complemented with model checking. However, finite-state descriptions are not powerful enough to describe constraints between message parameters and the local variables in the entities. Such constraints can be described by the standardized formal specification languages mentioned above, for which powerful tools have been developed. It is in the field of protocol engineering that model-based development was used very early. These methods and tools have later been used for software engineering as well as hardware design, especially for distributed and real-time systems. On the other hand, many methods and tools developed in the more general context of software engineering can also be used of the development of protocols, for instance model checking for protocol verification, and agile methods for entity implementations. == Constructive methods for protocol design == Most protocols are designed by human intuition and discussions during the standardization process. However, some methods have been proposed for using constructive methods possibly supported by tools to automatically derive protocols that satisfy certain properties. The following are a few examples: Semi-automatic protocol synthesis: The user defines all message sending actions of the entities, and the tool derives all necessary reception actions (even if several messages are in transit). Synchronizing protocol: The state transitions of one protocol entity are given by the user, and the method derives the behavior of the other entity such that it remains in states that correspond to the former entity. Protocol derived from service specification: The service specification is given by the user and the method derives a suitable protocol for all entities. Protocol for control applications: The specification of one entity (called the plant - which must be controlled) is given, and the method derives a specification of the other entity such that certain fail states of the plant are never reached and certain given properties of the plant's service interactions are satisfied. This is a case of supervisory control. == Books == Ming T. Liu, Protocol Engineering, Advances in Computers, Volume 29, 1989, Pages 79–195. G.J. Holzmann, Design and Validation of Computer Protocols, Prentice Hall, 1991. H. König, Protocol Engineering, Springer, 2012. M. Popovic, Communication Protocol Engineering, CRC Press, 2nd Ed. 2018. P. Venkataram, S.S. Manvi, B.S. Babu, Communication Protocol Engineering, 2014.

Code (cryptography)

In cryptology, a code is a method used to encrypt a message that operates at the level of meaning; that is, words or phrases are converted into something else. A code might transform "change" into "CVGDK" or "cocktail lounge". The U.S. National Security Agency defined a code as "A substitution cryptosystem in which the plaintext elements are primarily words, phrases, or sentences, and the code equivalents (called "code groups") typically consist of letters or digits (or both) in otherwise meaningless combinations of identical length." A codebook is needed to encrypt, and decrypt the phrases or words. By contrast, ciphers encrypt messages at the level of individual letters, or small groups of letters, or even, in modern ciphers, individual bits. Messages can be transformed first by a code, and then by a cipher. Such multiple encryption, or "superencryption" aims to make cryptanalysis more difficult. Another comparison between codes and ciphers is that a code typically represents a letter or groups of letters directly without the use of mathematics. As such the numbers are configured to represent these three values: 1001 = A, 1002 = B, 1003 = C, ... . The resulting message, then would be 1001 1002 1003 to communicate ABC. Ciphers, however, utilize a mathematical formula to represent letters or groups of letters. For example, A = 1, B = 2, C = 3, ... . Thus the message ABC results by multiplying each letter's value by 13. The message ABC, then would be 13 26 39. Codes have a variety of drawbacks, including susceptibility to cryptanalysis and the difficulty of managing the cumbersome codebooks, so ciphers are now the dominant technique in modern cryptography. In contrast, because codes are representational, they are not susceptible to mathematical analysis of the individual codebook elements. In the example, the message 13 26 39 can be cracked by dividing each number by 13 and then ranking them alphabetically. However, the focus of codebook cryptanalysis is the comparative frequency of the individual code elements matching the same frequency of letters within the plaintext messages using frequency analysis. In the above example, the code group, 1001, 1002, 1003, might occur more than once and that frequency might match the number of times that ABC occurs in plain text messages. (In the past, or in non-technical contexts, code and cipher are often used to refer to any form of encryption). == One- and two-part codes == Codes are defined by "codebooks" (physical or notional), which are dictionaries of codegroups listed with their corresponding plaintext. Codes originally had the codegroups assigned in 'plaintext order' for convenience of the code designed, or the encoder. For example, in a code using numeric code groups, a plaintext word starting with "a" would have a low-value group, while one starting with "z" would have a high-value group. The same codebook could be used to "encode" a plaintext message into a coded message or "codetext", and "decode" a codetext back into plaintext message. In order to make life more difficult for codebreakers, codemakers designed codes with no predictable relationship between the codegroups and the ordering of the matching plaintext. In practice, this meant that two codebooks were now required, one to find codegroups for encoding, the other to look up codegroups to find plaintext for decoding. Such "two-part" codes required more effort to develop, and twice as much effort to distribute (and discard safely when replaced), but they were harder to break. The Zimmermann Telegram in January 1917 used the German diplomatic "0075" two-part code system which contained upwards of 10,000 phrases and individual words. == One-time code == A one-time code is a prearranged word, phrase or symbol that is intended to be used only once to convey a simple message, often the signal to execute or abort some plan or confirm that it has succeeded or failed. One-time codes are often designed to be included in what would appear to be an innocent conversation. Done properly they are almost impossible to detect, though a trained analyst monitoring the communications of someone who has already aroused suspicion might be able to recognize a comment like "Aunt Bertha has gone into labor" as having an ominous meaning. Famous example of one time codes include: In the Bible, Jonathan prearranges a code with David, who is going into hiding from Jonathan's father, King Saul. If, during archery practice, Jonathan tells the servant retrieving arrows "the arrows are on this side of you," David may safely return to court; if the command is "the arrows are beyond you," David must flee. "One if by land; two if by sea" in "Paul Revere's Ride" made famous in the poem by Henry Wadsworth Longfellow "Climb Mount Niitaka" - the signal to Japanese planes to begin the attack on Pearl Harbor During World War II the British Broadcasting Corporation's overseas service frequently included "personal messages" as part of its regular broadcast schedule. The seemingly nonsensical stream of messages read out by announcers were actually one time codes intended for Special Operations Executive (SOE) agents operating behind enemy lines. An example might be "The princess wears red shoes" or "Mimi's cat is asleep under the table". Each code message was read out twice. By such means, the French Resistance were instructed to start sabotaging rail and other transport links the night before D-day. "Over all of Spain, the sky is clear" was a signal (broadcast on radio) to start the nationalist military revolt in Spain on July 17, 1936. Sometimes messages are not prearranged and rely on shared knowledge hopefully known only to the recipients. An example is the telegram sent to U.S. President Harry Truman, then at the Potsdam Conference to meet with Soviet premier Joseph Stalin, informing Truman of the first successful test of an atomic bomb. "Operated on this morning. Diagnosis not yet complete but results seem satisfactory and already exceed expectations. Local press release necessary as interest extends great distance. Dr. Groves pleased. He returns tomorrow. I will keep you posted." == Idiot code == An idiot code is a code that is created by the parties using it. This type of communication is akin to the hand signals used by armies in the field. Example: Any sentence where 'day' and 'night' are used means 'attack'. The location mentioned in the following sentence specifies the location to be attacked. Plaintext: Attack X. Codetext: We walked day and night through the streets but couldn't find it! Tomorrow we'll head into X. An early use of the term appears to be by George Perrault, a character in the science fiction book Friday by Robert A. Heinlein: The simplest sort [of code] and thereby impossible to break. The first ad told the person or persons concerned to carry out number seven or expect number seven or it said something about something designated as seven. This one says the same with respect to code item number ten. But the meaning of the numbers cannot be deduced through statistical analysis because the code can be changed long before a useful statistical universe can be reached. It's an idiot code... and an idiot code can never be broken if the user has the good sense not to go too often to the well. Terrorism expert Magnus Ranstorp said that the men who carried out the September 11 attacks on the United States used basic e-mail and what he calls "idiot code" to discuss their plans. == Cryptanalysis of codes == While solving a monoalphabetic substitution cipher is easy, solving even a simple code is difficult. Decrypting a coded message is a little like trying to translate a document written in a foreign language, with the task basically amounting to building up a "dictionary" of the codegroups and the plaintext words they represent. One fingerhold on a simple code is the fact that some words are more common than others, such as "the" or "a" in English. In telegraphic messages, the codegroup for "STOP" (i.e., end of sentence or paragraph) is usually very common. This helps define the structure of the message in terms of sentences, if not their meaning, and this is cryptanalytically useful. Further progress can be made against a code by collecting many codetexts encrypted with the same code and then using information from other sources spies newspapers diplomatic cocktail party chat the location from where a message was sent where it was being sent to (i.e., traffic analysis) the time the message was sent, events occurring before and after the message was sent the normal habits of the people sending the coded messages etc. For example, a particular codegroup found almost exclusively in messages from a particular army and nowhere else might very well indicate the commander of that army. A codegroup that appears in messages preceding an attack on a particular location may very well stand for that location. Cribs can be an immediate giveaway to the definiti