AI For Economics Students

AI For Economics Students — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • MoltenVK

    MoltenVK

    MoltenVK is a software library which allows Vulkan applications to run on top of Metal on Apple's macOS, iOS, and tvOS operating systems. It is the first software component to be released for the Vulkan Portability Initiative, a project to have a subset of Vulkan run on platforms lacking native Vulkan drivers. There are some limitations compared with a native Vulkan implementation. == History == MoltenVK was first released as a proprietary and commercially licensed product by The Brenwill Workshop on July 27, 2016. On July 31, 2017, Khronos announced the formation of the Vulkan Portability Technical Subgroup. === Open source === On February 26, 2018, Khronos announced that Vulkan became available on macOS and iOS products through the MoltenVK library. Valve announced that Dota 2 will run on macOS using the Vulkan API with the aid of MoltenVK, and that they had made an arrangement with developer The Brenwill Workshop Ltd to release MoltenVK as open-source software under the Apache License version 2.0. On May 30, 2018, Qt was updated with Vulkan for Qt on macOS using MoltenVK. On May 31, 2018, optional Vulkan support for Dota 2 on macOS was released. Benchmarks for the game were available the following day, showing better performance using Vulkan and MoltenVK compared to OpenGL. On July 20, 2018, Wine was updated with Vulkan support on macOS using MoltenVK. On 29 July 2018, the first app using MoltenVK was accepted onto the App Store, after initially being rejected. On 6 August 2018, Google open-sourced Filament, a crossplatform real-time physically based rendering engine with MoltenVK for macOS/iOS. On November 28, 2018, Valve released Artifact, their first Vulkan-only game on macOS using MoltenVK. === Version 1.0 === On 29 January 2019, MoltenVK 1.0.32 was released with early prototype of Vulkan Portability Extensions. RPCS3 and Dolphin emulators were updated with Vulkan support on macOS using MoltenVK. On 13 April 2019, MoltenVK 1.0.34 was released with support for tessellation. On July 30, 2019, MoltenVK 1.0.36 was released targeting Metal 3.0. On July 31, 2020, MoltenVK 1.0.44 was released, adding support for the tvOS platform. On January 23, 2020, MoltenVK was updated to support for some of the new features of Vulkan 1.2, as of Vulkan SDK 1.2.121. === Version 1.1 === On October 1, 2020, MoltenVK 1.1.0 was released, adding full support for Vulkan 1.1, as of Vulkan SDK 1.2.154. On 9 December 2020, MoltenVK 1.1.1 was released, providing support for Vulkan on Apple silicon GPUs and support for the Mac Catalyst platform for porting iOS/iPadOS apps to macOS. === Version 1.2 === On October 18, 2022, MoltenVK 1.2.0 was released, adding full support for Vulkan 1.2 as of Vulkan SDK 1.3.231. In January 2023, MoltenVK 1.2.2 added support for Vulkan as of SDK 1.3.239, while this version of Vulkan SDK fixed some issues with the interconnectivity with Metal API, while version 1.2.3 supported some additional extensions. === Version 1.3 === On May 1, 2025, MoltenVK 1.3 was released with support for Vulkan 1.3. === Version 1.4 === On August 20, 2025, MoltenVK 1.4 was released with support for Vulkan 1.4.

    Read more →
  • Quadratic unconstrained binary optimization

    Quadratic unconstrained binary optimization

    Quadratic unconstrained binary optimization (QUBO), also known as unconstrained binary quadratic programming (UBQP), is a combinatorial optimization problem with a wide range of applications from finance and economics to machine learning. QUBO is an NP hard problem, and for many classical problems from theoretical computer science, like maximum cut, graph coloring and the partition problem, embeddings into QUBO have been formulated. Embeddings for machine learning models include support-vector machines, clustering and probabilistic graphical models. Moreover, due to its close connection to Ising models, QUBO constitutes a central problem class for adiabatic quantum computation, where it is solved through a physical process called quantum annealing. == Definition == Let B = { 0 , 1 } {\displaystyle \mathbb {B} =\lbrace 0,1\rbrace } the set of binary digits (or bits), then B n {\displaystyle \mathbb {B} ^{n}} is the set of binary vectors of fixed length n ∈ N {\displaystyle n\in \mathbb {N} } . Given a symmetric or upper triangular matrix Q ∈ R n × n {\displaystyle {\boldsymbol {Q}}\in \mathbb {R} ^{n\times n}} , whose entries Q i j {\displaystyle Q_{ij}} define a weight for each pair of indices i , j ∈ { 1 , … , n } {\displaystyle i,j\in \lbrace 1,\dots ,n\rbrace } , we can define the function f Q : B n → R {\displaystyle f_{\boldsymbol {Q}}:\mathbb {B} ^{n}\rightarrow \mathbb {R} } that assigns a value to each binary vector x {\displaystyle {\boldsymbol {x}}} through f Q ( x ) = x ⊺ Q x = ∑ i = 1 n ∑ j = 1 n Q i j x i x j . {\displaystyle f_{\boldsymbol {Q}}({\boldsymbol {x}})={\boldsymbol {x}}^{\intercal }{\boldsymbol {Qx}}=\sum _{i=1}^{n}\sum _{j=1}^{n}Q_{ij}x_{i}x_{j}.} Alternatively, the linear and quadratic parts can be separated as f Q ′ , q ( x ) = x ⊺ Q ′ x + q ⊺ x , {\displaystyle f_{{\boldsymbol {Q}}',{\boldsymbol {q}}}({\boldsymbol {x}})={\boldsymbol {x}}^{\intercal }{\boldsymbol {Q}}'{\boldsymbol {x}}+{\boldsymbol {q}}^{\intercal }{\boldsymbol {x}},} where Q ′ ∈ R n × n {\displaystyle {\boldsymbol {Q}}'\in \mathbb {R} ^{n\times n}} and q ∈ R n {\displaystyle {\boldsymbol {q}}\in \mathbb {R} ^{n}} . This is equivalent to the previous definition through Q = Q ′ + diag ⁡ [ q ] {\displaystyle {\boldsymbol {Q}}={\boldsymbol {Q}}'+\operatorname {diag} [{\boldsymbol {q}}]} using the diag operator, exploiting that x = x ⋅ x {\displaystyle x=x\cdot x} for all binary values x {\displaystyle x} . Intuitively, the weight Q i j {\displaystyle Q_{ij}} is added if both x i = 1 {\displaystyle x_{i}=1} and x j = 1 {\displaystyle x_{j}=1} . The QUBO problem consists of finding a binary vector x ∗ {\displaystyle {\boldsymbol {x}}^{}} that minimizes f Q {\displaystyle f_{\boldsymbol {Q}}} , i.e., ∀ x ∈ B n : f Q ( x ∗ ) ≤ f Q ( x ) {\displaystyle \forall {\boldsymbol {x}}\in \mathbb {B} ^{n}:~f_{\boldsymbol {Q}}({\boldsymbol {x}}^{})\leq f_{\boldsymbol {Q}}({\boldsymbol {x}})} . In general, x ∗ {\displaystyle {\boldsymbol {x}}^{}} is not unique, meaning there may be a set of minimizing vectors with equal value w.r.t. f Q {\displaystyle f_{\boldsymbol {Q}}} . The complexity of QUBO arises from the number of candidate binary vectors to be evaluated, as | B n | = 2 n {\displaystyle \left|\mathbb {B} ^{n}\right|=2^{n}} grows exponentially in n {\displaystyle n} . Sometimes, QUBO is defined as the problem of maximizing f Q {\displaystyle f_{\boldsymbol {Q}}} , which is equivalent to minimizing f − Q = − f Q {\displaystyle f_{-{\boldsymbol {Q}}}=-f_{\boldsymbol {Q}}} . == Properties == QUBO is scale invariant for positive factors α > 0 {\displaystyle \alpha >0} , which leave the optimum x ∗ {\displaystyle {\boldsymbol {x}}^{}} unchanged: f α Q ( x ) = x ⊺ ( α Q ) x = α ( x ⊺ Q x ) = α f Q ( x ) {\displaystyle f_{\alpha {\boldsymbol {Q}}}({\boldsymbol {x}})={\boldsymbol {x}}^{\intercal }(\alpha {\boldsymbol {Q}}){\boldsymbol {x}}=\alpha ({\boldsymbol {x}}^{\intercal }{\boldsymbol {Qx}})=\alpha f_{\boldsymbol {Q}}({\boldsymbol {x}})} . In its general form, QUBO is NP-hard and cannot be solved efficiently by any known polynomial-time algorithm. However, there are polynomially-solvable special cases, where Q {\displaystyle {\boldsymbol {Q}}} has certain properties, for example: If all coefficients are positive, the optimum is trivially x ∗ = ( 0 , … , 0 ) ⊺ {\displaystyle {\boldsymbol {x}}^{}=(0,\dots ,0)^{\intercal }} . Similarly, if all coefficients are negative, the optimum is x ∗ = ( 1 , … , 1 ) ⊺ {\displaystyle {\boldsymbol {x}}^{}=(1,\dots ,1)^{\intercal }} . If Q {\displaystyle {\boldsymbol {Q}}} is diagonal, the bits can be optimized independently, and the problem is solvable in O ( n ) {\displaystyle {\mathcal {O}}(n)} . The optimal variable assignments are simply x i ∗ = 1 {\displaystyle x_{i}^{}=1} if Q i i < 0 {\displaystyle Q_{ii}<0} , and x i ∗ = 0 {\displaystyle x_{i}^{}=0} otherwise. If all off-diagonal elements of Q {\displaystyle {\boldsymbol {Q}}} are non-positive, the corresponding QUBO problem is solvable in polynomial time. QUBO can be solved using integer linear programming solvers like CPLEX or Gurobi Optimizer. This is possible since QUBO can be reformulated as a linear constrained binary optimization problem. To achieve this, substitute the product x i x j {\displaystyle x_{i}x_{j}} by an additional binary variable z i j ∈ B {\displaystyle z_{ij}\in \mathbb {B} } and add the constraints x i ≥ z i j {\displaystyle x_{i}\geq z_{ij}} , x j ≥ z i j {\displaystyle x_{j}\geq z_{ij}} and x i + x j − 1 ≤ z i j {\displaystyle x_{i}+x_{j}-1\leq z_{ij}} . Note that z i j {\displaystyle z_{ij}} can also be relaxed to continuous variables within the bounds zero and one. == Applications == QUBO is a structurally simple, yet computationally hard optimization problem. It can be used to encode a wide range of optimization problems from various scientific areas. === Maximum Cut === Given a graph G = ( V , E ) {\displaystyle G=(V,E)} with vertex set V = { 1 , … , n } {\displaystyle V=\lbrace 1,\dots ,n\rbrace } and edges E ⊆ V × V {\displaystyle E\subseteq V\times V} , the maximum cut (max-cut) problem consists of finding two subsets S , T ⊆ V {\displaystyle S,T\subseteq V} with T = V ∖ S {\displaystyle T=V\setminus S} , such that the number of edges between S {\displaystyle S} and T {\displaystyle T} is maximized. The more general weighted max-cut problem assumes edge weights w i j ≥ 0 ∀ i , j ∈ V {\displaystyle w_{ij}\geq 0~\forall i,j\in V} , with ( i , j ) ∉ E ⇒ w i j = 0 {\displaystyle (i,j)\notin E\Rightarrow w_{ij}=0} , and asks for a partition S , T ⊆ V {\displaystyle S,T\subseteq V} that maximizes the sum of edge weights between S {\displaystyle S} and T {\displaystyle T} , i.e., max S ⊆ V ∑ i ∈ S , j ∉ S w i j . {\displaystyle \max _{S\subseteq V}\sum _{i\in S,j\notin S}w_{ij}.} By setting w i j = 1 {\displaystyle w_{ij}=1} for all ( i , j ) ∈ E {\displaystyle (i,j)\in E} this becomes equivalent to the original max-cut problem above, which is why we focus on this more general form in the following. For every vertex in i ∈ V {\displaystyle i\in V} we introduce a binary variable x i {\displaystyle x_{i}} with the interpretation x i = 0 {\displaystyle x_{i}=0} if i ∈ S {\displaystyle i\in S} and x i = 1 {\displaystyle x_{i}=1} if i ∈ T {\displaystyle i\in T} . As T = V ∖ S {\displaystyle T=V\setminus S} , every i {\displaystyle i} is in exactly one set, meaning there is a 1:1 correspondence between binary vectors x ∈ B n {\displaystyle {\boldsymbol {x}}\in \mathbb {B} ^{n}} and partitions of V {\displaystyle V} into two subsets. We observe that, for any i , j ∈ V {\displaystyle i,j\in V} , the expression x i ( 1 − x j ) + ( 1 − x i ) x j {\displaystyle x_{i}(1-x_{j})+(1-x_{i})x_{j}} evaluates to 1 if and only if i {\displaystyle i} and j {\displaystyle j} are in different subsets, equivalent to logical XOR. Let W ∈ R + n × n {\displaystyle {\boldsymbol {W}}\in \mathbb {R} _{+}^{n\times n}} with W i j = w i j ∀ i , j ∈ V {\displaystyle W_{ij}=w_{ij}~\forall i,j\in V} . By extending above expression to matrix-vector form we find that x ⊺ W ( 1 − x ) + ( 1 − x ) ⊺ W x = − 2 x ⊺ W x + ( W 1 + W ⊺ 1 ) ⊺ x {\displaystyle {\boldsymbol {x}}^{\intercal }{\boldsymbol {W}}({\boldsymbol {1}}-{\boldsymbol {x}})+({\boldsymbol {1}}-{\boldsymbol {x}})^{\intercal }{\boldsymbol {Wx}}=-2{\boldsymbol {x}}^{\intercal }{\boldsymbol {Wx}}+({\boldsymbol {W1}}+{\boldsymbol {W}}^{\intercal }{\boldsymbol {1}})^{\intercal }{\boldsymbol {x}}} is the sum of weights of all edges between S {\displaystyle S} and T {\displaystyle T} , where 1 = ( 1 , 1 , … , 1 ) ⊺ ∈ R n {\displaystyle {\boldsymbol {1}}=(1,1,\dots ,1)^{\intercal }\in \mathbb {R} ^{n}} . As this is a quadratic function over x {\displaystyle {\boldsymbol {x}}} , it is a QUBO problem whose parameter matrix we can read from above expression as Q = 2 W − diag ⁡ [ W 1 + W ⊺ 1 ] , {\displaystyle {\boldsymbol {Q}}=2{\boldsymbol {W}}-\operatorname {diag} [{\boldsymbol {W1}}+{\boldsymbol {W}}^{\intercal }{\bol

    Read more →
  • Plate notation

    Plate notation

    In Bayesian inference, plate notation is a method of representing variables that repeat in a graphical model. Instead of drawing each repeated variable individually, a plate or rectangle is used to group variables into a subgraph that repeat together, and a number is drawn on the plate to represent the number of repetitions of the subgraph in the plate. The assumptions are that the subgraph is duplicated that many times, the variables in the subgraph are indexed by the repetition number, and any links that cross a plate boundary are replicated once for each subgraph repetition. == Example == In this example, we consider Latent Dirichlet allocation, a Bayesian network that models how documents in a corpus are topically related. There are two variables not in any plate; α is the parameter of the uniform Dirichlet prior on the per-document topic distributions, and β is the parameter of the uniform Dirichlet prior on the per-topic word distribution. The outermost plate represents all the variables related to a specific document, including θ i {\displaystyle \theta _{i}} , the topic distribution for document i. The M in the corner of the plate indicates that the variables inside are repeated M times, once for each document. The inner plate represents the variables associated with each of the N i {\displaystyle N_{i}} words in document i: z i j {\displaystyle z_{ij}} is the topic distribution for the jth word in document i, and w i j {\displaystyle w_{ij}} is the actual word used. The N in the corner represents the repetition of the variables in the inner plate N j {\displaystyle N_{j}} times, once for each word in document i. The circle representing the individual words is shaded, indicating that each w i j {\displaystyle w_{ij}} is observable, and the other circles are empty, indicating that the other variables are latent variables. The directed edges between variables indicate dependencies between the variables: for example, each w i j {\displaystyle w_{ij}} depends on z i j {\displaystyle z_{ij}} and β. == Extensions == A number of extensions have been created by various authors to express more information than simply the conditional relationships. However, few of these have become standard. Perhaps the most commonly used extension is to use rectangles in place of circles to indicate non-random variables—either parameters to be computed, hyperparameters given a fixed value (or computed through empirical Bayes), or variables whose values are computed deterministically from a random variable. The diagram on the right shows a few more non-standard conventions used in some articles in Wikipedia (e.g. variational Bayes): Variables that are actually random vectors are indicated by putting the vector size in brackets in the middle of the node. Variables that are actually random matrices are similarly indicated by putting the matrix size in brackets in the middle of the node, with commas separating row size from column size. Categorical variables are indicated by placing their size (without a bracket) in the middle of the node. Categorical variables that act as "switches", and which pick one or more other random variables to condition on from a large set of such variables (e.g. mixture components), are indicated with a special type of arrow containing a squiggly line and ending in a T junction. Boldface is consistently used for vector or matrix nodes (but not categorical nodes). == Software implementation == Plate notation has been implemented in various TeX/LaTeX drawing packages, but also as part of graphical user interfaces to Bayesian statistics programs such as BUGS and BayesiaLab and PyMC.

    Read more →
  • Exploratory blockmodeling

    Exploratory blockmodeling

    Exploratory blockmodeling is an (inductive) approach (or a group of approaches) in blockmodeling regarding the specification of an ideal blockmodel. This approach, also known as hypotheses-generating, is the simplest approach, as it "merely involves the definition of the block types permitted as well as of the number of clusters." With this approach, researcher usually defines the best possible blockmodel, which then represent the base for the analysis of the whole network. This approach is usually based on: previous analyses and theoretical considerations, using stricker blockmodel and block types, where the structural equivalence is stricker than the regular equivalence and using smaller number of classes. The opposite approach is called a confirmatory blockmodeling.

    Read more →
  • Server.com

    Server.com

    Server.com is a domain name that was owned by software as a service (SaaS) company Server Corporation. They offered a suite of services from 1996 until 2007. It was the first SaaS site to offer a variety of services and the first to use the term WebApp to describe its services. It was selected as an Incredibly Useful Site by Yahoo! Internet Life magazine. net magazine listed Server.com among the 100 most influential websites of all time. Server.com launched in 1996 offering the first online personal information manager. In 1997, they rolled out the first threaded message board service; the first web based mailing list manager; one of the first online calendar services; and one of the first online form builders. In 2000, Server.com partnered with NBCi and became server.snap.com until 2001. In 2001, Server.com was serving 100 million monthly pageviews. Media Life declared it one of the 20 biggest ad domains on the Web. In 2002, Server.com developed one of the first web-based RSS aggregators. In 2007, all services were moved to YourWebApps.com. The domain name Server.com was sold in 2009 for $770,000.

    Read more →
  • NOMINATE (scaling method)

    NOMINATE (scaling method)

    NOMINATE (an acronym for nominal three-step estimation) is a multidimensional scaling application developed by US political scientists Keith T. Poole and Howard Rosenthal in the early 1980s to analyze preferential and choice data, such as legislative roll-call voting behavior. In its most well-known application, members of the US Congress are placed on a two-dimensional map, with politicians who are ideologically similar (i.e. who often vote the same) being close together. One of these two dimensions corresponds to the familiar left–right political spectrum (liberal–conservative in the United States). As computing capabilities grew, Poole and Rosenthal developed multiple iterations of their NOMINATE procedure: the original D-NOMINATE method, W-NOMINATE, and most recently DW-NOMINATE (for dynamic, weighted NOMINATE). In 2009, Poole and Rosenthal were the first recipients of the Society for Political Methodology's Best Statistical Software Award for their development of NOMINATE. In 2016, the society awarded Poole its Career Achievement Award, stating that "the modern study of the U.S. Congress would be simply unthinkable without NOMINATE legislative roll call voting scores." == Procedure == The main procedure is an application of multidimensional scaling techniques to political choice data. Though there are important technical differences between these types of NOMINATE scaling procedures, all operate under the same fundamental assumptions. First, that alternative choices can be projected on a basic, low-dimensional (often two-dimensional) Euclidean space. Second, within that space, individuals have utility functions which are bell-shaped (normally distributed), and maximized at their ideal point. Because individuals also have symmetric, single-peaked utility functions which center on their ideal point, ideal points represent individuals' most preferred outcomes. That is, individuals most desire outcomes closest their ideal point, and will choose/vote probabilistically for the closest outcome. Ideal points can be recovered from observing choices, with individuals exhibiting similar preferences placed more closely than those behaving dissimilarly. It is helpful to compare this procedure to producing maps based on driving distances between cities. For example, Los Angeles is about 1,800 miles from St. Louis; St. Louis is about 1,200 miles from Miami; and Miami is about 2,700 miles from Los Angeles. From this (dis)similarities data, any map of these three cities should place Miami far from Los Angeles, with St. Louis somewhere in between (though a bit closer to Miami than Los Angeles). Just as cities like Los Angeles and San Francisco would be clustered on a map, NOMINATE places ideologically similar legislators (e.g., liberal Senators Barbara Boxer (D-Calif.) and Al Franken (D-Minn.)) closer to each other, and farther from dissimilar legislators (e.g., conservative Senator Tom Coburn (R-Okla.)) based on the degree of agreement between their roll call voting records. At the heart of the NOMINATE procedures (and other multidimensional scaling methods, such as Poole's Optimal Classification method) are algorithms they utilize to arrange individuals and choices in low dimensional (usually two-dimensional) space. Thus, NOMINATE scores provide "maps" of legislatures. Using NOMINATE procedures to study congressional roll call voting behavior from the First Congress to the present-day, Poole and Rosenthal published Congress: A Political-Economic History of Roll Call Voting in 1997 and the revised edition Ideology and Congress in 2007. In 2009, Poole and Rosenthal were named the first recipients of the Society for Political Methodology's Best Statistical Software Award for their development of NOMINATE, a recognition conferred to "individual(s) for developing statistical software that makes a significant research contribution". In 2016, Keith T. Poole was awarded the Society for Political Methodology's Career Achievement Award. The citation for this award reads, in part, "One can say perfectly correctly, and without any hyperbole: the modern study of the U.S. Congress would be simply unthinkable without NOMINATE legislative roll call voting scores. NOMINATE has produced data that entire bodies of our discipline—and many in the press—have relied on to understand the U.S. Congress." == Dimensions == Poole and Rosenthal demonstrate that—despite the many complexities of congressional representation and politics—roll call voting in both the House and the Senate can be organized and explained by no more than two dimensions throughout the sweep of American history. The first dimension (horizontal or x-axis) is the familiar left-right (or liberal-conservative) spectrum on economic matters. The second dimension (vertical or y-axis) picks up attitudes on cross-cutting, salient issues of the day (which include or have included slavery, bimetallism, civil rights, regional, and social/lifestyle issues). Rosenthal and Poole have initially argued that the first dimension refers to socio-economic matters and the second dimension to race-relations. However, the often confusing and residual nature of the second dimension has led to the second dimension being largely ignored by other researchers. For the most part, congressional voting is uni-dimensional, with most of the variation in voting patterns explained by placement along the liberal-conservative first dimension. While the first dimension of the DW-NOMINATE score is able to predict results at 83% accuracy, the addition of the second dimension only increases accuracy to 85%. Furthermore, the second dimension only provided a significant increase in accuracy for Congresses 1-99. As late as the 1990s, the second dimension was able to measure partisan splits in abortion and gun rights issues. However, a 2017 analysis found that since 1987, the votes of the US Congress had best fit a one-dimensional model, suggesting increasing party polarization after 1987. == Interpretation of nominate scores == For illustrative purposes, consider the following plots which use W-NOMINATE scores to scale members of Congress and uses the probabilistic voting model (in which legislators farther from the "cutting line" between "yea" and "nay" outcomes become more likely to vote in the predicted manner) to illustrate some major Congressional votes in the 1990s. Some of these votes, like the House's vote on President Clinton's welfare reform package (the Personal Responsibility and Work Opportunity Act of 1996) are best modeled through the use of the first (economic liberal-conservative) dimension. On the welfare reform vote, nearly all Republicans joined the moderate-conservative bloc of House Democrats in voting for the bill, while opposition was virtually confined to the most liberal Democrats in the House. The errors (those representatives on the "wrong" side of the cutting line which separates predicted "yeas" and predicted "nays") are generally close to the cutting line, which is what we would expect. A legislator directly on the cutting line is indifferent between voting "yea" and "nay" on the measure. All members are shown on the left panel of the plot, while only errors are shown on the right panel: Economic ideology also dominates the Senate vote on the Balanced Budget Amendment of 1995: On other votes, however, a second dimension (which has recently come to represent attitudes on cultural and lifestyle issues) is important. For example, roll call votes on gun control routinely split party coalitions, with socially conservative "blue dog" Democrats joining most Republicans in opposing additional regulation and socially liberal Republicans joining most Democrats in supporting gun control. The addition of the second dimension accounts for these inter-party differences, and the cutting line is more horizontal than vertical (meaning the cleavage is found on the second dimension rather than the first dimension on these votes) This pattern was evident in the 1991 House vote to require waiting periods on handguns: == Political ideology == DW-NOMINATE scores have been used widely to describe the political ideology of political actors, political parties and political institutions. For instance, a score in the first dimension that is close to either pole means that such score is located at one of the extremes in the liberal-conservative scale. So, a score closer to 1 is described as conservative whereas a score closer to −1 can be described as liberal. Finally, a score at zero or close to zero is described as moderate. == Political polarization == Poole and Rosenthal (beginning with their 1984 article "The Polarization of American Politics") have also used NOMINATE data to show that, since the 1970s, party delegations in Congress have become ideologically homogeneous and distant from one another (a phenomenon known as "polarization"). Using DW-NOMINATE scores (which permit direct comparisons between members of different Congress

    Read more →
  • Accumulated local effects

    Accumulated local effects

    Accumulated local effects (ALE) is a machine learning interpretability method. == Concepts == ALE uses a conditional feature distribution as an input and generates augmented data, creating more realistic data than a marginal distribution. It ignores far out-of-distribution (outlier) values. Unlike partial dependence plots and marginal plots, ALE is not defeated in the presence of correlated predictors. It analyzes differences in predictions instead of averaging them by calculating the average of the differences in model predictions over the augmented data, instead of the average of the predictions themselves. == Example == Given a model that predicts house prices based on its distance from city center and size of the building area, ALE compares the differences of predictions of houses of different sizes. The result separates the impact of the size from otherwise correlated features. == Limitations == Defining evaluation windows is subjective. High correlations between features can defeat the technique. ALE requires more and more uniformly distributed observations than PDP so that the conditional distribution can be reliably determined. The technique may produce inadequate results if the data is highly sparse, which is more common with high-dimensional data (curse of dimensionality).

    Read more →
  • Dynamic Bayesian network

    Dynamic Bayesian network

    A dynamic Bayesian network (DBN) is a Bayesian network (BN) which relates variables to each other over adjacent time steps. == History == A dynamic Bayesian network (DBN) is often called a "two-timeslice" BN (2TBN) because it says that at any point in time T, the value of a variable can be calculated from the internal regressors and the immediate prior value (time T-1). DBNs were developed by Paul Dagum in the early 1990s at Stanford University's Section on Medical Informatics. Dagum developed DBNs to unify and extend traditional linear state-space models such as Kalman filters, linear and normal forecasting models such as ARMA and simple dependency models such as hidden Markov models into a general probabilistic representation and inference mechanism for arbitrary nonlinear and non-normal time-dependent domains. Today, DBNs are common in robotics, and have shown potential for a wide range of data mining applications. For example, they have been used in speech recognition, digital forensics, protein sequencing, and bioinformatics. DBN is a generalization of hidden Markov models and Kalman filters. DBNs are conceptually related to probabilistic Boolean networks and can, similarly, be used to model dynamical systems at steady-state.

    Read more →
  • Apache Hama

    Apache Hama

    Apache Hama is a distributed computing framework based on bulk synchronous parallel computing techniques for massive scientific computations e.g., matrix, graph and network algorithms. Originally a sub-project of Hadoop, it became an Apache Software Foundation top level project in 2012. It was created by Edward J. Yoon, who named it (short for "Hadoop Matrix Algebra"), and Hama also means hippopotamus in Yoon's native Korean language (하마), following the trend of naming Apache projects after animals and zoology (such as Apache Pig). Hama was inspired by Google's Pregel large-scale graph computing framework described in 2010. When executing graph algorithms, Hama showed a fifty-fold performance increase relative to Hadoop. Retired in April 2020, project resources are made available as part of the Apache Attic. Yoon cited issues of installation, scalability, and a difficult programming model for its lack of adoption. == Architecture == Hama consists of three major components: BSPMaster, GroomServers and Zookeeper. === BSPMaster === BSPMaster is responsible for: Maintaining groom server status Controlling super steps in a cluster Maintaining job progress information Scheduling jobs and assigning tasks to groom servers Disseminating execution class across groom servers Controlling fault Providing users with the cluster control interface. A BSP Master and multiple grooms are started by the script. Then, the bsp master starts up with a RPC server for groom servers. Groom servers starts up with a BSPPeer instance and a RPC proxy to contact the bsp master. After started, each groom periodically sends a heartbeat message that encloses its groom server status, including maximum task capacity, unused memory, and so on. Each time the BSP master receives a heartbeat message, it brings the groom server status up-to-date. The bsp master makes use of groom servers' status in order to assign tasks to idle groom servers - and returns a heartbeat response containing assigned tasks and others actions for a groom server to do. Currently BSP master has a FIFO job scheduler and simple task assignment algorithms. === GroomServer === A groom server (shortly referred to as groom) is a process that performs BSP tasks assigned by BSPMaster. Each groom contacts the BSPMaster, and it takes assigned tasks and reports its status by means of periodical piggybacks with BSPMaster. Each groom is designed to run with HDFS or other distributed storages. Basically, a groom server and a data node should be run on one physical node. === Zookeeper === A Zookeeper is used to manage the efficient barrier synchronisation of the BSPPeers.

    Read more →
  • Policy gradient method

    Policy gradient method

    Policy gradient methods are a class of reinforcement learning algorithms and a sub-class of policy optimization methods. Unlike value-based methods which learn a value function to derive a policy, policy optimization methods directly learn a policy function π {\displaystyle \pi } that selects actions without consulting a value function. For policy gradient to apply, the policy function π θ {\displaystyle \pi _{\theta }} is parameterized by a differentiable parameter θ {\displaystyle \theta } . == Overview == In policy-based RL, the actor is a parameterized policy function π θ {\displaystyle \pi _{\theta }} , where θ {\displaystyle \theta } are the parameters of the actor. The actor takes as argument the state of the environment s {\displaystyle s} and produces a probability distribution π θ ( ⋅ ∣ s ) {\displaystyle \pi _{\theta }(\cdot \mid s)} . If the action space is discrete, then ∑ a π θ ( a ∣ s ) = 1 {\displaystyle \sum _{a}\pi _{\theta }(a\mid s)=1} . If the action space is continuous, then ∫ a π θ ( a ∣ s ) d a = 1 {\displaystyle \int _{a}\pi _{\theta }(a\mid s)\mathrm {d} a=1} . The goal of policy optimization is to find some θ {\displaystyle \theta } that maximizes the expected episodic reward J ( θ ) {\displaystyle J(\theta )} : J ( θ ) = E π θ [ ∑ t = 0 T γ t R t | S 0 = s 0 ] {\displaystyle J(\theta )=\mathbb {E} _{\pi _{\theta }}\left[\sum _{t=0}^{T}\gamma ^{t}R_{t}{\Big |}S_{0}=s_{0}\right]} where γ {\displaystyle \gamma } is the discount factor, R t {\displaystyle R_{t}} is the reward at step t {\displaystyle t} , s 0 {\displaystyle s_{0}} is the starting state, and T {\displaystyle T} is the time-horizon (which can be infinite). The policy gradient is defined as ∇ θ J ( θ ) {\displaystyle \nabla _{\theta }J(\theta )} . Different policy gradient methods stochastically estimate the policy gradient in different ways. The goal of any policy gradient method is to iteratively maximize J ( θ ) {\displaystyle J(\theta )} by gradient ascent. Since the key part of any policy gradient method is the stochastic estimation of the policy gradient, they are also studied under the title of "Monte Carlo gradient estimation". == REINFORCE == === Policy gradient === The REINFORCE algorithm, introduced by Ronald J. Williams in 1992, was the first policy gradient method. It is based on the identity for the policy gradient ∇ θ J ( θ ) = E π θ [ ∑ t = 0 T ∇ θ ln ⁡ π θ ( A t ∣ S t ) ∑ t = 0 T ( γ t R t ) | S 0 = s 0 ] {\displaystyle \nabla _{\theta }J(\theta )=\mathbb {E} _{\pi _{\theta }}\left[\sum _{t=0}^{T}\nabla _{\theta }\ln \pi _{\theta }(A_{t}\mid S_{t})\;\sum _{t=0}^{T}(\gamma ^{t}R_{t}){\Big |}S_{0}=s_{0}\right]} which can be improved via the "causality trick" ∇ θ J ( θ ) = E π θ [ ∑ t = 0 T ∇ θ ln ⁡ π θ ( A t ∣ S t ) ∑ τ = t T ( γ τ R τ ) | S 0 = s 0 ] {\displaystyle \nabla _{\theta }J(\theta )=\mathbb {E} _{\pi _{\theta }}\left[\sum _{t=0}^{T}\nabla _{\theta }\ln \pi _{\theta }(A_{t}\mid S_{t})\sum _{\tau =t}^{T}(\gamma ^{\tau }R_{\tau }){\Big |}S_{0}=s_{0}\right]} Thus, we have an unbiased estimator of the policy gradient: ∇ θ J ( θ ) ≈ 1 N ∑ n = 1 N [ ∑ t = 0 T ∇ θ ln ⁡ π θ ( A t , n ∣ S t , n ) ∑ τ = t T ( γ τ − t R τ , n ) ] {\displaystyle \nabla _{\theta }J(\theta )\approx {\frac {1}{N}}\sum _{n=1}^{N}\left[\sum _{t=0}^{T}\nabla _{\theta }\ln \pi _{\theta }(A_{t,n}\mid S_{t,n})\sum _{\tau =t}^{T}(\gamma ^{\tau -t}R_{\tau ,n})\right]} where the index n {\displaystyle n} ranges over N {\displaystyle N} rollout trajectories using the policy π θ {\displaystyle \pi _{\theta }} . The score function ∇ θ ln ⁡ π θ ( A t ∣ S t ) {\displaystyle \nabla _{\theta }\ln \pi _{\theta }(A_{t}\mid S_{t})} can be interpreted as the direction in the parameter space that increases the probability of taking action A t {\displaystyle A_{t}} in state S t {\displaystyle S_{t}} . The policy gradient, then, is a weighted average of all possible directions to increase the probability of taking any action in any state, but weighted by reward signals, so that if taking a certain action in a certain state is associated with high reward, then that direction would be highly reinforced, and vice versa. === Algorithm === The REINFORCE algorithm is a loop: Rollout N {\displaystyle N} trajectories in the environment, using π θ t {\displaystyle \pi _{\theta _{t}}} as the policy function. Compute the policy gradient estimation: g i ← 1 N ∑ n = 1 N [ ∑ t = 0 T ∇ θ t ln ⁡ π θ ( A t , n ∣ S t , n ) ∑ τ = t T ( γ τ R τ , n ) ] {\displaystyle g_{i}\leftarrow {\frac {1}{N}}\sum _{n=1}^{N}\left[\sum _{t=0}^{T}\nabla _{\theta _{t}}\ln \pi _{\theta }(A_{t,n}\mid S_{t,n})\sum _{\tau =t}^{T}(\gamma ^{\tau }R_{\tau ,n})\right]} Update the policy by gradient ascent: θ i + 1 ← θ i + α i g i {\displaystyle \theta _{i+1}\leftarrow \theta _{i}+\alpha _{i}g_{i}} Here, α i {\displaystyle \alpha _{i}} is the learning rate at update step i {\displaystyle i} . == Variance reduction == REINFORCE is an on-policy algorithm, meaning that the trajectories used for the update must be sampled from the current policy π θ {\displaystyle \pi _{\theta }} . This can lead to high variance in the updates, as the returns R ( τ ) {\displaystyle R(\tau )} can vary significantly between trajectories. Many variants of REINFORCE have been introduced, under the title of variance reduction. === REINFORCE with baseline === A common way for reducing variance is the REINFORCE with baseline algorithm, based on the following identity: ∇ θ J ( θ ) = E π θ [ ∑ t = 0 T ∇ θ ln ⁡ π θ ( A t | S t ) ( ∑ τ = t T ( γ τ R τ ) − b ( S t ) ) | S 0 = s 0 ] {\displaystyle \nabla _{\theta }J(\theta )=\mathbb {E} _{\pi _{\theta }}\left[\sum _{t=0}^{T}\nabla _{\theta }\ln \pi _{\theta }(A_{t}|S_{t})\left(\sum _{\tau =t}^{T}(\gamma ^{\tau }R_{\tau })-b(S_{t})\right){\Big |}S_{0}=s_{0}\right]} for any function b : States → R {\displaystyle b:{\text{States}}\to \mathbb {R} } . This can be proven by applying the previous lemma. The algorithm uses the modified gradient estimator g i ← 1 N ∑ n = 1 N [ ∑ t = 0 T ∇ θ t ln ⁡ π θ ( A t , n | S t , n ) ( ∑ τ = t T ( γ τ R τ , n ) − b i ( S t , n ) ) ] {\displaystyle g_{i}\leftarrow {\frac {1}{N}}\sum _{n=1}^{N}\left[\sum _{t=0}^{T}\nabla _{\theta _{t}}\ln \pi _{\theta }(A_{t,n}|S_{t,n})\left(\sum _{\tau =t}^{T}(\gamma ^{\tau }R_{\tau ,n})-b_{i}(S_{t,n})\right)\right]} and the original REINFORCE algorithm is the special case where b i ≡ 0 {\displaystyle b_{i}\equiv 0} . === Actor-critic methods === If b i {\textstyle b_{i}} is chosen well, such that b i ( S t ) ≈ ∑ τ = t T ( γ τ R τ ) = γ t V π θ i ( S t ) {\textstyle b_{i}(S_{t})\approx \sum _{\tau =t}^{T}(\gamma ^{\tau }R_{\tau })=\gamma ^{t}V^{\pi _{\theta _{i}}}(S_{t})} , this could significantly decrease variance in the gradient estimation. That is, the baseline should be as close to the value function V π θ i ( S t ) {\displaystyle V^{\pi _{\theta _{i}}}(S_{t})} as possible, approaching the ideal of: ∇ θ J ( θ ) = E π θ [ ∑ t = 0 T ∇ θ ln ⁡ π θ ( A t | S t ) ( ∑ τ = t T ( γ τ R τ ) − γ t V π θ ( S t ) ) | S 0 = s 0 ] {\displaystyle \nabla _{\theta }J(\theta )=\mathbb {E} _{\pi _{\theta }}\left[\sum _{t=0}^{T}\nabla _{\theta }\ln \pi _{\theta }(A_{t}|S_{t})\left(\sum _{\tau =t}^{T}(\gamma ^{\tau }R_{\tau })-\gamma ^{t}V^{\pi _{\theta }}(S_{t})\right){\Big |}S_{0}=s_{0}\right]} Note that, as the policy π θ t {\displaystyle \pi _{\theta _{t}}} updates, the value function V π θ i ( S t ) {\displaystyle V^{\pi _{\theta _{i}}}(S_{t})} updates as well, so the baseline should also be updated. One common approach is to train a separate function that estimates the value function, and use that as the baseline. This is one of the actor-critic methods, where the policy function is the actor and the value function is the critic. The Q-function Q π {\displaystyle Q^{\pi }} can also be used as the critic, since ∇ θ J ( θ ) = E π θ [ ∑ 0 ≤ t ≤ T γ t ∇ θ ln ⁡ π θ ( A t | S t ) ⋅ Q π θ ( S t , A t ) | S 0 = s 0 ] {\displaystyle \nabla _{\theta }J(\theta )=E_{\pi _{\theta }}\left[\sum _{0\leq t\leq T}\gamma ^{t}\nabla _{\theta }\ln \pi _{\theta }(A_{t}|S_{t})\cdot Q^{\pi _{\theta }}(S_{t},A_{t}){\Big |}S_{0}=s_{0}\right]} by a similar argument using the tower law. Subtracting the value function as a baseline, we find that the advantage function A π ( S , A ) = Q π ( S , A ) − V π ( S ) {\displaystyle A^{\pi }(S,A)=Q^{\pi }(S,A)-V^{\pi }(S)} can be used as the critic as well: ∇ θ J ( θ ) = E π θ [ ∑ 0 ≤ t ≤ T γ t ∇ θ ln ⁡ π θ ( A t | S t ) ⋅ A π θ ( S t , A t ) | S 0 = s 0 ] {\displaystyle \nabla _{\theta }J(\theta )=E_{\pi _{\theta }}\left[\sum _{0\leq t\leq T}\gamma ^{t}\nabla _{\theta }\ln \pi _{\theta }(A_{t}|S_{t})\cdot A^{\pi _{\theta }}(S_{t},A_{t}){\Big |}S_{0}=s_{0}\right]} In summary, there are many unbiased estimators for ∇ θ J θ {\textstyle \nabla _{\theta }J_{\theta }} , all in the form of: ∇ θ J ( θ ) = E π θ [ ∑ 0 ≤ t ≤ T ∇ θ ln ⁡ π θ ( A t | S t ) ⋅ Ψ t | S 0 = s 0 ] {\displaystyle \nabla _{\theta }J(\theta )=E_{\pi _{\theta }}\left[\su

    Read more →
  • Self-play

    Self-play

    Self-play is a technique for improving the performance of reinforcement learning agents. Intuitively, agents learn to improve their performance by playing "against themselves". == Definition and motivation == In multi-agent reinforcement learning experiments, researchers try to optimize the performance of a learning agent on a given task, in cooperation or competition with one or more agents. These agents learn by trial-and-error, and researchers may choose to have the learning algorithm play the role of two or more of the different agents. When successfully executed, this technique has a double advantage: It provides a straightforward way to determine the actions of the other agents, resulting in a meaningful challenge. It increases the amount of experience that can be used to improve the policy, by a factor of two or more, since the viewpoints of each of the different agents can be used for learning. Czarnecki et al argue that most of the games that people play for fun are "Games of Skill", meaning games whose space of all possible strategies looks like a spinning top. In more detail, we can partition the space of strategies into sets L 1 , L 2 , . . . , L n {\displaystyle L_{1},L_{2},...,L_{n}} , such that any i < j , π i ∈ L i , π j ∈ L j {\displaystyle i Read more →

  • Cartesian genetic programming

    Cartesian genetic programming

    Cartesian genetic programming is a form of genetic programming that uses a graph representation to encode computer programs. It grew from a method of evolving digital circuits developed by Julian F. Miller and Peter Thomson in 1997. The term ‘Cartesian genetic programming’ first appeared in 1999 and was proposed as a general form of genetic programming in 2000. It is called ‘Cartesian’ because it represents a program using a two-dimensional grid of nodes. Miller's keynote explains how CGP works. He edited a book entitled Cartesian Genetic Programming, published in 2011 by Springer. The open source project dCGP implements a differentiable version of CGP developed at the European Space Agency by Dario Izzo, Francesco Biscani and Alessio Mereta able to approach symbolic regression tasks, to find solution to differential equations, find prime integrals of dynamical systems, represent variable topology artificial neural networks and more.

    Read more →
  • I-MSCP

    I-MSCP

    i-MSCP (internet Multi Server Control Panel) was a free and open-source software for shared hosting environments management on Linux servers. It comes with a large choice of modules for various services such as Apache2, ProFTPd, Dovecot, Courier, Bind9, and can be easily extended through plugins, or listener files using its events-based API. Latest stable is the 1.5.3 version (build 2018120800) which has been released on 8 December 2018. The i-MSCP is no longer under development, although the developer has repeatedly claimed to be working on a new version, which has never has been published or even shown in any possible way. Whether development occurs or not, the current version of the software is not installable, as it only supports outdated versions of systems for which some of the necessary software to install i-MSCP cannot be installed. == Licensing == i-MSCP has a dual license. A part of the base code is licensed under the Mozilla Public License. All new code, and submissions to i-MSCP are licensed under the GNU Lesser General Public License Version 2.1 (LGPLv2). To solve this license conflict there is work on a complete rewrite for a completely LGPLv2 licensed i-MSCP. == Features == === Supported Linux Distributions === Debian Jessie (8.x), Stretch (9.x), Buster (10.x) Devuan Jessie (1.0), ASCII (2.x) Ubuntu Trusty Thar (14.04 LTS), Bionic Beaver (18.04 LTS) === Supported Daemons / Services === Web server: Apache (ITK, Fcgid and FastCGI/PHP-FPM), Nginx Name server: Bind9 MTA (Mail Transport Agent): Postfix MDA (Mail Delivery Agent): Courier, Dovecot Database: MySQL, MariaDB, Percona FTP-Server: ProFTPD, vsftpd Web statistics: AWStats === Addons === PhpMyAdmin Pydio, formerly AjaXplorer Net2ftp Roundcube Rainloop == Competing software == cPanel DTC Froxlor ISPConfig ispCP OpenPanel hestiacp Plesk SysCP Virtualmin

    Read more →
  • Implicit blockmodeling

    Implicit blockmodeling

    Implicit blockmodeling is an approach in blockmodeling, similar to a valued and homogeneity blockmodeling, where initially an additional normalization is used and then while specifying the parameter of the relevant link is replaced by the block maximum. This approach was first proposed by Batagelj and Ferligoj in 2000, and developed by Aleš Žiberna in 2007/08. Comparing with homogeneity, the implicit blockmodeling will perform similarly with max-regular equivalence, but slightly worse in other settings. It will perform worse than valued and homogeneity blockmodeling with a pre-specified blockmodel.

    Read more →
  • Generalized iterative scaling

    Generalized iterative scaling

    In statistics, generalized iterative scaling (GIS) and improved iterative scaling (IIS) are two early algorithms used to fit log-linear models, notably multinomial logistic regression (MaxEnt) classifiers and extensions of it such as MaxEnt Markov models and conditional random fields. These algorithms have been largely surpassed by gradient-based methods such as L-BFGS and coordinate descent algorithms.

    Read more →