Constrained conditional model

Constrained conditional model

A constrained conditional model (CCM) is a machine learning and inference framework that augments the learning of conditional (probabilistic or discriminative) models with declarative constraints. The constraint can be used as a way to incorporate expressive prior knowledge into the model and bias the assignments made by the learned model to satisfy these constraints. The framework can be used to support decisions in an expressive output space while maintaining modularity and tractability of training and inference. Models of this kind have recently attracted much attention within the natural language processing (NLP) community. Formulating problems as constrained optimization problems over the output of learned models has several advantages. It allows one to focus on the modeling of problems by providing the opportunity to incorporate domain-specific knowledge as global constraints using a first order language. Using this declarative framework frees the developer from low level feature engineering while capturing the problem's domain-specific properties and guarantying exact inference. From a machine learning perspective it allows decoupling the stage of model generation (learning) from that of the constrained inference stage, thus helping to simplify the learning stage while improving the quality of the solutions. For example, in the case of generating compressed sentences, rather than simply relying on a language model to retain the most commonly used n-grams in the sentence, constraints can be used to ensure that if a modifier is kept in the compressed sentence, its subject will also be kept. == Motivation == Making decisions in many domains (such as natural language processing and computer vision problems) often involves assigning values to sets of interdependent variables where the expressive dependency structure can influence, or even dictate, what assignments are possible. These settings are applicable not only to Structured Learning problems such as semantic role labeling, but also for cases that require making use of multiple pre-learned components, such as summarization, textual entailment and question answering. In all these cases, it is natural to formulate the decision problem as a constrained optimization problem, with an objective function that is composed of learned models, subject to domain- or problem-specific constraints. Constrained conditional models form a learning and inference framework that augments the learning of conditional (probabilistic or discriminative) models with declarative constraints (written, for example, using a first-order representation) as a way to support decisions in an expressive output space while maintaining modularity and tractability of training and inference. These constraints can express either hard restrictions, completely prohibiting some assignments, or soft restrictions, penalizing unlikely assignments. In most applications of this framework in NLP, following, Integer Linear Programming (ILP) was used as the inference framework, although other algorithms can be used for that purpose. == Formal Definition == Given a set of feature functions { ϕ i ( x , y ) } {\displaystyle \{\phi _{i}(x,y)\}} and a set of constraints { C i ( x , y ) } {\displaystyle \{C_{i}(x,y)\}} , defined over an input structure x ∈ X {\displaystyle x\in X} and an output structure y ∈ Y {\displaystyle y\in Y} , a constraint conditional model is characterized by two weight vectors, w and ρ {\displaystyle \rho } , and is defined as the solution to the following optimization problem: a r g m a x y ∑ i w i ϕ i ( x , y ) − ∑ ρ i C i ( x , y ) {\displaystyle argmax_{y}\sum _{i}w_{i}\phi _{i}(x,y)-\sum \rho _{i}C_{i}(x,y)} . Each constraint C i ∈ C {\displaystyle C_{i}\in C} is a boolean mapping indicating if the joint assignment ( x , y ) {\displaystyle (x,y)} violates a constraint, and ρ {\displaystyle \rho } is the penalty incurred for violating the constraints. Constraints assigned an infinite penalty are known as hard constraints, and represent unfeasible assignments to the optimization problem. == Training paradigms == === Learning local vs. global models === The objective function used by CCMs can be decomposed and learned in several ways, ranging from a complete joint training of the model along with the constraints to completely decoupling the learning and the inference stage. In the latter case, several local models are learned independently and the dependency between these models is considered only at decision time via a global decision process. The advantages of each approach are discussed in which studies the two training paradigms: (1) local models: L+I (learning + inference) and (2) global model: IBT (Inference based training), and shows both theoretically and experimentally that while IBT (joint training) is best in the limit, under some conditions (basically, ”good” components) L+I can generalize better. The ability of CCM to combine local models is especially beneficial in cases where joint learning is computationally intractable or when training data are not available for joint learning. This flexibility distinguishes CCM from the other learning frameworks that also combine statistical information with declarative constraints, such as Markov logic network, that emphasize joint training. === Minimally supervised CCM === CCM can help reduce supervision by using domain knowledge (expressed as constraints) to drive learning. These settings were studied in and. These works introduce semi-supervised Constraints Driven Learning (CODL) and show that by incorporating domain knowledge the performance of the learned model improves significantly. === Learning over latent representations === CCMs have also been applied to latent learning frameworks, where the learning problem is defined over a latent representation layer. Since the notion of a correct representation is inherently ill-defined, no gold-standard labeled data regarding the representation decision is available to the learner. Identifying the correct (or optimal) learning representation is viewed as a structured prediction process and therefore modeled as a CCM. This problem was covered in several papers, in both supervised and unsupervised settings. In all cases research showed that explicitly modeling the interdependencies between representation decisions via constraints results in an improved performance. == Integer linear programming for natural language processing applications == The advantages of the CCM declarative formulation and the availability of off-the-shelf solvers have led to a large variety of natural language processing tasks being formulated within the framework, including semantic role labeling, syntactic parsing, coreference resolution, summarization, transliteration, natural language generation and joint information extraction. Most of these works use an integer linear programming (ILP) solver to solve the decision problem. Although theoretically solving an Integer Linear Program is exponential in the size of the decision problem, in practice using state-of-the-art solvers and approximate inference techniques large scale problems can be solved efficiently. The key advantage of using an ILP solver for solving the optimization problem defined by a constrained conditional model is the declarative formulation used as input for the ILP solver, consisting of a linear objective function and a set of linear constraints. == Resources == CCM Tutorial Predicting Structures in NLP: Constrained Conditional Models and Integer Linear Programming in NLP

Resilience week

Resilience week is an annual symposium established to enable cross-disciplinary and role based discussions to advance strategies and research that engenders resilience in critical infrastructure systems and communities. Damaging storms, cyber attack and the interconnection of critical infrastructure systems can lead to cascading events that not only affect local but also across regions. However, many of these interdependencies are not easily recognized and obscure and complicate the mitigation of risk. The purpose of the symposia series is hence to facilitate best practice in managing critical infrastructure risks, by bringing together businesses, government and researchers. == Background == Originally organized in 2008 as a focus on the new research area of resilient control systems, including the disciplinary areas of control system, cyber-security, cognitive psychology and any number of critical infrastructure domains. Resilience has long been recognized as an area that requires not only the contributions of multiple disciplines or multidisciplinary participation, but interdisciplinary interaction where there is a common language and familiarity of the contributors to what other disciplines (and roles) contribute. The resulting interactions developed by Resilience Week and associated activities are intended to culture this sharing environment as a safe zone for inclusion; more importantly, an environment that lends to developing the new science and practice. As the attributes of resilience are complex, the contributions and topics for the event have included both the disciplinary and the project considerations, in keynotes, panels and research presentations. Keynotes have included senior leadership in the Department of Energy, Department of Defense, Department of Homeland Security, the National Science Foundation, and other agencies in addition to National Academy and professional organization fellows and senior industry leaders. Project panels and research presentations include emergent topics in resilience to climate change, cyber attack, damaging storms and the energy assurance. Topics Areas of focus have included: Control Systems Cyber Systems Cognitive Systems Communications Systems Communities and Infrastructure Project Focus Areas have included: Dependencies and Interdependencies Cyber Resilience for Operating Technology Commercializing Research and Development Building Critical Infrastructure Resilience through Distributed Energy Resources Energy Equity and Community Resilience Proceedings are developed for each year of the event, documenting the diversity of the research and engagements within these topical areas. == Impacts for the future == Since its inception, the Resilience Week community has evolved from one that primarily included only university researchers to one that includes many government laboratories, universities and private industries in the US and internationally. This type of collaboration forms a feedback loop that informs the research with the current needs and hones best practices. The future of the event is to further advance discussions that advance investment, recognize priorities and expedite technologies and tools to proactively address our energy future, in light of the natural and manmade challenges, and rationalizing the complex relationships that exist in critical infrastructure.

Dynamic topic model

Within statistics, Dynamic topic models' are generative models that can be used to analyze the evolution of (unobserved) topics of a collection of documents over time. This family of models was proposed by David Blei and John Lafferty and is an extension to Latent Dirichlet Allocation (LDA) that can handle sequential documents. In LDA, both the order the words appear in a document and the order the documents appear in the corpus are oblivious to the model. Whereas words are still assumed to be exchangeable, in a dynamic topic model the order of the documents plays a fundamental role. More precisely, the documents are grouped by time slice (e.g.: years) and it is assumed that the documents of each group come from a set of topics that evolved from the set of the previous slice. == Topics == Similarly to LDA and pLSA, in a dynamic topic model, each document is viewed as a mixture of unobserved topics. Furthermore, each topic defines a multinomial distribution over a set of terms. Thus, for each word of each document, a topic is drawn from the mixture and a term is subsequently drawn from the multinomial distribution corresponding to that topic. The topics, however, evolve over time. For instance, the two most likely terms of a topic at time t could be "network" and "Zipf" (in descending order) while the most likely ones at time t+1 could be "Zipf" and "percolation" (in descending order). == Model == Define α t {\displaystyle \alpha _{t}} as the per-document topic distribution at time t. β t , k {\displaystyle \beta _{t,k}} as the word distribution of topic k at time t. η t , d {\displaystyle \eta _{t,d}} as the topic distribution for document d in time t, z t , d , n {\displaystyle z_{t,d,n}} as the topic for the nth word in document d in time t, and w t , d , n {\displaystyle w_{t,d,n}} as the specific word. In this model, the multinomial distributions α t + 1 {\displaystyle \alpha _{t+1}} and β t + 1 , k {\displaystyle \beta _{t+1,k}} are generated from α t {\displaystyle \alpha _{t}} and β t , k {\displaystyle \beta _{t,k}} , respectively. Even though multinomial distributions are usually written in terms of the mean parameters, representing them in terms of the natural parameters is better in the context of dynamic topic models. The former representation has some disadvantages due to the fact that the parameters are constrained to be non-negative and sum to one. When defining the evolution of these distributions, one would need to assure that such constraints were satisfied. Since both distributions are in the exponential family, one solution to this problem is to represent them in terms of the natural parameters, that can assume any real value and can be individually changed. Using the natural parameterization, the dynamics of the topic model are given by β t , k | β t − 1 , k ∼ N ( β t − 1 , k , σ 2 I ) {\displaystyle \beta _{t,k}|\beta _{t-1,k}\sim N(\beta _{t-1,k},\sigma ^{2}I)} and α t | α t − 1 ∼ N ( α t − 1 , δ 2 I ) {\displaystyle \alpha _{t}|\alpha _{t-1}\sim N(\alpha _{t-1},\delta ^{2}I)} . The generative process at time slice 't' is therefore: Draw topics β t , k | β t − 1 , k ∼ N ( β t − 1 , k , σ 2 I ) ∀ k {\displaystyle \beta _{t,k}|\beta _{t-1,k}\sim N(\beta _{t-1,k},\sigma ^{2}I)\forall k} Draw mixture model α t | α t − 1 ∼ N ( α t − 1 , δ 2 I ) {\displaystyle \alpha _{t}|\alpha _{t-1}\sim N(\alpha _{t-1},\delta ^{2}I)} For each document: Draw η t , d ∼ N ( α t , a 2 I ) {\displaystyle \eta _{t,d}\sim N(\alpha _{t},a^{2}I)} For each word: Draw topic Z t , d , n ∼ Mult ( π ( η t , d ) ) {\displaystyle Z_{t,d,n}\sim {\textrm {Mult}}(\pi (\eta _{t,d}))} Draw word W t , d , n ∼ Mult ( π ( β t , Z t , d , n ) ) {\displaystyle W_{t,d,n}\sim {\textrm {Mult}}(\pi (\beta _{t,Z_{t,d,n}}))} where π ( x ) {\displaystyle \pi (x)} is a mapping from the natural parameterization x to the mean parameterization, namely π ( x i ) = exp ⁡ ( x i ) ∑ i exp ⁡ ( x i ) {\displaystyle \pi (x_{i})={\frac {\exp(x_{i})}{\sum _{i}\exp(x_{i})}}} . == Inference == In the dynamic topic model, only W t , d , n {\displaystyle W_{t,d,n}} is observable. Learning the other parameters constitutes an inference problem. Blei and Lafferty argue that applying Gibbs sampling to do inference in this model is more difficult than in static models, due to the nonconjugacy of the Gaussian and multinomial distributions. They propose the use of variational methods, in particular, the Variational Kalman Filtering and the Variational Wavelet Regression. == Applications == In the original paper, a dynamic topic model is applied to the corpus of Science articles published between 1881 and 1999 aiming to show that this method can be used to analyze the trends of word usage inside topics. The authors also show that the model trained with past documents is able to fit documents of an incoming year better than LDA. A continuous dynamic topic model was developed by Wang et al. and applied to predict the timestamp of documents. Going beyond text documents, dynamic topic models were used to study musical influence, by learning musical topics and how they evolve in recent history.

Noam Slonim

Noam Slonim (Hebrew: נעם סלונים; born in Jerusalem) is an Israeli computer scientist, specializing in Natural Language Processing and the application of Large language models. He is a Research Scientist at Google Research Israel (since September 2025) and formerly an IBM Distinguished Engineer. He founded and served as Principal Investigator of Project Debater and led Language Model Utilization at IBM Research. Beyond his scientific achievements, Slonim had a writing and media career. He was a writer for Season 4 of The Cameric Five TV comedy show, published a weekly column in Haaretz on brain science, and co-created and wrote the Israeli sitcom Puzzle. He was also the head writer for Seasons 2 and 3 of the sitcom Ha-movilim and featured in the 2020 documentary The Debater. In October 2025, his debut novel, Questionable Memories, was published by Kinneret Publishing Group. == Education and research interests == Slonim graduated from the Hebrew University of Jerusalem in 1996 with a B.S. degree in Computer Science, Physics, and Mathematics. In 2002 he completed Ph.D. summa cum laude at the Interdisciplinary Center for Neural Computation at the Hebrew University, under the supervision of Professor Naftali Tishby. His thesis focused on the theory and applications of the Information Bottleneck method. From 2003 till 2006 he did post-doctoral studies at the Lewis-Sigler Institute for Integrative Genomics at Princeton University, working with Professor Bill Bialek and Professor Saeed Tavazoie. He joined IBM Research in 2007. Slonim holds over 30 patents (granted or pending) and has co-authored more than 100 scientific publications. In 2025, he joined Google Research Israel as a research scientist. == Research activities == From 1998 to 2003 he worked on the theory and applications of the Information Bottleneck method, suggesting various cluster analysis algorithms inspired by this method, and demonstrating the practical value of these algorithms on various domains. From 2003 to 2006 he worked on developing Machine Learning algorithms that rely on Information Theory concepts, and applied these algorithms to the analysis of various types of Genomics data. In 2011 he proposed to develop the first Artificial Intelligence system that can meaningfully participate in a full live debate with an expert human debater. This work gave rise to Project Debater, that debated expert human debaters in several live events during 2018 and 2019. In 2020, Slonim delivered the opening keynote at the EMNLP conference, describing the IBM Research work on developing Project Debater. From 2022 to 2025, he led IBM Research efforts applying large language models to practical use cases; in 2025 he moved to Google Research Israel as a Research Scientist. == Writing and video career == In 1996 Slonim was a writer for Season 4 of The Cameric Five TV comedy show. In 1997–1998 he published a weekly column in Haaretz newspaper, focused on brain science research. In 1997–1999 he co-created and co-wrote the Israeli sitcom, Puzzle. In 2008–2010 he was the head writer of Season 2 and Season 3 of the Israeli Sitcom, Ha-movilim. In 2020 he was featured in the documentary The Debater, an official selection of the 2020 Copenhagen International Documentary Film Festival. In 2025, his debut novel, Questionable Memories, was published by Kinneret Publishing Group.

Scott Fahlman

Scott Elliott Fahlman (born March 21, 1948) is an American computer scientist and Professor Emeritus at Carnegie Mellon University's Language Technologies Institute and Computer Science Department. He is notable for early work on automated planning and scheduling in a blocks world, on semantic networks, on neural networks (especially the cascade correlation algorithm), on the programming languages Dylan, and Common Lisp (especially CMU Common Lisp), and he was one of the founders of Lucid Inc. During the period when it was standardized, he was recognized as "the leader of Common Lisp." From 2006 to 2015, Fahlman was engaged in developing a knowledge base named Scone, based in part on his thesis work on the NETL Semantic Network. He also is credited with coining the use of the emoticon. == Life and career == Fahlman was born in Medina, Ohio, the son of Lorna May (Dean) and John Emil Fahlman. He attended the Massachusetts Institute of Technology (MIT), where he received a Bachelor of Science (B.S.) and Master of Science (M.S.) degree in electrical engineering and computer science in 1973, and a Doctor of Philosophy (Ph.D.) in artificial intelligence in 1977. He has noted that his doctoral diploma says the degree was awarded for "original research as demonstrated by a thesis in the field of Artificial Intelligence" and suggested that it may be the first doctorate to use that term. He is a fellow of the American Association for Artificial Intelligence. Fahlman acted as thesis advisor for Donald Cohen, David B. McDonald, David S. Touretzky, Skef Wholey, Justin Boyan, Michael Witbrock, and Alicia Tribble Sagae. From May 1996 to July 2001, Fahlman directed the Justsystem Pittsburgh Research Center. === Boltzmann Machine (1983) === In 1983, Fahlman, Geoffrey Hinton, and Terry Sejnowski published a paper in Proceedings of the AAAI-83 Conference, Washington DC, August 1983. The paper was titled as "Massively Parallel Architectures for AI: NETL, Thistle and Boltzmann Machines". === Emoticons === Fahlman was not the first to suggest the concept of the emoticon – a similar concept for a marker appeared in an article of Reader's Digest in May 1967, although that idea was never put into practice. In an interview printed in The New York Times in 1969, Vladimir Nabokov noted: "I often think there should exist a special typographical sign for a smile – some sort of concave mark, a supine round bracket." Fahlman is credited with originating the first smiley emoticon, which he thought would help people on a message board at Carnegie Mellon to distinguish serious posts from jokes. He proposed the use of :-) and :-( for this purpose, and the symbols caught on. The original message from which these symbols originated was posted on 19 September 1982. The message was recovered by Jeff Baird on 10 September 2002 and read: 19-Sep-82 11:44 Scott E Fahlman :-) From: Scott E Fahlman I propose that the following character sequence for joke markers: :-) Read it sideways. Actually, it is probably more economical to mark things that are NOT jokes, given current trends. For this, use :-(

Coalition for App Fairness

The Coalition for App Fairness (CAF) is a coalition comprised by companies, who aim to reach a fairer deal for the inclusion of their apps into the Apple App Store or the Google Play Store. The organization's executive director is Meghan DiMuzio and its headquarters are located in Washington, D.C. == Background == In July 2015, Spotify launched an email campaign to urge its App Store subscribers to cancel their subscriptions and start new ones through its website, bypassing the 30% transaction fee for in-app purchases required for iOS applications by technology company Apple Inc. A later update to the Spotify app on iOS was rejected by Apple, prompting Spotify's general counsel Horacio Gutierrez to write a letter to Apple's then-general counsel Bruce Sewell, stating: "This latest episode raises serious concerns under both U.S. and EU competition law. It continues a troubling pattern of behavior by Apple to exclude and diminish the competitiveness of Spotify on iOS and as a rival to Apple Music, particularly when seen against the backdrop of Apple's previous anticompetitive conduct aimed at Spotify … we cannot stand by as Apple uses the App Store approval process as a weapon to harm competitors." In August 2020, Epic Games updated their Fortnite Battle Royale game app on both Apple's App Store and Google's Google Play to include its own storefront that offered a 20% discount on V-Bucks, the in-game currency, if players bought through there rather than through the app stores' storefront, both which take a 30% revenue cut of the sale. Both Apple and Google removed the Fortnite app within hours, as this alternate storefront violated their terms of use that required all in-app purchases to be made through their storefronts. Epic immediately filed lawsuits against both companies challenging their storefront policies on antitrust principles, arguing that their non-negotiable 30% revenue cut is too high and the restrictions against alternate storefronts anticompetitive. Apple countersued Epic over its behavior, leading to a highly publicized 2021 bench trial. Ultimately, Epic largely lost its lawsuit against Apple, though the court did order Apple to allow developers to point users to alternative payment methods. Conversely, Epic won its antitrust lawsuit against Google in late 2023. == Foundation == On 24 September 2020, Epic Games joined forces with thirteen other prominent companies—including the music streaming platform Spotify, Tinder owner Match Group, the encrypted mail service Proton Mail, and the crypto currency website Blockchain.com—to establish the Coalition for App Fairness. It also includes Basecamp. The coalition criticizes the fact that for now the app stores of both Apple and Google charge their clients a 30% fee on any purchases made over their stores. Apple and Google defended themselves by arguing that the 30% transaction fee is a standard in the industry while the Coalition for App Fairness states that there is no other transaction fee which is even close to the 30%. In October 2020, it was reported that the coalition grew from 13 to 40 members since its foundation and received more than 400 applications for membership. In October 2025, X (formerly Twitter) joined CAF. This was seen as a larger pushback in the industry against Apple and Google, and a step towards hopefully passing the Bipartisan Open App Markets Act. == Aims == The group has broadened their demands for the app stores and now also aim for a better treatment for the apps available in the App Store. They claim that Apple favors its own services before other services available on the market and unjustifiably excludes other apps from their App Store. The group has also been viewing other transaction fees like the 5% fee which is charged by credit card companies, and states that Apple charges up to 600% more and would like the 30% fee, which was only included in 2011 by Apple, adapted to a comparable percentage that charge other providers of payment solutions. Its demands are mainly directed at Apple's strict control over its App Store, but to a lesser extent are also directed towards Google. Google allows apps to be downloaded over an independent web link or also another App Store, such as the Epic Game App Store. The organization emphasizes that no app developer should come into the position in which they are discriminated and are not granted the same rights as to the developers of the owner of the app store. == Reactions == In October 2020, Microsoft presented a new framework concerning the access to its Windows 10 operating system by app stores other than the one offered by Microsoft. The new framework is based on the demands of the Coalition for App Fairness. Microsoft emphasized though, that these principles would not apply to the Xbox. In December 2020, Apple announced that they would be lowering the revenue cut Apple takes for app developers making $1M or less from 30% to 15% if app developers fill out an application for the lowered revenue cut. In March 2021, Google followed suit by also lowering the revenue cut from the Play Store from 30% to 15% for the first million in revenue earned by a developer each year. == Notable members == Members listed are notable companies listed as members the groups website: Blockchain.com Deezer Epic Games European Digital SME Alliance Fanfix Life360 Masimo Nium Proton Mail Spotify TapTap Threema Vipps

Lenhart Schubert

Lenhart Karl Otto Schubert is a professor of Computer Science at the University of Rochester, as well as a member of the Center for Language Sciences and the Center for Computation and the Brain. Schubert is a prominent researcher in the field of common sense reasoning. == Biography == Schubert received his Ph.D. from the University of Toronto in 1970. He was on the faculty of the University of Alberta between 1973 and 1988 and joined the faculty at the University of Rochester in 1988. He was elected fellow of Association for Advancement of Artificial Intelligence in 1993 for "fundamental contributions in NLP, esp. in the formalization, representation, and practical implementation of non-first order concepts".