Source author record

Krishnendu Chatterjee

Krishnendu Chatterjee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Science and Game Theory Logic in Computer Science Formal Languages and Automata Theory Artificial Intelligence Data Structures and Algorithms Programming Languages Populations and Evolution Systems and Control Machine Learning Software Engineering Cryptography and Security Discrete Mathematics math.OC Biological Physics Computational Engineering, Finance, and Science math.PR nlin.CG physics.soc-ph

Catalog footprint

What is connected

108works

18topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Automated Approach for Solving Infinite-state Polynomial Reachability Games

Reachability games are two-player games played on a graph, where the objective of $\texttt{REACH}$ player is to reach the target set whereas the objective of $\texttt{SAFE}$ player is to stay away from the target set. Reachability games have important applications in artificial intelligence and reactive synthesis, and many of these applications give rise to infinite-state reachability games. In this paper, we study turn-based reachability games on infinite-state graphs defined over valuations of a finite set of real variables. We consider the problem of determining the existence of and computing a winning strategy for $\texttt{REACH}$ player. Our contributions are twofold. First, we propose ranking certificates for reachability games, a sound and complete proof rule for proving that $\texttt{REACH}$ player has a winning strategy from the specified initial state. Second, we consider polynomial reachability games, where transitions and objectives are described by polynomial constraints over real variables, and propose a fully automated algorithm for computing a winning strategy for $\texttt{REACH}$ player together with a formal correctness witness in the form of a ranking certificate. The algorithm is sound, semi-complete, and runs in sub-exponential time. Our experiments demonstrate the ability of our method to solve challenging examples from the literature that were out of the reach of existing methods. Specifically, for the classical Cinderella-Stepmother game, we are able to compute an optimal winning strategy for an arbitrary precision parameter for the first time.

preprint2026arXiv

Genotype specificity and spatial arrangement govern the direction and magnitude of selection in variable environments

Spatial environmental variation can either amplify or suppress the fixation of beneficial mutants in structured populations, yet the interplay of ecological factors and spatial structure in determining which outcome occurs remains theoretically unresolved. Here, we develop a unified framework for selection on lattice graphs with environmental heterogeneity, in which mutant and resident fitness depend on the local environmental state. Across three common classes of genotype-environment interactions and a wide range of spatial arrangements of environmental states, we identify two governing principles. Genotype specificity determines the direction of the effect: heterogeneity amplifies selection when it modulates resident fitness, but suppresses selection when it modulates mutant fitness, with genotype-symmetric modulation producing weaker amplification. Spatial arrangement determines the magnitude: intermixed versus clustered environments tune the strength of amplification or suppression without reversing the direction of the effect. Together, these principles reconcile disparate theoretical results and provide predictive criteria for adaptation in heterogeneous landscapes, from microbial communities to somatic evolution and cancer.

preprint2026arXiv

Multi-Environment POMDPs with Finite-Horizon Objectives

Partially Observable Markov Decision Processes (POMDPs) are systems in which one agent interacts with a stochastic environment, and receives only partial information about the current state. In a multi-environment POMDP (MEPOMDP), the initial state is unknown, and assumed to be adversarially chosen. In this work we focus on computing the optimal value and policy in MEPOMDPs with finite-horizon objectives. That problem is known to be PSPACE-complete in POMDPs. Our main results are as follows: (1) we establish that it is also PSPACE-complete in the more general setting of MEPOMDPs; (2) we present a practical algorithm and evaluate it on classical benchmarks, significantly outperforming the only previously known algorithm.

preprint2022arXiv

Evolutionary instability of selfish learning in repeated games

Across many domains of interaction, both natural and artificial, individuals use past experience to shape future behaviors. The results of such learning processes depend on what individuals wish to maximize. A natural objective is one's own success. However, when two such "selfish" learners interact with each other, the outcome can be detrimental to both, especially when there are conflicts of interest. Here, we explore how a learner can align incentives with a selfish opponent. Moreover, we consider the dynamics that arise when learning rules themselves are subject to evolutionary pressure. By combining extensive simulations and analytical techniques, we demonstrate that selfish learning is unstable in most classical two-player repeated games. If evolution operates on the level of long-run payoffs, selection instead favors learning rules that incorporate social (other-regarding) preferences. To further corroborate these results, we analyze data from a repeated prisoner's dilemma experiment. We find that selfish learning is insufficient to explain human behavior when there is a trade-off between payoff maximization and fairness.

preprint2022arXiv

Learning Stabilizing Policies in Stochastic Control Systems

In this work, we address the problem of learning provably stable neural network policies for stochastic control systems. While recent work has demonstrated the feasibility of certifying given policies using martingale theory, the problem of how to learn such policies is little explored. Here, we study the effectiveness of jointly learning a policy together with a martingale certificate that proves its stability using a single learning algorithm. We observe that the joint optimization problem becomes easily stuck in local minima when starting from a randomly initialized policy. Our results suggest that some form of pre-training of the policy is required for the joint optimization to repair and verify the policy successfully.

preprint2021arXiv

On Satisficing in Quantitative Games

Several problems in planning and reactive synthesis can be reduced to the analysis of two-player quantitative graph games. {\em Optimization} is one form of analysis. We argue that in many cases it may be better to replace the optimization problem with the {\em satisficing problem}, where instead of searching for optimal solutions, the goal is to search for solutions that adhere to a given threshold bound. This work defines and investigates the satisficing problem on a two-player graph game with the discounted-sum cost model. We show that while the satisficing problem can be solved using numerical methods just like the optimization problem, this approach does not render compelling benefits over optimization. When the discount factor is, however, an integer, we present another approach to satisficing, which is purely based on automata methods. We show that this approach is algorithmically more performant -- both theoretically and empirically -- and demonstrates the broader applicability of satisficing overoptimization.

preprint2020arXiv

Approximating Values of Generalized-Reachability Stochastic Games

Simple stochastic games are turn-based 2.5-player games with a reachability objective. The basic question asks whether one player can ensure reaching a given target with at least a given probability. A natural extension is games with a conjunction of such conditions as objective. Despite a plethora of recent results on the analysis of systems with multiple objectives, the decidability of this basic problem remains open. In this paper, we present an algorithm approximating the Pareto frontier of the achievable values to a given precision. Moreover, it is an anytime algorithm, meaning it can be stopped at any time returning the current approximation and its error bound.

preprint2020arXiv

Concentration-Bound Analysis for Probabilistic Programs and Probabilistic Recurrence Relations

Analyzing probabilistic programs and randomized algorithms are classical problems in computer science. The first basic problem in the analysis of stochastic processes is to consider the expectation or mean, and another basic problem is to consider concentration bounds, i.e. showing that large deviations from the mean have small probability. Similarly, in the context of probabilistic programs and randomized algorithms, the analysis of expected termination time/running time and their concentration bounds are fundamental problems.In this work, we focus on concentration bounds for probabilistic programs and probabilistic recurrences of randomized algorithms. For probabilistic programs, the basic technique to achieve concentration bounds is to consider martingales and apply the classical Azuma's inequality. For probabilistic recurrences of randomized algorithms, Karp's classical "cookbook" method, which is similar to the master theorem for recurrences, is the standard approach to obtain concentration bounds. In this work, we propose a novel approach for deriving concentration bounds for probabilistic programs and probabilistic recurrence relations through the synthesis of exponential supermartingales. For probabilistic programs, we present algorithms for synthesis of such supermartingales in several cases. We also show that our approach can derive better concentration bounds than simply applying the classical Azuma's inequality over various probabilistic programs considered in the literature. For probabilistic recurrences, our approach can derive tighter bounds than the Karp's well-established methods on classical algorithms. Moreover, we show that our approach could derive bounds comparable to the optimal bound for quicksort, proposed by McDiarmid and Hayward. We also present a prototype implementation that can automatically infer these bounds

preprint2020arXiv

Faster Algorithms for Quantitative Analysis of Markov Chains and Markov Decision Processes with Small Treewidth

Discrete-time Markov Chains (MCs) and Markov Decision Processes (MDPs) are two standard formalisms in system analysis. Their main associated quantitative objectives are hitting probabilities, discounted sum, and mean payoff. Although there are many techniques for computing these objectives in general MCs/MDPs, they have not been thoroughly studied in terms of parameterized algorithms, particularly when treewidth is used as the parameter. This is in sharp contrast to qualitative objectives for MCs, MDPs and graph games, for which treewidth-based algorithms yield significant complexity improvements. In this work, we show that treewidth can also be used to obtain faster algorithms for the quantitative problems. For an MC with $n$ states and $m$ transitions, we show that each of the classical quantitative objectives can be computed in $O((n+m)\cdot t^2)$ time, given a tree decomposition of the MC that has width $t$. Our results also imply a bound of $O(κ\cdot (n+m)\cdot t^2)$ for each objective on MDPs, where $κ$ is the number of strategy-iteration refinements required for the given input and objective. Finally, we make an experimental evaluation of our new algorithms on low-treewidth MCs and MDPs obtained from the DaCapo benchmark suite. Our experimental results show that on MCs and MDPs with small treewidth, our algorithms outperform existing well-established methods by one or more orders of magnitude.

preprint2020arXiv

Inductive Reachability Witnesses

In this work, we consider the fundamental problem of reachability analysis over imperative programs with real variables. The reachability property requires that a program can reach certain target states during its execution. Previous works that tackle reachability analysis are either unable to handle programs consisting of general loops (e.g. symbolic execution), or lack completeness guarantees (e.g. abstract interpretation), or are not automated (e.g. incorrectness logic/reverse Hoare logic). In contrast, we propose a novel approach for reachability analysis that can handle general programs, is (semi-)complete, and can be entirely automated for a wide family of programs. Our approach extends techniques from both invariant generation and ranking-function synthesis to reachability analysis through the notion of (Universal) Inductive Reachability Witnesses (IRWs/UIRWs). While traditional invariant generation uses over-approximations of reachable states, we consider the natural dual problem of under-approximating the set of program states that can reach a target state. We then apply an argument similar to ranking functions to ensure that all states in our under-approximation can indeed reach the target set in finitely many steps.

preprint2020arXiv

Infection dynamics of COVID-19 virus under lockdown and reopening

Motivated by COVID-19, we develop and analyze a simple stochastic model for a disease spread in human population. We track how the number of infected and critically ill people develops over time in order to estimate the demand that is imposed on the hospital system. To keep this demand under control, we consider a class of simple policies for slowing down and reopening the society and we compare their efficiency in mitigating the spread of the virus from several different points of view. We find that in order to avoid overwhelming of the hospital system, a policy must impose a harsh lockdown or it must react swiftly (or both). While reacting swiftly is universally beneficial, being harsh pays off only when the country is patient about reopening and when the neighboring countries coordinate their mitigation efforts. Our work highlights the importance of acting decisively when closing down and the importance of patience and coordination between neighboring countries when reopening.

preprint2020arXiv

Multi-dimensional Long-Run Average Problems for Vector Addition Systems with States

A vector addition system with states (VASS) consists of a finite set of states and counters. A transition changes the current state to the next state, and every counter is either incremented, or decremented, or left unchanged. A state and value for each counter is a configuration; and a computation is an infinite sequence of configurations with transitions between successive configurations. A probabilistic VASS consists of a VASS along with a probability distribution over the transitions for each state. Qualitative properties such as state and configuration reachability have been widely studied for VASS. In this work we consider multi-dimensional long-run average objectives for VASS and probabilistic VASS. For a counter, the cost of a configuration is the value of the counter; and the long-run average value of a computation for the counter is the long-run average of the costs of the configurations in the computation. The multi-dimensional long-run average problem given a VASS and a threshold value for each counter, asks whether there is a computation such that for each counter the long-run average value for the counter does not exceed the respective threshold. For probabilistic VASS, instead of the existence of a computation, we consider whether the expected long-run average value for each counter does not exceed the respective threshold. Our main results are as follows: we show that the multi-dimensional long-run average problem (a) is NP-complete for integer-valued VASS; (b) is undecidable for natural-valued VASS (i.e., nonnegative counters); and (c) can be solved in polynomial time for probabilistic integer-valued VASS, and probabilistic natural-valued VASS when all computations are non-terminating.

preprint2020arXiv

Optimal and Perfectly Parallel Algorithms for On-demand Data-flow Analysis

Interprocedural data-flow analyses form an expressive and useful paradigm of numerous static analysis applications, such as live variables analysis, alias analysis and null pointers analysis. The most widely-used framework for interprocedural data-flow analysis is IFDS, which encompasses distributive data-flow functions over a finite domain. On-demand data-flow analyses restrict the focus of the analysis on specific program locations and data facts. This setting provides a natural split between (i) an offline (or preprocessing) phase, where the program is partially analyzed and analysis summaries are created, and (ii) an online (or query) phase, where analysis queries arrive on demand and the summaries are used to speed up answering queries. In this work, we consider on-demand IFDS analyses where the queries concern program locations of the same procedure (aka same-context queries). We exploit the fact that flow graphs of programs have low treewidth to develop faster algorithms that are space and time optimal for many common data-flow analyses, in both the preprocessing and the query phase. We also use treewidth to develop query solutions that are embarrassingly parallelizable, i.e. the total work for answering each query is split to a number of threads such that each thread performs only a constant amount of work. Finally, we implement a static analyzer based on our algorithms, and perform a series of on-demand analysis experiments on standard benchmarks. Our experimental results show a drastic speed-up of the queries after only a lightweight preprocessing phase, which significantly outperforms existing techniques.

preprint2020arXiv

Polynomial Invariant Generation for Non-deterministic Recursive Programs

We consider the classical problem of invariant generation for programs with polynomial assignments and focus on synthesizing invariants that are a conjunction of strict polynomial inequalities. We present a sound and semi-complete method based on positivstellensaetze, i.e. theorems in semi-algebraic geometry that characterize positive polynomials over a semi-algebraic set. To the best of our knowledge, this is the first invariant generation method to provide completeness guarantees for invariants consisting of polynomial inequalities. Moreover, on the theoretical side, the worst-case complexity of our approach is subexponential, whereas the worst-case complexity of the previously-known complete method (Colon et al, CAV 2003), which could only handle linear invariants, is exponential. On the practical side, we reduce the invariant generation problem to quadratic programming (QCLP), which is a classical optimization problem with many industrial solvers. Finally, we demonstrate the applicability of our approach by providing experimental results on several academic benchmarks.

preprint2020arXiv

Reinforcement Learning of Risk-Constrained Policies in Markov Decision Processes

Markov decision processes (MDPs) are the defacto frame-work for sequential decision making in the presence ofstochastic uncertainty. A classical optimization criterion forMDPs is to maximize the expected discounted-sum pay-off, which ignores low probability catastrophic events withhighly negative impact on the system. On the other hand,risk-averse policies require the probability of undesirableevents to be below a given threshold, but they do not accountfor optimization of the expected payoff. We consider MDPswith discounted-sum payoff with failure states which repre-sent catastrophic outcomes. The objective of risk-constrainedplanning is to maximize the expected discounted-sum payoffamong risk-averse policies that ensure the probability to en-counter a failure state is below a desired threshold. Our maincontribution is an efficient risk-constrained planning algo-rithm that combines UCT-like search with a predictor learnedthrough interaction with the MDP (in the style of AlphaZero)and with a risk-constrained action selection via linear pro-gramming. We demonstrate the effectiveness of our approachwith experiments on classical MDPs from the literature, in-cluding benchmarks with an order of 10^6 states.

preprint2020arXiv

Simplified Game of Life: Algorithms and Complexity

Game of Life is a simple and elegant model to study dynamical system over networks. The model consists of a graph where every vertex has one of two types, namely, dead or alive. A configuration is a mapping of the vertices to the types. An update rule describes how the type of a vertex is updated given the types of its neighbors. In every round, all vertices are updated synchronously, which leads to a configuration update. While in general, Game of Life allows a broad range of update rules, we focus on two simple families of update rules, namely, underpopulation and overpopulation, that model several interesting dynamics studied in the literature. In both settings, a dead vertex requires at least a desired number of live neighbors to become alive. For underpopulation (resp., overpopulation), a live vertex requires at least (resp. at most) a desired number of live neighbors to remain alive. We study the basic computation problems, e.g., configuration reachability, for these two families of rules. For underpopulation rules, we show that these problems can be solved in polynomial time, whereas for overpopulation rules they are PSPACE-complete.

preprint2020arXiv

Stochastic Games with Lexicographic Reachability-Safety Objectives

We study turn-based stochastic zero-sum games with lexicographic preferences over reachability and safety objectives. Stochastic games are standard models in control, verification, and synthesis of stochastic reactive systems that exhibit both randomness as well as angelic and demonic non-determinism. Lexicographic order allows to consider multiple objectives with a strict preference order over the satisfaction of the objectives. To the best of our knowledge, stochastic games with lexicographic objectives have not been studied before. We establish determinacy of such games and present strategy and computational complexity results. For strategy complexity, we show that lexicographically optimal strategies exist that are deterministic and memory is only required to remember the already satisfied and violated objectives. For a constant number of objectives, we show that the relevant decision problem is in NP $\cap$ coNP, matching the current known bound for single objectives; and in general the decision problem is PSPACE-hard and can be solved in NEXPTIME $\cap$ coNEXPTIME. We present an algorithm that computes the lexicographically optimal strategies via a reduction to computation of optimal strategies in a sequence of single-objectives games. We have implemented our algorithm and report experimental results on various case studies.

preprint2020arXiv

The Moran process on 2-chromatic graphs

Resources are rarely distributed uniformly within a population. Heterogeneity in the concentration of a drug, the quality of breeding sites, or wealth can all affect evolutionary dynamics. In this study, we represent a collection of properties affecting the fitness at a given location using a color. A green node is rich in resources while a red node is poorer. More colors can represent a broader spectrum of resource qualities. For a population evolving according to the birth-death Moran model, the first question we address is which structures, identified by graph connectivity and graph coloring, are evolutionarily equivalent. We prove that all properly two-colored, undirected, regular graphs are evolutionarily equivalent (where "properly colored" means that no two neighbors have the same color). We then compare the effects of background heterogeneity on properly two-colored graphs to those with alternative schemes in which the colors are permuted. Finally, we discuss dynamic coloring as a model for spatiotemporal resource fluctuations, and we illustrate that random dynamic colorings often diminish the effects of background heterogeneity relative to a proper two-coloring.

preprint2019arXiv

Limits on amplifiers of natural selection under death-Birth updating

The fixation probability of a single mutant invading a population of residents is among the most widely-studied quantities in evolutionary dynamics. Amplifiers of natural selection are population structures that increase the fixation probability of advantageous mutants, compared to well-mixed populations. Extensive studies have shown that many amplifiers exist for the Birth-death Moran process, some of them substantially increasing the fixation probability or even guaranteeing fixation in the limit of large population size. On the other hand, no amplifiers are known for the death-Birth Moran process, and computer-assisted exhaustive searches have failed to discover amplification. In this work we resolve this disparity, by showing that any amplification under death-Birth updating is necessarily \emph{bounded} and \emph{transient}. Our boundedness result states that even if a population structure does amplify selection, the resulting fixation probability is close to that of the well-mixed population. Our transience result states that for any population structure there exists a threshold $r^*$ such that the population structure ceases to amplify selection if the mutant fitness advantage $r$ is larger than $r^\star$. Finally, we also extend the above results to $δ$-death-Birth updating, which is a combination of Birth-death and death-Birth updating. On the positive side, we identify population structures that maintain amplification for a wide range of values $r$ and $δ$. These results demonstrate that amplification of natural selection depends on the specific mechanisms of the evolutionary process.

preprint2017arXiv

Faster Algorithms for Weighted Recursive State Machines

Pushdown systems (PDSs) and recursive state machines (RSMs), which are linearly equivalent, are standard models for interprocedural analysis. Yet RSMs are more convenient as they (a) explicitly model function calls and returns, and (b) specify many natural parameters for algorithmic analysis, e.g., the number of entries and exits. We consider a general framework where RSM transitions are labeled from a semiring and path properties are algebraic with semiring operations, which can model, e.g., interprocedural reachability and dataflow analysis problems. Our main contributions are new algorithms for several fundamental problems. As compared to a direct translation of RSMs to PDSs and the best-known existing bounds of PDSs, our analysis algorithm improves the complexity for finite-height semirings (that subsumes reachability and standard dataflow properties). We further consider the problem of extracting distance values from the representation structures computed by our algorithm, and give efficient algorithms that distinguish the complexity of a one-time preprocessing from the complexity of each individual query. Another advantage of our algorithm is that our improvements carry over to the concurrent setting, where we improve the best-known complexity for the context-bounded analysis of concurrent RSMs. Finally, we provide a prototype implementation that gives a significant speed-up on several benchmarks from the SLAM/SDV project.

preprint2016arXiv

Computation Tree Logic for Synchronization Properties

We present a logic that extends CTL (Computation Tree Logic) with operators that express synchronization properties. A property is synchronized in a system if it holds in all paths of a certain length. The new logic is obtained by using the same path quantifiers and temporal operators as in CTL, but allowing a different order of the quantifiers. This small syntactic variation induces a logic that can express non-regular properties for which known extensions of MSO with equality of path length are undecidable. We show that our variant of CTL is decidable and that the model-checking problem is in Delta_3^P = P^{NP^NP}, and is DP-hard. We analogously consider quantifier exchange in extensions of CTL, and we present operators defined using basic operators of CTL* that express the occurrence of infinitely many synchronization points. We show that the model-checking problem remains in Delta_3^P. The distinguishing power of CTL and of our new logic coincide if the Next operator is allowed in the logics, thus the classical bisimulation quotient can be used for state-space reduction before model checking.

preprint2016arXiv

Conditionally Optimal Algorithms for Generalized Büchi Games

Games on graphs provide the appropriate framework to study several central problems in computer science, such as the verification and synthesis of reactive systems. One of the most basic objectives for games on graphs is the liveness (or Büchi) objective that given a target set of vertices requires that some vertex in the target set is visited infinitely often. We study generalized Büchi objectives (i.e., conjunction of liveness objectives), and implications between two generalized Büchi objectives (known as GR(1) objectives), that arise in numerous applications in computer-aided verification. We present improved algorithms and conditional super-linear lower bounds based on widely believed assumptions about the complexity of (A1) combinatorial Boolean matrix multiplication and (A2) CNF-SAT. We consider graph games with $n$ vertices, $m$ edges, and generalized Büchi objectives with $k$ conjunctions. First, we present an algorithm with running time $O(k \cdot n^2)$, improving the previously known $O(k \cdot n \cdot m)$ and $O(k^2 \cdot n^2)$ worst-case bounds. Our algorithm is optimal for dense graphs under (A1). Second, we show that the basic algorithm for the problem is optimal for sparse graphs when the target sets have constant size under (A2). Finally, we consider GR(1) objectives, with $k_1$ conjunctions in the antecedent and $k_2$ conjunctions in the consequent, and present an $O(k_1 \cdot k_2 \cdot n^{2.5})$-time algorithm, improving the previously known $O(k_1 \cdot k_2 \cdot n \cdot m)$-time algorithm for $m > n^{1.5}$.

preprint2016arXiv

Mean-Payoff Pushdown Games

Two-player games on graphs is central in many problems in formal verification and program analysis such as synthesis and verification of open systems. In this work we consider solving recursive game graphs (or pushdown game graphs) that can model the control flow of sequential programs with recursion. While pushdown games have been studied before with qualitative objectives, such as reachability and $ω$-regular objectives, in this work we study for the first time such games with the most well-studied quantitative objective, namely, mean-payoff objectives. In pushdown games two types of strategies are relevant: (1) global strategies, that depend on the entire global history; and (2) modular strategies, that have only local memory and thus does not depend on the context of invocation, but only on the history of the current invocation of the module. Our main results are as follows (1) One-player pushdown games with mean-payoff objectives under global strategies is decidable in polynomial time. (2) Two-player pushdown games with mean-payoff objectives under global strategies is undecidable. (3) One-player pushdown games with mean-payoff objectives under modular strategies is NP-hard. (4) Two-player pushdown games with mean-payoff objectives under modular strategies can be solved in NP (i.e., both one-player and two-player pushdown games with mean-payoff objectives under modular strategies is NP-complete). We also establish the optimal strategy complexity showing that global strategies for mean-payoff objectives require infinite memory even in one-player pushdown games; and memoryless modular strategies are sufficient in two-player pushdown games. Finally we also show that all the problems have the same complexity if the stack boundedness condition is added, where along with the mean-payoff objective the player must also ensure that the stack height is bounded.

preprint2016arXiv

Model and Objective Separation with Conditional Lower Bounds: Disjunction is Harder than Conjunction

Given a model of a system and an objective, the model-checking question asks whether the model satisfies the objective. We study polynomial-time problems in two classical models, graphs and Markov Decision Processes (MDPs), with respect to several fundamental $ω$-regular objectives, e.g., Rabin and Streett objectives. For many of these problems the best-known upper bounds are quadratic or cubic, yet no super-linear lower bounds are known. In this work our contributions are two-fold: First, we present several improved algorithms, and second, we present the first conditional super-linear lower bounds based on widely believed assumptions about the complexity of CNF-SAT and combinatorial Boolean matrix multiplication. A separation result for two models with respect to an objective means a conditional lower bound for one model that is strictly higher than the existing upper bound for the other model, and similarly for two objectives with respect to a model. Our results establish the following separation results: (1) A separation of models (graphs and MDPs) for disjunctive queries of reachability and Büchi objectives. (2) Two kinds of separations of objectives, both for graphs and MDPs, namely, (2a) the separation of dual objectives such as reachability/safety (for disjunctive questions) and Streett/Rabin objectives, and (2b) the separation of conjunction and disjunction of multiple objectives of the same type such as safety, Büchi, and coBüchi. In summary, our results establish the first model and objective separation results for graphs and MDPs for various classical $ω$-regular objectives. Quite strikingly, we establish conditional lower bounds for the disjunction of objectives that are strictly higher than the existing upper bounds for the conjunction of the same objectives.

preprint2016arXiv

Perfect-Information Stochastic Games with Generalized Mean-Payoff Objectives

Graph games provide the foundation for modeling and synthesizing reactive processes. In the synthesis of stochastic reactive processes, the traditional model is perfect-information stochastic games, where some transitions of the game graph are controlled by two adversarial players, and the other transitions are executed probabilistically. We consider such games where the objective is the conjunction of several quantitative objectives (specified as mean-payoff conditions), which we refer to as generalized mean-payoff objectives. The basic decision problem asks for the existence of a finite-memory strategy for a player that ensures the generalized mean-payoff objective be satisfied with a desired probability against all strategies of the opponent. A special case of the decision problem is the almost-sure problem where the desired probability is 1. Previous results presented a semi-decision procedure for epsilon-approximations of the almost-sure problem. In this work, we show that both the almost-sure problem as well as the general basic decision problem are coNP-complete, significantly improving the previous results. Moreover, we show that in the case of 1-player stochastic games, randomized memoryless strategies are sufficient and the problem can be solved in polynomial time. In contrast, in two-player stochastic games, we show that even with randomized strategies exponential memory is required in general, and present a matching exponential upper bound. We also study the basic decision problem with infinite-memory strategies and present computational complexity results for the problem. Our results are relevant in the synthesis of stochastic reactive systems with multiple quantitative requirements.

preprint2016arXiv

Robust Draws in Balanced Knockout Tournaments

Balanced knockout tournaments are ubiquitous in sports competitions and are also used in decision-making and elections. The traditional computational question, that asks to compute a draw (optimal draw) that maximizes the winning probability for a distinguished player, has received a lot of attention. Previous works consider the problem where the pairwise winning probabilities are known precisely, while we study how robust is the winning probability with respect to small errors in the pairwise winning probabilities. First, we present several illuminating examples to establish: (a)~there exist deterministic tournaments (where the pairwise winning probabilities are~0 or~1) where one optimal draw is much more robust than the other; and (b)~in general, there exist tournaments with slightly suboptimal draws that are more robust than all the optimal draws. The above examples motivate the study of the computational problem of robust draws that guarantee a specified winning probability. Second, we present a polynomial-time algorithm for approximating the robustness of a draw for sufficiently small errors in pairwise winning probabilities, and obtain that the stated computational problem is NP-complete. We also show that two natural cases of deterministic tournaments where the optimal draw could be computed in polynomial time also admit polynomial-time algorithms to compute robust optimal draws.

preprint2016arXiv

Stochastic Invariants for Probabilistic Termination

Termination is one of the basic liveness properties, and we study the termination problem for probabilistic programs with real-valued variables. Previous works focused on the qualitative problem that asks whether an input program terminates with probability~1 (almost-sure termination). A powerful approach for this qualitative problem is the notion of ranking supermartingales with respect to a given set of invariants. The quantitative problem (probabilistic termination) asks for bounds on the termination probability. A fundamental and conceptual drawback of the existing approaches to address probabilistic termination is that even though the supermartingales consider the probabilistic behavior of the programs, the invariants are obtained completely ignoring the probabilistic aspect. In this work we address the probabilistic termination problem for linear-arithmetic probabilistic programs with nondeterminism. We define the notion of {\em stochastic invariants}, which are constraints along with a probability bound that the constraints hold. We introduce a concept of {\em repulsing supermartingales}. First, we show that repulsing supermartingales can be used to obtain bounds on the probability of the stochastic invariants. Second, we show the effectiveness of repulsing supermartingales in the following three ways: (1)~With a combination of ranking and repulsing supermartingales we can compute lower bounds on the probability of termination; (2)~repulsing supermartingales provide witnesses for refutation of almost-sure termination; and (3)~with a combination of ranking and repulsing supermartingales we can establish persistence properties of probabilistic programs. We also present results on related computational problems and an experimental evaluation of our approach on academic examples.

preprint2016arXiv

Stochastic Shortest Path with Energy Constraints in POMDPs

We consider partially observable Markov decision processes (POMDPs) with a set of target states and positive integer costs associated with every transition. The traditional optimization objective (stochastic shortest path) asks to minimize the expected total cost until the target set is reached. We extend the traditional framework of POMDPs to model energy consumption, which represents a hard constraint. The energy levels may increase and decrease with transitions, and the hard constraint requires that the energy level must remain positive in all steps till the target is reached. First, we present a novel algorithm for solving POMDPs with energy levels, developing on existing POMDP solvers and using RTDP as its main method. Our second contribution is related to policy representation. For larger POMDP instances the policies computed by existing solvers are too large to be understandable. We present an automated procedure based on machine learning techniques that automatically extracts important decisions of the policy allowing us to compute succinct human readable policies. Finally, we show experimentally that our algorithm performs well and computes succinct policies on a number of POMDP instances from the literature that were naturally enhanced with energy levels.

preprint2016arXiv

Termination Analysis of Probabilistic Programs through Positivstellensatz's

We consider nondeterministic probabilistic programs with the most basic liveness property of termination. We present efficient methods for termination analysis of nondeterministic probabilistic programs with polynomial guards and assignments. Our approach is through synthesis of polynomial ranking supermartingales, that on one hand significantly generalizes linear ranking supermartingales and on the other hand is a counterpart of polynomial ranking-functions for proving termination of nonprobabilistic programs. The approach synthesizes polynomial ranking-supermartingales through Positivstellensatz's, yielding an efficient method which is not only sound, but also semi-complete over a large subclass of programs. We show experimental results to demonstrate that our approach can handle several classical programs with complex polynomial guards and assignments, and can synthesize efficient quadratic ranking-supermartingales when a linear one does not exist even for simple affine programs.

preprint2015arXiv

A Symbolic SAT-based Algorithm for Almost-sure Reachability with Small Strategies in POMDPs

POMDPs are standard models for probabilistic planning problems, where an agent interacts with an uncertain environment. We study the problem of almost-sure reachability, where given a set of target states, the question is to decide whether there is a policy to ensure that the target set is reached with probability 1 (almost-surely). While in general the problem is EXPTIME-complete, in many practical cases policies with a small amount of memory suffice. Moreover, the existing solution to the problem is explicit, which first requires to construct explicitly an exponential reduction to a belief-support MDP. In this work, we first study the existence of observation-stationary strategies, which is NP-complete, and then small-memory strategies. We present a symbolic algorithm by an efficient encoding to SAT and using a SAT solver for the problem. We report experimental results demonstrating the scalability of our symbolic (SAT-based) approach.

preprint2015arXiv

Algorithmic Analysis of Qualitative and Quantitative Termination Problems for Affine Probabilistic Programs

In this paper, we consider termination of probabilistic programs with real-valued variables. The questions concerned are: 1. qualitative ones that ask (i) whether the program terminates with probability 1 (almost-sure termination) and (ii) whether the expected termination time is finite (finite termination); 2. quantitative ones that ask (i) to approximate the expected termination time (expectation problem) and (ii) to compute a bound B such that the probability to terminate after B steps decreases exponentially (concentration problem). To solve these questions, we utilize the notion of ranking supermartingales which is a powerful approach for proving termination of probabilistic programs. In detail, we focus on algorithmic synthesis of linear ranking-supermartingales over affine probabilistic programs (APP's) with both angelic and demonic non-determinism. An important subclass of APP's is LRAPP which is defined as the class of all APP's over which a linear ranking-supermartingale exists. Our main contributions are as follows. Firstly, we show that the membership problem of LRAPP (i) can be decided in polynomial time for APP's with at most demonic non-determinism, and (ii) is NP-hard and in PSPACE for APP's with angelic non-determinism; moreover, the NP-hardness result holds already for APP's without probability and demonic non-determinism. Secondly, we show that the concentration problem over LRAPP can be solved in the same complexity as for the membership problem of LRAPP. Finally, we show that the expectation problem over LRAPP can be solved in 2EXPTIME and is PSPACE-hard even for APP's without probability and non-determinism (i.e., deterministic programs). Our experimental results demonstrate the effectiveness of our approach to answer the qualitative and quantitative questions over APP's with at most demonic non-determinism.

preprint2015arXiv

Algorithms for Algebraic Path Properties in Concurrent Systems of Constant Treewidth Components

We study algorithmic questions for concurrent systems where the transitions are labeled from a complete, closed semiring, and path properties are algebraic with semiring operations. The algebraic path properties can model dataflow analysis problems, the shortest path problem, and many other natural problems that arise in program analysis. We consider that each component of the concurrent system is a graph with constant treewidth, a property satisfied by the controlflow graphs of most programs. We allow for multiple possible queries, which arise naturally in demand driven dataflow analysis. The study of multiple queries allows us to consider the tradeoff between the resource usage of the one-time preprocessing and for each individual query. The traditional approach constructs the product graph of all components and applies the best-known graph algorithm on the product. In this approach, even the answer to a single query requires the transitive closure, which provides no room for tradeoff between preprocessing and query time. Our main contributions are algorithms that significantly improve the worst-case running time of the traditional approach, and provide various tradeoffs depending on the number of queries. For example, in a concurrent system of two components, the traditional approach requires hexic time in the worst case for answering one query as well as computing the transitive closure, whereas we show that with one-time preprocessing in almost cubic time, each subsequent query can be answered in at most linear time, and even the transitive closure can be computed in almost quartic time. Furthermore, we establish conditional optimality results showing that the worst-case running time of our algorithms cannot be improved without achieving major breakthroughs in graph algorithms.

preprint2015arXiv

Counterexample Explanation by Learning Small Strategies in Markov Decision Processes

While for deterministic systems, a counterexample to a property can simply be an error trace, counterexamples in probabilistic systems are necessarily more complex. For instance, a set of erroneous traces with a sufficient cumulative probability mass can be used. Since these are too large objects to understand and manipulate, compact representations such as subchains have been considered. In the case of probabilistic systems with non-determinism, the situation is even more complex. While a subchain for a given strategy (or scheduler, resolving non-determinism) is a straightforward choice, we take a different approach. Instead, we focus on the strategy - which can be a counterexample to violation of or a witness of satisfaction of a property - itself, and extract the most important decisions it makes, and present its succinct representation. The key tools we employ to achieve this are (1) introducing a concept of importance of a state w.r.t. the strategy, and (2) learning using decision trees. There are three main consequent advantages of our approach. Firstly, it exploits the quantitative information on states, stressing the more important decisions. Secondly, it leads to a greater variability and degree of freedom in representing the strategies. Thirdly, the representation uses a self-explanatory data structure. In summary, our approach produces more succinct and more explainable strategies, as opposed to e.g. binary decision diagrams. Finally, our experimental results show that we can extract several rules describing the strategy even for very large systems that do not fit in memory, and based on the rules explain the erroneous behaviour.

preprint2015arXiv

Faster Algorithms for Quantitative Verification in Constant Treewidth Graphs

We consider the core algorithmic problems related to verification of systems with respect to three classical quantitative properties, namely, the mean-payoff property, the ratio property, and the minimum initial credit for energy property. The algorithmic problem given a graph and a quantitative property asks to compute the optimal value (the infimum value over all traces) from every node of the graph. We consider graphs with constant treewidth, and it is well-known that the control-flow graphs of most programs have constant treewidth. Let $n$ denote the number of nodes of a graph, $m$ the number of edges (for constant treewidth graphs $m=O(n)$) and $W$ the largest absolute value of the weights. Our main theoretical results are as follows. First, for constant treewidth graphs we present an algorithm that approximates the mean-payoff value within a multiplicative factor of $ε$ in time $O(n \cdot \log (n/ε))$ and linear space, as compared to the classical algorithms that require quadratic time. Second, for the ratio property we present an algorithm that for constant treewidth graphs works in time $O(n \cdot \log (|a\cdot b|))=O(n\cdot\log (n\cdot W))$, when the output is $\frac{a}{b}$, as compared to the previously best known algorithm with running time $O(n^2 \cdot \log (n\cdot W))$. Third, for the minimum initial credit problem we show that (i) for general graphs the problem can be solved in $O(n^2\cdot m)$ time and the associated decision problem can be solved in $O(n\cdot m)$ time, improving the previous known $O(n^3\cdot m\cdot \log (n\cdot W))$ and $O(n^2 \cdot m)$ bounds, respectively; and (ii) for constant treewidth graphs we present an algorithm that requires $O(n\cdot \log n)$ time, improving the previous known $O(n^4 \cdot \log (n \cdot W))$ bound.

preprint2015arXiv

MultiGain: A controller synthesis tool for MDPs with multiple mean-payoff objectives

We present MultiGain, a tool to synthesize strategies for Markov decision processes (MDPs) with multiple mean-payoff objectives. Our models are described in PRISM, and our tool uses the existing interface and simulator of PRISM. Our tool extends PRISM by adding novel algorithms for multiple mean-payoff objectives, and also provides features such as (i)~generating strategies and exploring them for simulation, and checking them with respect to other properties; and (ii)~generating an approximate Pareto curve for two mean-payoff objectives. In addition, we present a new practical algorithm for the analysis of MDPs with multiple mean-payoff objectives under memoryless strategies.

preprint2015arXiv

Nested Weighted Automata

Recently there has been a significant effort to handle quantitative properties in formal verification and synthesis. While weighted automata over finite and infinite words provide a natural and flexible framework to express quantitative properties, perhaps surprisingly, some basic system properties such as average response time cannot be expressed using weighted automata, nor in any other know decidable formalism. In this work, we introduce nested weighted automata as a natural extension of weighted automata which makes it possible to express important quantitative properties such as average response time. In nested weighted automata, a master automaton spins off and collects results from weighted slave automata, each of which computes a quantity along a finite portion of an infinite word. Nested weighted automata can be viewed as the quantitative analogue of monitor automata, which are used in run-time verification. We establish an almost complete decidability picture for the basic decision problems about nested weighted automata, and illustrate their applicability in several domains. In particular, nested weighted automata can be used to decide average response time properties.

preprint2015arXiv

Quantitative Timed Simulation Functions and Refinement Metrics for Timed Systems (Full Version)

We introduce quantatitive timed refinement and timed simulation (directed) metrics, incorporating zenoness check s, for timed systems. These metrics assign positive real numbers between zero and infinity which quantify the \emph{timing mismatches} between two timed systems, amongst non-zeno runs. We quantify timing mismatches in three ways: (1) the maximal timing mismatch that can arise, (2) the "steady-state" maximal timing mismatches, where initial transient timing mismatches are ignored; and (3) the (long-run) average timing mismatches amongst two systems. These three kinds of mismatches constitute three important types of timing differences. Our event times are the \emph{global times}, measured from the start of the system execution, not just the time durations of individual steps. We present algorithms over timed automata for computing the three quantitative simulation distances to within any desired degree of accuracy. In order to compute the values of the quantitative simulation distances, we use a game theoretic formulation. We introduce two new kinds of objectives for two player games on finite-state game graphs: (1) eventual debit-sum level objectives, and (2) average debit-sum level objectives. We present algorithms for computing the optimal values for these objectives in graph games, and then use these algorithms to compute the values of the timed simulation distances over timed automata.

preprint2015arXiv

Strategy Complexity of Concurrent Stochastic Games with Safety and Reachability Objectives

We consider finite-state concurrent stochastic games, played by k>=2 players for an infinite number of rounds, where in every round, each player simultaneously and independently of the other players chooses an action, whereafter the successor state is determined by a probability distribution given by the current state and the chosen actions. We consider reachability objectives that given a target set of states require that some state in the target is visited, and the dual safety objectives that given a target set require that only states in the target set are visited. We are interested in the complexity of stationary strategies measured by their patience, which is defined as the inverse of the smallest nonzero probability employed. Our main results are as follows: We show that in two-player zero-sum concurrent stochastic games (with reachability objective for one player and the complementary safety objective for the other player): (i) the optimal bound on the patience of optimal and epsilon-optimal strategies, for both players is doubly exponential; and (ii) even in games with a single nonabsorbing state exponential (in the number of actions) patience is necessary. In general we study the class of non-zero-sum games admitting stationary epsilon-Nash equilibria. We show that if there is at least one player with reachability objective, then doubly-exponential patience may be needed for epsilon-Nash equilibrium strategies, whereas in contrast if all players have safety objectives, the optimal bound on patience for epsilon-Nash equilibrium strategies is only exponential.

preprint2015arXiv

Temporal Logic Control for Stochastic Linear Systems using Abstraction Refinement of Probabilistic Games

We consider the problem of computing the set of initial states of a dynamical system such that there exists a control strategy to ensure that the trajectories satisfy a temporal logic specification with probability 1 (almost-surely). We focus on discrete-time, stochastic linear dynamics and specifications given as formulas of the Generalized Reactivity(1) fragment of Linear Temporal Logic over linear predicates in the states of the system. We propose a solution based on iterative abstraction-refinement, and turn-based 2-player probabilistic games. While the theoretical guarantee of our algorithm after any finite number of iterations is only a partial solution, we show that if our algorithm terminates, then the result is the set of satisfying initial states. Moreover, for any (partial) solution our algorithm synthesizes witness control strategies to ensure almost-sure satisfaction of the temporal logic specification. We demonstrate our approach on an illustrative case study.

preprint2015arXiv

Verification of Markov Decision Processes using Learning Algorithms

We present a general framework for applying machine-learning algorithms to the verification of Markov decision processes (MDPs). The primary goal of these techniques is to improve performance by avoiding an exhaustive exploration of the state space. Our framework focuses on probabilistic reachability, which is a core property for verification, and is illustrated through two distinct instantiations. The first assumes that full knowledge of the MDP is available, and performs a heuristic-driven partial exploration of the model, yielding precise lower and upper bounds on the required probability. The second tackles the case where we may only sample the MDP, and yields probabilistic guarantees, again in terms of both the lower and upper bounds, which provides efficient stopping criteria for the approximation. The latter is the first extension of statistical model-checking for unbounded properties in MDPs. In contrast with other related approaches, we do not restrict our attention to time-bounded (finite-horizon) or discounted properties, nor assume any particular properties of the MDP. We also show how our techniques extend to LTL objectives. We present experimental results showing the performance of our framework on several examples.

preprint2014arXiv

A Framework for Automated Competitive Analysis of On-line Scheduling of Firm-Deadline Tasks

We present a flexible framework for the automated competitive analysis of on-line scheduling algorithms for firm-deadline real-time tasks based on multi-objective graphs: Given a taskset and an on-line scheduling algorithm specified as a labeled transition system, along with some optional safety, liveness, and/or limit-average constraints for the adversary, we automatically compute the competitive ratio of the algorithm w.r.t. a clairvoyant scheduler. We demonstrate the flexibility and power of our approach by comparing the competitive ratio of several on-line algorithms, including $D^{over}$, that have been proposed in the past, for various tasksets. Our experimental results reveal that none of these algorithms is universally optimal, in the sense that there are tasksets where other schedulers provide better performance. Our framework is hence a very useful design tool for selecting optimal algorithms for a given application.

preprint2014arXiv

Assume-Guarantee Synthesis for Concurrent Reactive Programs with Partial Information

Synthesis of program parts is very useful for concurrent systems. However, most synthesis approaches do not support common design tasks, like modifying a single process without having to re-synthesize or verify the whole system. Assume-guarantee synthesis (AGS) provides robustness against modifications of system parts, but thus far has been limited to the perfect information setting. This means that local variables cannot be hidden from other processes, which renders synthesis results cumbersome or even impossible to realize. We resolve this shortcoming by defining AGS in a partial information setting. We analyze the complexity and decidability in different settings, showing that the problem has a high worst-case complexity and is undecidable in many interesting cases. Based on these observations, we present a pragmatic algorithm based on bounded synthesis, and demonstrate its practical applicability on several examples.

preprint2014arXiv

Automatic Generation of Alternative Starting Positions for Simple Traditional Board Games

Simple board games, like Tic-Tac-Toe and CONNECT-4, play an important role not only in the development of mathematical and logical skills, but also in the emotional and social development. In this paper, we address the problem of generating targeted starting positions for such games. This can facilitate new approaches for bringing novice players to mastery, and also leads to discovery of interesting game variants. We present an approach that generates starting states of varying hardness levels for player~$1$ in a two-player board game, given rules of the board game, the desired number of steps required for player~$1$ to win, and the expertise levels of the two players. Our approach leverages symbolic methods and iterative simulation to efficiently search the extremely large state space. We present experimental results that include discovery of states of varying hardness levels for several simple grid-based board games. The presence of such states for standard game variants like $4 \times 4$ Tic-Tac-Toe opens up new games to be played that have never been played as the default start state is heavily biased.

preprint2014arXiv

Average Case Analysis of the Classical Algorithm for Markov Decision Processes with Büchi Objectives

We consider Markov decision processes (MDPs) with $ω$-regular specifications given as parity objectives. We consider the problem of computing the set of almost-sure winning vertices from where the objective can be ensured with probability 1. The algorithms for the computation of the almost-sure winning set for parity objectives iteratively use the solutions for the almost-sure winning set for Büchi objectives (a special case of parity objectives). We study for the first time the average case complexity of the classical algorithm for computing almost-sure winning vertices for MDPs with Büchi objectives. Our contributions are as follows: First, we show that for MDPs with constant out-degree the expected number of iterations is at most logarithmic and the average case running time is linear (as compared to the worst case linear number of iterations and quadratic time complexity). Second, we show that for general MDPs the expected number of iterations is constant and the average case running time is linear (again as compared to the worst case linear number of iterations and quadratic time complexity). Finally we also show that given all graphs are equally likely, the probability that the classical algorithm requires more than constant number of iterations is exponentially small.

preprint2014arXiv

CEGAR for Qualitative Analysis of Probabilistic Systems

We consider Markov decision processes (MDPs) which are a standard model for probabilistic systems. We focus on qualitative properties for MDPs that can express that desired behaviors of the system arise almost-surely (with probability 1) or with positive probability. We introduce a new simulation relation to capture the refinement relation of MDPs with respect to qualitative properties, and present discrete graph theoretic algorithms with quadratic complexity to compute the simulation relation. We present an automated technique for assume-guarantee style reasoning for compositional analysis of MDPs with qualitative properties by giving a counter-example guided abstraction-refinement approach to compute our new simulation relation. We have implemented our algorithms and show that the compositional analysis leads to significant improvements.

preprint2014arXiv

Faster Algorithms for Algebraic Path Properties in RSMs with Constant Treewidth

Interprocedural analysis is at the heart of numerous applications in programming languages, such as alias analysis, constant propagation, etc. Recursive state machines (RSMs) are standard models for interprocedural analysis. We consider a general framework with RSMs where the transitions are labeled from a semiring, and path properties are algebraic with semiring operations. RSMs with algebraic path properties can model interprocedural dataflow analysis problems, the shortest path problem, the most probable path problem, etc. The traditional algorithms for interprocedural analysis focus on path properties where the starting point is \emph{fixed} as the entry point of a specific method. In this work, we consider possible multiple queries as required in many applications such as in alias analysis. The study of multiple queries allows us to bring in a very important algorithmic distinction between the resource usage of the \emph{one-time} preprocessing vs for \emph{each individual} query. The second aspect that we consider is that the control flow graphs for most programs have constant treewidth. Our main contributions are simple and implementable algorithms that support multiple queries for algebraic path properties for RSMs that have constant treewidth. Our theoretical results show that our algorithms have small additional one-time preprocessing, but can answer subsequent queries significantly faster as compared to the current best-known solutions for several important problems, such as interprocedural reachability and shortest path. We provide a prototype implementation for interprocedural reachability and intraprocedural shortest path that gives a significant speed-up on several benchmarks.

preprint2014arXiv

Games with a Weak Adversary

We consider multi-player graph games with partial-observation and parity objective. While the decision problem for three-player games with a coalition of the first and second players against the third player is undecidable, we present a decidability result for partial-observation games where the first and third player are in a coalition against the second player, thus where the second player is adversarial but weaker due to partial-observation. We establish tight complexity bounds in the case where player 1 is less informed than player 2, namely 2-EXPTIME-completeness for parity objectives. The symmetric case of player 1 more informed than player 2 is much more complicated, and we show that already in the case where player 1 has perfect observation, memory of size non-elementary is necessary in general for reachability objectives, and the problem is decidable for safety and reachability objectives. Our results have tight connections with partial-observation stochastic games for which we derive new complexity results.

preprint2014arXiv

Generalized Risk-Aversion in Stochastic Multi-Armed Bandits

We consider the problem of minimizing the regret in stochastic multi-armed bandit, when the measure of goodness of an arm is not the mean return, but some general function of the mean and the variance.We characterize the conditions under which learning is possible and present examples for which no natural algorithm can achieve sublinear regret.

preprint2014arXiv

Looking at Mean-Payoff and Total-Payoff through Windows

We consider two-player games played on weighted directed graphs with mean-payoff and total-payoff objectives, two classical quantitative objectives. While for single-dimensional games the complexity and memory bounds for both objectives coincide, we show that in contrast to multi-dimensional mean-payoff games that are known to be coNP-complete, multi-dimensional total-payoff games are undecidable. We introduce conservative approximations of these objectives, where the payoff is considered over a local finite window sliding along a play, instead of the whole play. For single dimension, we show that (i) if the window size is polynomial, deciding the winner takes polynomial time, and (ii) the existence of a bounded window can be decided in NP $\cap$ coNP, and is at least as hard as solving mean-payoff games. For multiple dimensions, we show that (i) the problem with fixed window size is EXPTIME-complete, and (ii) there is no primitive-recursive algorithm to decide the existence of a bounded window.

preprint2014arXiv

Markov Decision Processes with Multiple Long-run Average Objectives

We study Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) functions. We consider two different objectives, namely, expectation and satisfaction objectives. Given an MDP with k limit-average functions, in the expectation objective the goal is to maximize the expected limit-average value, and in the satisfaction objective the goal is to maximize the probability of runs such that the limit-average value stays above a given vector. We show that under the expectation objective, in contrast to the case of one limit-average function, both randomization and memory are necessary for strategies even for epsilon-approximation, and that finite-memory randomized strategies are sufficient for achieving Pareto optimal values. Under the satisfaction objective, in contrast to the case of one limit-average function, infinite memory is necessary for strategies achieving a specific value (i.e. randomized finite-memory strategies are not sufficient), whereas memoryless randomized strategies are sufficient for epsilon-approximation, for all epsilon>0. We further prove that the decision problems for both expectation and satisfaction objectives can be solved in polynomial time and the trade-off curve (Pareto curve) can be epsilon-approximated in time polynomial in the size of the MDP and 1/epsilon, and exponential in the number of limit-average functions, for all epsilon>0. Our analysis also reveals flaws in previous work for MDPs with multiple mean-payoff functions under the expectation objective, corrects the flaws, and allows us to obtain improved results.

preprint2014arXiv

Optimal Cost Almost-sure Reachability in POMDPs

We consider partially observable Markov decision processes (POMDPs) with a set of target states and every transition is associated with an integer cost. The optimization objective we study asks to minimize the expected total cost till the target set is reached, while ensuring that the target set is reached almost-surely (with probability 1). We show that for integer costs approximating the optimal cost is undecidable. For positive costs, our results are as follows: (i) we establish matching lower and upper bounds for the optimal cost and the bound is double exponential; (ii) we show that the problem of approximating the optimal cost is decidable and present approximation algorithms developing on the existing algorithms for POMDPs with finite-horizon objectives. While the worst-case running time of our algorithm is double exponential, we also present efficient stopping criteria for the algorithm and show experimentally that it performs well in many examples of interest.

preprint2014arXiv

POMDPs under Probabilistic Semantics

We consider partially observable Markov decision processes (POMDPs) with limit-average payoff, where a reward value in the interval [0,1] is associated to every transition, and the payoff of an infinite path is the long-run average of the rewards. We consider two types of path constraints: (i) quantitative constraint defines the set of paths where the payoff is at least a given threshold lambda_1 in (0,1]; and (ii) qualitative constraint which is a special case of quantitative constraint with lambda_1=1. We consider the computation of the almost-sure winning set, where the controller needs to ensure that the path constraint is satisfied with probability 1. Our main results for qualitative path constraint are as follows: (i) the problem of deciding the existence of a finite-memory controller is EXPTIME-complete; and (ii) the problem of deciding the existence of an infinite-memory controller is undecidable. For quantitative path constraint we show that the problem of deciding the existence of a finite-memory controller is undecidable.

preprint2014arXiv

Probabilistic Opacity for Markov Decision Processes

Opacity is a generic security property, that has been defined on (non probabilistic) transition systems and later on Markov chains with labels. For a secret predicate, given as a subset of runs, and a function describing the view of an external observer, the value of interest for opacity is a measure of the set of runs disclosing the secret. We extend this definition to the richer framework of Markov decision processes, where non deterministic choice is combined with probabilistic transitions, and we study related decidability problems with partial or complete observation hypotheses for the schedulers. We prove that all questions are decidable with complete observation and $ω$-regular secrets. With partial observation, we prove that all quantitative questions are undecidable but the question whether a system is almost surely non opaque becomes decidable for a restricted class of $ω$-regular secrets, as well as for all $ω$-regular secrets under finite-memory schedulers.

preprint2014arXiv

Proceedings 3rd Workshop on Synthesis

The idea of synthesis, i.e., the process of automatically computing implementations from their specifications, has recently gained a lot of momentum in the contexts of software engineering and reactive system design. While it is widely believed that, due to complexity/undecidability issues, synthesis cannot completely replace manual engineering, it can assist the process of designing the intricate pieces of code that most programmers find challenging, or help with orchestrating tasks in reactive environments. The SYNT workshop aims to bring together researchers interested in synthesis to discuss and present ongoing and mature work on all aspects of automated synthesis and its applications. The third iteration of the workshop took place in Vienna, Austria, and was co-located with the 26th International Conference on Computer Aided Verification, held in the context of the Vienna Summer of Logic in July 2014. The workshop included eight contributed talks and four invited talks. In addition, it featured a special session about the Syntax-Guided Synthesis Competition (SyGuS) and the SyntComp Synthesis competition.

preprint2014arXiv

Qualitative Analysis of Concurrent Mean-payoff Games

We consider concurrent games played by two-players on a finite-state graph, where in every round the players simultaneously choose a move, and the current state along with the joint moves determine the successor state. We study a fundamental objective, namely, mean-payoff objective, where a reward is associated to each transition, and the goal of player 1 is to maximize the long-run average of the rewards, and the objective of player 2 is strictly the opposite. The path constraint for player 1 could be qualitative, i.e., the mean-payoff is the maximal reward, or arbitrarily close to it; or quantitative, i.e., a given threshold between the minimal and maximal reward. We consider the computation of the almost-sure (resp. positive) winning sets, where player 1 can ensure that the path constraint is satisfied with probability 1 (resp. positive probability). Our main results for qualitative path constraints are as follows: (1) we establish qualitative determinacy results that show that for every state either player 1 has a strategy to ensure almost-sure (resp. positive) winning against all player-2 strategies, or player 2 has a spoiling strategy to falsify almost-sure (resp. positive) winning against all player-1 strategies; (2) we present optimal strategy complexity results that precisely characterize the classes of strategies required for almost-sure and positive winning for both players; and (3) we present quadratic time algorithms to compute the almost-sure and the positive winning sets, matching the best known bound of algorithms for much simpler problems (such as reachability objectives). For quantitative constraints we show that a polynomial time solution for the almost-sure or the positive winning set would imply a solution to a long-standing open problem (the value problem for turn-based deterministic mean-payoff games) that is not known to be solvable in polynomial time.

preprint2014arXiv

Randomness for Free

We consider two-player zero-sum games on graphs. These games can be classified on the basis of the information of the players and on the mode of interaction between them. On the basis of information the classification is as follows: (a) partial-observation (both players have partial view of the game); (b) one-sided complete-observation (one player has complete observation); and (c) complete-observation (both players have complete view of the game). On the basis of mode of interaction we have the following classification: (a) concurrent (both players interact simultaneously); and (b) turn-based (both players interact in turn). The two sources of randomness in these games are randomness in transition function and randomness in strategies. In general, randomized strategies are more powerful than deterministic strategies, and randomness in transitions gives more general classes of games. In this work we present a complete characterization for the classes of games where randomness is not helpful in: (a) the transition function probabilistic transition can be simulated by deterministic transition); and (b) strategies (pure strategies are as powerful as randomized strategies). As consequence of our characterization we obtain new undecidability results for these games.

preprint2014arXiv

Strategy Synthesis for Multi-dimensional Quantitative Objectives

Multi-dimensional mean-payoff and energy games provide the mathematical foundation for the quantitative study of reactive systems, and play a central role in the emerging quantitative theory of verification and synthesis. In this work, we study the strategy synthesis problem for games with such multi-dimensional objectives along with a parity condition, a canonical way to express $ω$-regular conditions. While in general, the winning strategies in such games may require infinite memory, for synthesis the most relevant problem is the construction of a finite-memory winning strategy (if one exists). Our main contributions are as follows. First, we show a tight exponential bound (matching upper and lower bounds) on the memory required for finite-memory winning strategies in both multi-dimensional mean-payoff and energy games along with parity objectives. This significantly improves the triple exponential upper bound for multi energy games (without parity) that could be derived from results in literature for games on VASS (vector addition systems with states). Second, we present an optimal symbolic and incremental algorithm to compute a finite-memory winning strategy (if one exists) in such games. Finally, we give a complete characterization of when finite memory of strategies can be traded off for randomness. In particular, we show that for one-dimension mean-payoff parity games, randomized memoryless strategies are as powerful as their pure finite-memory counterparts.

preprint2014arXiv

Symbolic Algorithms for Qualitative Analysis of Markov Decision Processes with Büchi Objectives

We consider Markov decision processes (MDPs) with ω-regular specifications given as parity objectives. We consider the problem of computing the set of almost-sure winning states from where the objective can be ensured with probability 1. The algorithms for the computation of the almost-sure winning set for parity objectives iteratively use the solutions for the almost-sure winning set for Büchi objectives (a special case of parity objectives). Our contributions are as follows: First, we present the first subquadratic symbolic algorithm to compute the almost-sure winning set for MDPs with Büchi objectives; our algorithm takes O(n \sqrt{m}) symbolic steps as compared to the previous known algorithm that takes O(n^2) symbolic steps, where $n$ is the number of states and $m$ is the number of edges of the MDP. In practice MDPs have constant out-degree, and then our symbolic algorithm takes O(n \sqrt{n}) symbolic steps, as compared to the previous known $O(n^2)$ symbolic steps algorithm. Second, we present a new algorithm, namely win-lose algorithm, with the following two properties: (a) the algorithm iteratively computes subsets of the almost-sure winning set and its complement, as compared to all previous algorithms that discover the almost-sure winning set upon termination; and (b) requires O(n \sqrt{K}) symbolic steps, where K is the maximal number of edges of strongly connected components (scc's) of the MDP. The win-lose algorithm requires symbolic computation of scc's. Third, we improve the algorithm for symbolic scc computation; the previous known algorithm takes linear symbolic steps, and our new algorithm improves the constants associated with the linear number of steps. In the worst case the previous known algorithm takes 5n symbolic steps, whereas our new algorithm takes 4n symbolic steps.

preprint2014arXiv

The Complexity of Ergodic Mean-payoff Games

We study two-player (zero-sum) concurrent mean-payoff games played on a finite-state graph. We focus on the important sub-class of ergodic games where all states are visited infinitely often with probability 1. The algorithmic study of ergodic games was initiated in a seminal work of Hoffman and Karp in 1966, but all basic complexity questions have remained unresolved. Our main results for ergodic games are as follows: We establish (1) an optimal exponential bound on the patience of stationary strategies (where patience of a distribution is the inverse of the smallest positive probability and represents a complexity measure of a stationary strategy); (2) the approximation problem lie in FNP; (3) the approximation problem is at least as hard as the decision problem for simple stochastic games (for which NP intersection coNP is the long-standing best known bound). We present a variant of the strategy-iteration algorithm by Hoffman and Karp; show that both our algorithm and the classical value-iteration algorithm can approximate the value in exponential time; and identify a subclass where the value-iteration algorithm is a FPTAS. We also show that the exact value can be expressed in the existential theory of the reals, and establish square-root sum hardness for a related class of games.

preprint2014arXiv

The Complexity of Partial-observation Stochastic Parity Games With Finite-memory Strategies

We consider two-player partial-observation stochastic games on finite-state graphs where player 1 has partial observation and player 2 has perfect observation. The winning condition we study are ω-regular conditions specified as parity objectives. The qualitative-analysis problem given a partial-observation stochastic game and a parity objective asks whether there is a strategy to ensure that the objective is satisfied with probability~1 (resp. positive probability). These qualtitative-analysis problems are known to be undecidable. However in many applications the relevant question is the existence of finite-memory strategies, and the qualitative-analysis problems under finite-memory strategies was recently shown to be decidable in 2EXPTIME. We improve the complexity and show that the qualitative-analysis problems for partial-observation stochastic parity games under finite-memory strategies are EXPTIME-complete; and also establish optimal (exponential) memory bounds for finite-memory strategies required for qualitative analysis.

preprint2014arXiv

The Value 1 Problem Under Finite-memory Strategies for Concurrent Mean-payoff Games

We consider concurrent mean-payoff games, a very well-studied class of two-player (player 1 vs player 2) zero-sum games on finite-state graphs where every transition is assigned a reward between 0 and 1, and the payoff function is the long-run average of the rewards. The value is the maximal expected payoff that player 1 can guarantee against all strategies of player 2. We consider the computation of the set of states with value 1 under finite-memory strategies for player 1, and our main results for the problem are as follows: (1) we present a polynomial-time algorithm; (2) we show that whenever there is a finite-memory strategy, there is a stationary strategy that does not need memory at all; and (3) we present an optimal bound (which is double exponential) on the patience of stationary strategies (where patience of a distribution is the inverse of the smallest positive probability and represents a complexity measure of a stationary strategy).

preprint2013arXiv

Approximating the minimum cycle mean

We consider directed graphs where each edge is labeled with an integer weight and study the fundamental algorithmic question of computing the value of a cycle with minimum mean weight. Our contributions are twofold: (1) First we show that the algorithmic question is reducible in O(n^2) time to the problem of a logarithmic number of min-plus matrix multiplications of n-by-n matrices, where n is the number of vertices of the graph. (2) Second, when the weights are nonnegative, we present the first (1 + ε)-approximation algorithm for the problem and the running time of our algorithm is \tilde(O)(n^ωlog^3(nW/ε) / ε), where O(n^ω) is the time required for the classic n-by-n matrix multiplication and W is the maximum value of the weights.

preprint2013arXiv

Automata with Generalized Rabin Pairs for Probabilistic Model Checking and LTL Synthesis

The model-checking problem for probabilistic systems crucially relies on the translation of LTL to deterministic Rabin automata (DRW). Our recent Safraless translation for the LTL(F,G) fragment produces smaller automata as compared to the traditional approach. In this work, instead of DRW we consider deterministic automata with acceptance condition given as disjunction of generalized Rabin pairs (DGRW). The Safraless translation of LTL(F,G) formulas to DGRW results in smaller automata as compared to DRW. We present algorithms for probabilistic model-checking as well as game solving for DGRW conditions. Our new algorithms lead to improvement both in terms of theoretical bounds as well as practical evaluation. We compare PRISM with and without our new translation, and show that the new translation leads to significant improvements.

preprint2013arXiv

Better Quality in Synthesis through Quantitative Objectives

Most specification languages express only qualitative constraints. However, among two implementations that satisfy a given specification, one may be preferred to another. For example, if a specification asks that every request is followed by a response, one may prefer an implementation that generates responses quickly but does not generate unnecessary responses. We use quantitative properties to measure the "goodness" of an implementation. Using games with corresponding quantitative objectives, we can synthesize "optimal" implementations, which are preferred among the set of possible implementations that satisfy a given specification. In particular, we show how automata with lexicographic mean-payoff conditions can be used to express many interesting quantitative properties for reactive systems. In this framework, the synthesis of optimal implementations requires the solution of lexicographic mean-payoff games (for safety requirements), and the solution of games with both lexicographic mean-payoff and parity objectives (for liveness requirements). We present algorithms for solving both kinds of novel graph games.

preprint2013arXiv

Bounded Rationality in Concurrent Parity Games

We consider 2-player games played on a finite state space for infinite rounds. The games are concurrent: in each round, the two players choose their moves simultaneously; the current state and the moves determine the successor. We consider omega-regular winning conditions given as parity objectives. We consider the qualitative analysis problems: the computation of the almost-sure and limit-sure winning set of states, where player 1 can ensure to win with probability 1 and with probability arbitrarily close to 1, respectively. In general the almost-sure and limit-sure winning strategies require both infinite-memory and infinite-precision. We study the bounded-rationality problem for qualitative analysis of concurrent parity games, where the strategy set player 1 is restricted to bounded-resource strategies. In terms of precision, strategies can be deterministic, uniform, finite-precision or infinite-precision; and in terms of memory, strategies can be memoryless, finite-memory or infinite-memory. We present a precise and complete characterization of the qualitative winning sets for all combinations of classes of strategies. In particular, we show that uniform memoryless strategies are as powerful as finite-precision infinite-memory strategies, and infinite-precision memoryless strategies are as powerful as infinite-precision finite-memory strategies. We show that the winning sets can be computed in O(n^{2d+3}) time, where n is the size of the game and 2d is the number of priorities, and our algorithms are symbolic. The membership problem of whether a state belongs to a winning set can be decided in NP cap coNP. While this complexity is the same as for the simpler class of turn-based games, where in each state only one of the players has a choice of moves, our algorithms, that are obtained by characterization of the winning sets as mu-calculus formulas, are considerably more involved.

preprint2013arXiv

Doomsday Equilibria for Omega-Regular Games

Two-player games on graphs provide the theoretical frame- work for many important problems such as reactive synthesis. While the traditional study of two-player zero-sum games has been extended to multi-player games with several notions of equilibria, they are decidable only for perfect-information games, whereas several applications require imperfect-information games. In this paper we propose a new notion of equilibria, called doomsday equilibria, which is a strategy profile such that all players satisfy their own objective, and if any coalition of players deviates and violates even one of the players objective, then the objective of every player is violated. We present algorithms and complexity results for deciding the existence of doomsday equilibria for various classes of omega-regular objectives, both for imperfect-information games, and for perfect-information games. We provide optimal complexity bounds for imperfect-information games, and in most cases for perfect-information games.

preprint2013arXiv

Faster Algorithms for Markov Decision Processes with Low Treewidth

We consider two core algorithmic problems for probabilistic verification: the maximal end-component decomposition and the almost-sure reachability set computation for Markov decision processes (MDPs). For MDPs with treewidth $k$, we present two improved static algorithms for both the problems that run in time $O(n \cdot k^{2.38} \cdot 2^k)$ and $O(m \cdot \log n \cdot k)$, respectively, where $n$ is the number of states and $m$ is the number of edges, significantly improving the previous known $O(n\cdot k \cdot \sqrt{n\cdot k})$ bound for low treewidth. We also present decremental algorithms for both problems for MDPs with constant treewidth that run in amortized logarithmic time, which is a huge improvement over the previously known algorithms that require amortized linear time.

preprint2013arXiv

Forgiver triumphs in alternating Prisoner's Dilemma

Cooperative behavior, where one individual incurs a cost to help another, is a wide spread phenomenon. Here we study direct reciprocity in the context of the alternating Prisoner's Dilemma. We consider all strategies that can be implemented by one and two-state automata. We calculate the payoff matrix of all pairwise encounters in the presence of noise. We explore deterministic selection dynamics with and without mutation. Using different error rates and payoff values, we observe convergence to a small number of distinct equilibria. Two of them are uncooperative strict Nash equilibria representing always-defect (ALLD) and Grim. The third equilibrium is mixed and represents a cooperative alliance of several strategies, dominated by a strategy which we call Forgiver. Forgiver cooperates whenever the opponent has cooperated; it defects once when the opponent has defected, but subsequently Forgiver attempts to re-establish cooperation even if the opponent has defected again. Forgiver is not an evolutionarily stable strategy, but the alliance, which it rules, is asymptotically stable. For a wide range of parameter values the most commonly observed outcome is convergence to the mixed equilibrium, dominated by Forgiver. Our results show that although forgiving might incur a short-term loss it can lead to a long-term gain. Forgiveness facilitates stable cooperation in the presence of exploitation and noise.

preprint2013arXiv

Hyperplane Separation Technique for Multidimensional Mean-Payoff Games

We consider both finite-state game graphs and recursive game graphs (or pushdown game graphs), that can model the control flow of sequential programs with recursion, with multi-dimensional mean-payoff objectives. In pushdown games two types of strategies are relevant: global strategies, that depend on the entire global history; and modular strategies, that have only local memory and thus do not depend on the context of invocation. We present solutions to several fundamental algorithmic questions and our main contributions are as follows: (1) We show that finite-state multi-dimensional mean-payoff games can be solved in polynomial time if the number of dimensions and the maximal absolute value of the weight is fixed; whereas if the number of dimensions is arbitrary, then problem is already known to be coNP-complete. (2) We show that pushdown graphs with multi-dimensional mean-payoff objectives can be solved in polynomial time. (3) For pushdown games under global strategies both single and multi-dimensional mean-payoff objectives problems are known to be undecidable, and we show that under modular strategies the multi-dimensional problem is also undecidable (whereas under modular strategies the single dimensional problem is NP-complete). We show that if the number of modules, the number of exits, and the maximal absolute value of the weight is fixed, then pushdown games under modular strategies with single dimensional mean-payoff objectives can be solved in polynomial time, and if either of the number of exits or the number of modules is not bounded, then the problem is NP-hard. (4) Finally we show that a fixed parameter tractable algorithm for finite-state multi-dimensional mean-payoff games or pushdown games under modular strategies with single-dimensional mean-payoff objectives would imply the solution of the long-standing open problem of fixed parameter tractability of parity games.

preprint2013arXiv

Infinite-state games with finitary conditions

We study two-player zero-sum games over infinite-state graphs with boundedness conditions. Our first contribution is about the strategy complexity, i.e the memory required for winning strategies: we prove that over general infinite-state graphs, memoryless strategies are sufficient for finitary Büchi games, and finite-memory suffices for finitary parity games. We then study pushdown boundedness games, with two contributions. First we prove a collapse result for pushdown omega B games, implying the decidability of solving these games. Second we consider pushdown games with finitary parity along with stack boundedness conditions, and show that solving these games is EXPTIME-complete.

preprint2013arXiv

Obligation Blackwell Games and p-Automata

We recently introduced p-automata, automata that read discrete-time Markov chains. We used turn-based stochastic parity games to define acceptance of Markov chains by a subclass of p-automata. Definition of acceptance required a cumbersome and complicated reduction to a series of turn-based stochastic parity games. The reduction could not support acceptance by general p-automata, which was left undefined as there was no notion of games that supported it. Here we generalize two-player games by adding a structural acceptance condition called obligations. Obligations are orthogonal to the linear winning conditions that define winning. Obligations are a declaration that player 0 can achieve a certain value from a configuration. If the obligation is met, the value of that configuration for player 0 is 1. One cannot define value in obligation games by the standard mechanism of considering the measure of winning paths on a Markov chain and taking the supremum of the infimum of all strategies. Mainly because obligations need definition even for Markov chains and the nature of obligations has the flavor of an infinite nesting of supremum and infimum operators. We define value via a reduction to turn-based games similar to Martin's proof of determinacy of Blackwell games with Borel objectives. Based on this definition, we show that games are determined. We show that for Markov chains with Borel objectives and obligations, and finite turn-based stochastic parity games with obligations there exists an alternative and simpler characterization of the value function. Based on this simpler definition we give an exponential time algorithm to analyze finite turn-based stochastic parity games with obligations. Finally, we show that obligation games provide the necessary framework for reasoning about p-automata and that they generalize the previous definition.

preprint2013arXiv

POMDPs under Probabilistic Semantics

We consider partially observable Markov decision processes (POMDPs) with limit-average payoff, where a reward value in the interval [0,1] is associated to every transition, and the payoff of an infinite path is the long-run average of the rewards. We consider two types of path constraints: (i) quantitative constraint defines the set of paths where the payoff is at least a given threshold λ in (0, 1]; and (ii) qualitative constraint which is a special case of quantitative constraint with λ = 1. We consider the computation of the almost-sure winning set, where the controller needs to ensure that the path constraint is satisfied with probability 1. Our main results for qualitative path constraint are as follows: (i) the problem of deciding the existence of a finite-memory controller is EXPTIME-complete; and (ii) the problem of deciding the existence of an infinite-memory controller is undecidable. For quantitative path constraint we show that the problem of deciding the existence of a finite-memory controller is undecidable.

preprint2013arXiv

Trading Performance for Stability in Markov Decision Processes

We study the complexity of central controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize both the expected mean-payoff performance of the system and its stability. We argue that the basic theoretical notion of expressing the stability in terms of the variance of the mean-payoff (called global variance in our paper) is not always sufficient, since it ignores possible instabilities on respective runs. For this reason we propose alernative definitions of stability, which we call local and hybrid variance, and which express how rewards on each run deviate from the run's own mean-payoff and from the expected mean-payoff, respectively. We show that a strategy ensuring both the expected mean-payoff and the variance below given bounds requires randomization and memory, under all the above semantics of variance. We then look at the problem of determining whether there is a such a strategy. For the global variance, we show that the problem is in PSPACE, and that the answer can be approximated in pseudo-polynomial time. For the hybrid variance, the analogous decision problem is in NP, and a polynomial-time approximating algorithm also exists. For local variance, we show that the decision problem is in NP. Since the overall performance can be traded for stability (and vice versa), we also present algorithms for approximating the associated Pareto curve in all the three cases. Finally, we study a special case of the decision problems, where we require a given expected mean-payoff together with zero variance. Here we show that the problems can be all solved in polynomial time.

preprint2013arXiv

TTP: Tool for Tumor Progression

In this work we present a flexible tool for tumor progression, which simulates the evolutionary dynamics of cancer. Tumor progression implements a multi-type branching process where the key parameters are the fitness landscape, the mutation rate, and the average time of cell division. The fitness of a cancer cell depends on the mutations it has accumulated. The input to our tool could be any fitness landscape, mutation rate, and cell division time, and the tool produces the growth dynamics and all relevant statistics.

preprint2013arXiv

What is Decidable about Partially Observable Markov Decision Processes with ω-Regular Objectives

We consider partially observable Markov decision processes (POMDPs) with ω-regular conditions specified as parity objectives. The class of ω-regular languages extends regular languages to infinite strings and provides a robust specification language to express all properties used in verification, and parity objectives are canonical forms to express ω-regular conditions. The qualitative analysis problem given a POMDP and a parity objective asks whether there is a strategy to ensure that the objective is satis- fied with probability 1 (resp. positive probability). While the qualitative analysis problems are known to be undecidable even for very special cases of parity objectives, we establish decidability (with optimal complexity) of the qualitative analysis problems for POMDPs with all parity objectives under finite- memory strategies. We establish optimal (exponential) memory bounds and EXPTIME-completeness of the qualitative analysis problems under finite-memory strategies for POMDPs with parity objectives.

preprint2012arXiv

Counterexample-guided Planning

Planning in adversarial and uncertain environments can be modeled as the problem of devising strategies in stochastic perfect information games. These games are generalizations of Markov decision processes (MDPs): there are two (adversarial) players, and a source of randomness. The main practical obstacle to computing winning strategies in such games is the size of the state space. In practice therefore, one typically works with abstractions of the model. The diffculty is to come up with an abstraction that is neither too coarse to remove all winning strategies (plans), nor too fine to be intractable. In verification, the paradigm of counterexample-guided abstraction refinement has been successful to construct useful but parsimonious abstractions automatically. We extend this paradigm to probabilistic models (namely, perfect information games and, as a special case, MDPs). This allows us to apply the counterexample-guided abstraction paradigm to the AI planning problem. As special cases, we get planning algorithms for MDPs and deterministic systems that automatically construct system abstractions.

preprint2012arXiv

Efficient Controller Synthesis for Consumption Games with Multiple Resource Types

We introduce consumption games, a model for discrete interactive system with multiple resources that are consumed or reloaded independently. More precisely, a consumption game is a finite-state graph where each transition is labeled by a vector of resource updates, where every update is a non-positive number or omega. The omega updates model the reloading of a given resource. Each vertex belongs either to player \Box or player \Diamond, where the aim of player \Box is to play so that the resources are never exhausted. We consider several natural algorithmic problems about consumption games, and show that although these problems are computationally hard in general, they are solvable in polynomial time for every fixed number of resource types (i.e., the dimension of the update vectors).

preprint2012arXiv

Energy Parity Games

Energy parity games are infinite two-player turn-based games played on weighted graphs. The objective of the game combines a (qualitative) parity condition with the (quantitative) requirement that the sum of the weights (i.e., the level of energy in the game) must remain positive. Beside their own interest in the design and synthesis of resource-constrained omega-regular specifications, energy parity games provide one of the simplest model of games with combined qualitative and quantitative objective. Our main results are as follows: (a) exponential memory is necessary and sufficient for winning strategies in energy parity games; (b) the problem of deciding the winner in energy parity games can be solved in NP \cap coNP; and (c) we give an algorithm to solve energy parity by reduction to energy games. We also show that the problem of deciding the winner in energy parity games is polynomially equivalent to the problem of deciding the winner in mean-payoff parity games, while optimal strategies may require infinite memory in mean-payoff parity games. As a consequence we obtain a conceptually simple algorithm to solve mean-payoff parity games.

preprint2012arXiv

Equivalence of Games with Probabilistic Uncertainty and Partial-observation Games

We introduce games with probabilistic uncertainty, a natural model for controller synthesis in which the controller observes the state of the system through imprecise sensors that provide correct information about the current state with a fixed probability. That is, in each step, the sensors return an observed state, and given the observed state, there is a probability distribution (due to the estimation error) over the actual current state. The controller must base its decision on the observed state (rather than the actual current state, which it does not know). On the other hand, we assume that the environment can perfectly observe the current state. We show that our model can be reduced in polynomial time to standard partial-observation stochastic games, and vice-versa. As a consequence we establish the precise decidability frontier for the new class of games, and for most of the decidable problems establish optimal complexity results.

preprint2012arXiv

Faster Algorithms for Alternating Refinement Relations

One central issue in the formal design and analysis of reactive systems is the notion of refinement that asks whether all behaviors of the implementation is allowed by the specification. The local interpretation of behavior leads to the notion of simulation. Alternating transition systems (ATSs) provide a general model for composite reactive systems, and the simulation relation for ATSs is known as alternating simulation. The simulation relation for fair transition systems is called fair simulation. In this work our main contributions are as follows: (1) We present an improved algorithm for fair simulation with Büchi fairness constraints; our algorithm requires $O(n^3 \cdot m)$ time as compared to the previous known $O(n^6)$-time algorithm, where $n$ is the number of states and $m$ is the number of transitions. (2) We present a game based algorithm for alternating simulation that requires $O(m^2)$-time as compared to the previous known $O((n \cdot m)^2)$-time algorithm, where $n$ is the number of states and $m$ is the size of transition relation. (3) We present an iterative algorithm for alternating simulation that matches the time complexity of the game based algorithm, but is more space efficient than the game based algorithm.

preprint2012arXiv

Finite Automata with Time-Delay Blocks (Extended Version)

The notion of delays arises naturally in many computational models, such as, in the design of circuits, control systems, and dataflow languages. In this work, we introduce \emph{automata with delay blocks} (ADBs), extending finite state automata with variable time delay blocks, for deferring individual transition output symbols, in a discrete-time setting. We show that the ADB languages strictly subsume the regular languages, and are incomparable in expressive power to the context-free languages. We show that ADBs are closed under union, concatenation and Kleene star, and under intersection with regular languages, but not closed under complementation and intersection with other ADB languages. We show that the emptiness and the membership problems are decidable in polynomial time for ADBs, whereas the universality problem is undecidable. Finally we consider the linear-time model checking problem, i.e., whether the language of an ADB is contained in a regular language, and show that the model checking problem is PSPACE-complete.

preprint2012arXiv

Robustness of Structurally Equivalent Concurrent Parity Games

We consider two-player stochastic games played on a finite state space for an infinite number of rounds. The games are concurrent: in each round, the two players (player 1 and player 2) choose their moves independently and simultaneously; the current state and the two moves determine a probability distribution over the successor states. We also consider the important special case of turn-based stochastic games where players make moves in turns, rather than concurrently. We study concurrent games with ω-regular winning conditions specified as parity objectives. The value for player 1 for a parity objective is the maximal probability with which the player can guarantee the satisfaction of the objective against all strategies of the opponent. We study the problem of continuity and robustness of the value function in concurrent and turn-based stochastic parity gameswith respect to imprecision in the transition probabilities. We present quantitative bounds on the difference of the value function (in terms of the imprecision of the transition probabilities) and show the value continuity for structurally equivalent concurrent games (two games are structurally equivalent if the support of the transition function is same and the probabilities differ). We also show robustness of optimal strategies for structurally equivalent turn-based stochastic parity games. Finally we show that the value continuity property breaks without the structurally equivalent assumption (even for Markov chains) and show that our quantitative bound is asymptotically optimal. Hence our results are tight (the assumption is both necessary and sufficient) and optimal (our quantitative bound is asymptotically optimal).

preprint2012arXiv

Strategy complexity of finite-horizon Markov decision processes and simple stochastic games

Markov decision processes (MDPs) and simple stochastic games (SSGs) provide a rich mathematical framework to study many important problems related to probabilistic systems. MDPs and SSGs with finite-horizon objectives, where the goal is to maximize the probability to reach a target state in a given finite time, is a classical and well-studied problem. In this work we consider the strategy complexity of finite-horizon MDPs and SSGs. We show that for all $ε>0$, the natural class of counter-based strategies require at most $\log \log (\frac{1}ε) + n+1$ memory states, and memory of size $Ω(\log \log (\frac{1}ε) + n)$ is required. Thus our bounds are asymptotically optimal. We then study the periodic property of optimal strategies, and show a sub-exponential lower bound on the period for optimal strategies.

preprint2012arXiv

Strategy Improvement for Concurrent Reachability and Safety Games

We consider concurrent games played on graphs. At every round of a game, each player simultaneously and independently selects a move; the moves jointly determine the transition to a successor state. Two basic objectives are the safety objective to stay forever in a given set of states, and its dual, the reachability objective to reach a given set of states. First, we present a simple proof of the fact that in concurrent reachability games, for all $ε>0$, memoryless $ε$-optimal strategies exist. A memoryless strategy is independent of the history of plays, and an $ε$-optimal strategy achieves the objective with probability within $ε$ of the value of the game. In contrast to previous proofs of this fact, our proof is more elementary and more combinatorial. Second, we present a strategy-improvement (a.k.a.\ policy-iteration) algorithm for concurrent games with reachability objectives. We then present a strategy-improvement algorithm for concurrent games with safety objectives. Our algorithms yield sequences of player-1 strategies which ensure probabilities of winning that converge monotonically to the value of the game. Our result is significant because the strategy-improvement algorithm for safety games provides, for the first time, a way to approximate the value of a concurrent safety game from below. Previous methods could approximate the values of these games only from one direction, and as no rates of convergence are known, they did not provide a practical way to solve these games.

preprint2012arXiv

The Complexity of Multi-Mean-Payoff and Multi-Energy Games

In mean-payoff games, the objective of the protagonist is to ensure that the limit average of an infinite sequence of numeric weights is nonnegative. In energy games, the objective is to ensure that the running sum of weights is always nonnegative. Multi-mean-payoff and multi-energy games replace individual weights by tuples, and the limit average (resp. running sum) of each coordinate must be (resp. remain) nonnegative. These games have applications in the synthesis of resource-bounded processes with multiple resources. We prove the finite-memory determinacy of multi-energy games and show the inter-reducibility of multimean-payoff and multi-energy games for finite-memory strategies. We also improve the computational complexity for solving both classes of games with finite-memory strategies: while the previously best known upper bound was EXPSPACE, and no lower bound was known, we give an optimal coNP-complete bound. For memoryless strategies, we show that the problem of deciding the existence of a winning strategy for the protagonist is NP-complete. Finally we present the first solution of multi-meanpayoff games with infinite-memory strategies. We show that multi-mean-payoff games with mean-payoff-sup objectives can be decided in NP and coNP, whereas multi-mean-payoff games with mean-payoff-inf objectives are coNP-complete.

preprint2011arXiv

A reduction from parity games to simple stochastic games

Games on graphs provide a natural model for reactive non-terminating systems. In such games, the interaction of two players on an arena results in an infinite path that describes a run of the system. Different settings are used to model various open systems in computer science, as for instance turn-based or concurrent moves, and deterministic or stochastic transitions. In this paper, we are interested in turn-based games, and specifically in deterministic parity games and stochastic reachability games (also known as simple stochastic games). We present a simple, direct and efficient reduction from deterministic parity games to simple stochastic games: it yields an arena whose size is linear up to a logarithmic factor in size of the original arena.

preprint2011arXiv

An O(n^2) Time Algorithm for Alternating Büchi Games

Computing the winning set for B{ü}chi objectives in alternating games on graphs is a central problem in computer aided verification with a large number of applications. The long standing best known upper bound for solving the problem is $\tilde{O}(n \cdot m)$, where $n$ is the number of vertices and $m$ is the number of edges in the graph. We are the first to break the $\tilde{O}(n\cdot m)$ bound by presenting a new technique that reduces the running time to $O(n^2)$. This bound also leads to an $O(n^2)$ algorithm time for computing the set of almost-sure winning vertices in alternating games with probabilistic transitions (improving an earlier bound of $\tilde{O}(n\cdot m)$) and in concurrent graph games with constant actions (improving an earlier bound of $O(n^3)$). We also show that the same technique can be used to compute the maximal end-component decomposition of a graph in time $O(n^2)$. Finally, we show how to maintain the winning set for B{ü}chi objectives in alternating games under a sequence of edge insertions or a sequence of edge deletions in O(n) amortized time per operation. This is the first dynamic algorithm for this problem.

preprint2011arXiv

Assume-Guarantee Synthesis for Digital Contract Signing

We study the automatic synthesis of fair non-repudiation protocols, a class of fair exchange protocols, used for digital contract signing. First, we show how to specify the objectives of the participating agents and the trusted third party (TTP) as path formulas in LTL and prove that the satisfaction of these objectives imply fairness; a property required of fair exchange protocols. We then show that weak (co-operative) co-synthesis and classical (strictly competitive) co-synthesis fail, whereas assume-guarantee synthesis (AGS) succeeds. We demonstrate the success of assume-guarantee synthesis as follows: (a) any solution of assume-guarantee synthesis is attack-free; no subset of participants can violate the objectives of the other participants; (b) the Asokan-Shoup-Waidner (ASW) certified mail protocol that has known vulnerabilities is not a solution of AGS; (c) the Kremer-Markowitch (KM) non-repudiation protocol is a solution of AGS; and (d) AGS presents a new and symmetric fair non-repudiation protocol that is attack-free. To our knowledge this is the first application of synthesis to fair non-repudiation protocols, and our results show how synthesis can both automatically discover vulnerabilities in protocols and generate correct protocols. The solution to assume-guarantee synthesis can be computed efficiently as the secure equilibrium solution of three-player graph games.

preprint2011arXiv

Decidable Problems for Probabilistic Automata on Infinite Words

We consider probabilistic automata on infinite words with acceptance defined by parity conditions. We consider three qualitative decision problems: (i) the positive decision problem asks whether there is a word that is accepted with positive probability; (ii) the almost decision problem asks whether there is a word that is accepted with probability 1; and (iii) the limit decision problem asks whether for every epsilon > 0 there is a word that is accepted with probability at least 1 - epsilon. We unify and generalize several decidability results for probabilistic automata over infinite words, and identify a robust (closed under union and intersection) subclass of probabilistic automata for which all the qualitative decision problems are decidable for parity conditions. We also show that if the input words are restricted to lasso shape (regular) words, then the positive and almost problems are decidable for all probabilistic automata with parity conditions. For most decidable problems we show an optimal PSPACE-complete complexity bound.

preprint2011arXiv

Energy and Mean-Payoff Parity Markov Decision Processes

We consider Markov Decision Processes (MDPs) with mean-payoff parity and energy parity objectives. In system design, the parity objective is used to encode ω-regular specifications, and the mean-payoff and energy objectives can be used to model quantitative resource constraints. The energy condition requires that the resource level never drops below 0, and the mean-payoff condition requires that the limit-average value of the resource consumption is within a threshold. While these two (energy and mean-payoff) classical conditions are equivalent for two-player games, we show that they differ for MDPs. We show that the problem of deciding whether a state is almost-sure winning (i.e., winning with probability 1) in energy parity MDPs is in NP \cap coNP, while for mean-payoff parity MDPs, the problem is solvable in polynomial time, improving a recent PSPACE bound.

preprint2011arXiv

Finitary languages

The class of omega-regular languages provides a robust specification language in verification. Every omega-regular condition can be decomposed into a safety part and a liveness part. The liveness part ensures that something good happens "eventually". Finitary liveness was proposed by Alur and Henzinger as a stronger formulation of liveness. It requires that there exists an unknown, fixed bound b such that something good happens within b transitions. In this work we consider automata with finitary acceptance conditions defined by finitary Buchi, parity and Streett languages. We study languages expressible by such automata: we give their topological complexity and present a regular-expression characterization. We compare the expressive power of finitary automata and give optimal algorithms for classical decisions questions. We show that the finitary languages are Sigma 2-complete; we present a complete picture of the expressive power of various classes of automata with finitary and infinitary acceptance conditions; we show that the languages defined by finitary parity automata exactly characterize the star-free fragment of omega B-regular languages; and we show that emptiness is NLOGSPACE-complete and universality as well as language inclusion are PSPACE-complete for finitary parity and Streett automata.

preprint2011arXiv

Magnifying Lens Abstraction for Stochastic Games with Discounted and Long-run Average Objectives

Turn-based stochastic games and its important subclass Markov decision processes (MDPs) provide models for systems with both probabilistic and nondeterministic behaviors. We consider turn-based stochastic games with two classical quantitative objectives: discounted-sum and long-run average objectives. The game models and the quantitative objectives are widely used in probabilistic verification, planning, optimal inventory control, network protocol and performance analysis. Games and MDPs that model realistic systems often have very large state spaces, and probabilistic abstraction techniques are necessary to handle the state-space explosion. The commonly used full-abstraction techniques do not yield space-savings for systems that have many states with similar value, but does not necessarily have similar transition structure. A semi-abstraction technique, namely Magnifying-lens abstractions (MLA), that clusters states based on value only, disregarding differences in their transition relation was proposed for qualitative objectives (reachability and safety objectives). In this paper we extend the MLA technique to solve stochastic games with discounted-sum and long-run average objectives. We present the MLA technique based abstraction-refinement algorithm for stochastic games and MDPs with discounted-sum objectives. For long-run average objectives, our solution works for all MDPs and a sub-class of stochastic games where every state has the same value.

preprint2011arXiv

Measuring and Synthesizing Systems in Probabilistic Environments

Often one has a preference order among the different systems that satisfy a given specification. Under a probabilistic assumption about the possible inputs, such a preference order is naturally expressed by a weighted automaton, which assigns to each word a value, such that a system is preferred if it generates a higher expected value. We solve the following optimal-synthesis problem: given an omega-regular specification, a Markov chain that describes the distribution of inputs, and a weighted automaton that measures how well a system satisfies the given specification under the given input assumption, synthesize a system that optimizes the measured value. For safety specifications and measures given by mean-payoff automata, the optimal-synthesis problem amounts to finding a strategy in a Markov decision process (MDP) that is optimal for a long-run average reward objective, which can be done in polynomial time. For general omega-regular specifications, the solution rests on a new, polynomial-time algorithm for computing optimal strategies in MDPs with mean-payoff parity objectives. Our algorithm generates optimal strategies consisting of two memoryless strategies and a counter. This counter is in general not bounded. To obtain a finite-state system, we show how to construct an ε-optimal strategy with a bounded counter for any ε>0. We also show how to decide in polynomial time if we can construct an optimal finite-state system (i.e., a system without a counter) for a given specification. We have implemented our approach in a tool that takes qualitative and quantitative specifications and automatically constructs a system that satisfies the qualitative specification and optimizes the quantitative specification, if such a system exists. We present experimental results showing optimal systems that were generated in this way.

preprint2011arXiv

On Memoryless Quantitative Objectives

In two-player games on graph, the players construct an infinite path through the game graph and get a reward computed by a payoff function over infinite paths. Over weighted graphs, the typical and most studied payoff functions compute the limit-average or the discounted sum of the rewards along the path. Beside their simple definition, these two payoff functions enjoy the property that memoryless optimal strategies always exist. In an attempt to construct other simple payoff functions, we define a class of payoff functions which compute an (infinite) weighted average of the rewards. This new class contains both the limit-average and discounted sum functions, and we show that they are the only members of this class which induce memoryless optimal strategies, showing that there is essentially no other simple payoff functions.

preprint2011arXiv

Partial-Observation Stochastic Games: How to Win when Belief Fails

In two-player finite-state stochastic games of partial observation on graphs, in every state of the graph, the players simultaneously choose an action, and their joint actions determine a probability distribution over the successor states. We consider reachability objectives where player 1 tries to ensure a target state to be visited almost-surely or positively. On the basis of information, the game can be one-sided with either (a)player 1 or (b)player 2 having partial observation, or two-sided with both players having partial observation. On the basis of randomization (a)players may not be allowed to use randomization (pure strategies), or (b)may choose a probability distribution over actions but the actual random choice is not visible (actions invisible), or (c)may use full randomization. Our results for pure strategies are as follows: (1)For one-sided games with player 2 perfect observation we show that belief-based strategies are not sufficient, and present an exponential upper bound on memory both for almost-sure and positive winning strategies; we show that the problem of deciding the existence of almost-sure and positive winning strategies for player 1 is EXPTIME-complete and present symbolic algorithms that avoid the explicit exponential construction. (2)For one-sided games with player 1 perfect observation we show that non-elementary memory is both necessary and sufficient for both almost-sure and positive winning strategies. (3)We show that for the two-sided case finite memory strategies are sufficient for both positive and almost-sure winning. We establish the equivalence of the almost-sure winning problem for pure strategies with randomized strategies with actions invisible. Our equivalence result exhibit serious flaws in previous results in the literature: we show a non-elementary memory lower bound for almost-sure winning whereas an exponential upper bound was claimed.

preprint2011arXiv

Quantitative Synthesis for Concurrent Programs

We present an algorithmic method for the quantitative, performance-aware synthesis of concurrent programs. The input consists of a nondeterministic partial program and of a parametric performance model. The nondeterminism allows the programmer to omit which (if any) synchronization construct is used at a particular program location. The performance model, specified as a weighted automaton, can capture system architectures by assigning different costs to actions such as locking, context switching, and memory and cache accesses. The quantitative synthesis problem is to automatically resolve the nondeterminism of the partial program so that both correctness is guaranteed and performance is optimal. As is standard for shared memory concurrency, correctness is formalized "specification free", in particular as race freedom or deadlock freedom. For worst-case (average-case) performance, we show that the problem can be reduced to 2-player graph games (with probabilistic transitions) with quantitative objectives. While we show, using game-theoretic methods, that the synthesis problem is NEXP-complete, we present an algorithmic method and an implementation that works efficiently for concurrent programs and performance models of practical interest. We have implemented a prototype tool and used it to synthesize finite-state concurrent programs that exhibit different programming patterns, for several performance models representing different architectures.

preprint2011arXiv

Synthesis of Memory-Efficient Real-Time Controllers for Safety Objectives (Full Version)

We study synthesis of controllers for real-time systems, where the objective is to stay in a given safe set. The problem is solved by obtaining winning strategies in concurrent two-player \emph{timed automaton games} with safety objectives. To prevent a player from winning by blocking time, we restrict each player to strategies that ensure that the player cannot be responsible for causing a zeno run. We construct winning strategies for the controller which require access only to (1) the system clocks (thus, controllers which require their own internal infinitely precise clocks are not necessary), and (2) a linear (in the number of clocks) number of memory bits. Precisely, we show that a memory of size $\big(3\cdot|C|+1 + \lg(|C|+1)\big)$ bits suffices for winning controller strategies for safety objectives, where $C$ is the set of clocks of the timed automaton game, significantly improving the previous known exponential bound. We also settle the open question of whether \emph{region} strategies for controllers require memory for safety objectives by showing with an example that region strategies do require memory for safety objectives.

preprint2011arXiv

The Decidability Frontier for Probabilistic Automata on Infinite Words

We consider probabilistic automata on infinite words with acceptance defined by safety, reachability, Büchi, coBüchi, and limit-average conditions. We consider quantitative and qualitative decision problems. We present extensions and adaptations of proofs for probabilistic finite automata and present a complete characterization of the decidability and undecidability frontier of the quantitative and qualitative decision problems for probabilistic automata on infinite words.

preprint2010arXiv

Algorithms for Game Metrics

Simulation and bisimulation metrics for stochastic systems provide a quantitative generalization of the classical simulation and bisimulation relations. These metrics capture the similarity of states with respect to quantitative specifications written in the quantitative μ-calculus and related probabilistic logics. We first show that the metrics provide a bound for the difference in long-run average and discounted average behavior across states, indicating that the metrics can be used both in system verification, and in performance evaluation. For turn-based games and MDPs, we provide a polynomial-time algorithm for the computation of the one-step metric distance between states. The algorithm is based on linear programming; it improves on the previous known exponential-time algorithm based on a reduction to the theory of reals. We then present PSPACE algorithms for both the decision problem and the problem of approximating the metric distance between two states, matching the best known algorithms for Markov chains. For the bisimulation kernel of the metric our algorithm works in time O(n^4) for both turn-based games and MDPs; improving the previously best known O(n^9\cdot log(n)) time algorithm for MDPs. For a concurrent game G, we show that computing the exact distance between states is at least as hard as computing the value of concurrent reachability games and the square-root-sum problem in computational geometry. We show that checking whether the metric distance is bounded by a rational r, can be done via a reduction to the theory of real closed fields, involving a formula with three quantifier alternations, yielding O(|G|^O(|G|^5)) time complexity, improving the previously known reduction, which yielded O(|G|^O(|G|^7)) time complexity. These algorithms can be iterated to approximate the metrics using binary search.

preprint2010arXiv

Discounting in Games across Time Scales

We introduce two-level discounted games played by two players on a perfect-information stochastic game graph. The upper level game is a discounted game and the lower level game is an undiscounted reachability game. Two-level games model hierarchical and sequential decision making under uncertainty across different time scales. We show the existence of pure memoryless optimal strategies for both players and an ordered field property for such games. We show that if there is only one player (Markov decision processes), then the values can be computed in polynomial time. It follows that whether the value of a player is equal to a given rational constant in two-level discounted games can be decided in NP intersected coNP. We also give an alternate strategy improvement algorithm to compute the value.

preprint2010arXiv

Expressiveness and Closure Properties for Quantitative Languages

Weighted automata are nondeterministic automata with numerical weights on transitions. They can define quantitative languages~$L$ that assign to each word~$w$ a real number~$L(w)$. In the case of infinite words, the value of a run is naturally computed as the maximum, limsup, liminf, limit-average, or discounted-sum of the transition weights. The value of a word $w$ is the supremum of the values of the runs over $w$. We study expressiveness and closure questions about these quantitative languages. We first show that the set of words with value greater than a threshold can be non-$ω$-regular for deterministic limit-average and discounted-sum automata, while this set is always $ω$-regular when the threshold is isolated (i.e., some neighborhood around the threshold contains no word). In the latter case, we prove that the $ω$-regular language is robust against small perturbations of the transition weights. We next consider automata with transition weights Weighted automata are nondeterministic automata with numerical weights ontransitions. They can define quantitative languages~$L$ that assign to eachword~$w$ a real number~$L(w)$. In the case of infinite words, the value of arun is naturally computed as the maximum, limsup, liminf, limit-average, ordiscounted-sum of the transition weights. The value of a word $w$ is thesupremum of the values of the runs over $w$. We study expressiveness andclosure questions about these quantitative languages. We first show that the set of words with value greater than a threshold canbe non-$ω$-regular for deterministic limit-average and discounted-sumautomata, while this set is always $ω$-regular when the threshold isisolated (i.e., some neighborhood around the threshold contains no word). Inthe latter case, we prove that the $ω$-regular language is robust againstsmall perturbations of the transition weights. We next consider automata with transition weights $0$ or $1$ and show thatthey are as expressive as general weighted automata in the limit-average case,but not in the discounted-sum case. Third, for quantitative languages $L_1$ and~$L_2$, we consider the operations$\max(L_1,L_2)$, $\min(L_1,L_2)$, and $1-L_1$, which generalize the booleanoperations on languages, as well as the sum $L_1 + L_2$. We establish theclosure properties of all classes of quantitative languages with respect tothese four operations.$ or $ and show that they are as expressive as general weighted automata in the limit-average case, but not in the discounted-sum case. Third, for quantitative languages $L_1$ and~$L_2$, we consider the operations $\max(L_1,L_2)$, $\min(L_1,L_2)$, and -L_1$, which generalize the boolean operations on languages, as well as the sum $L_1 + L_2$. We establish the closure properties of all classes of quantitative languages with respect to these four operations.

preprint2010arXiv

Generalized Mean-payoff and Energy Games

In mean-payoff games, the objective of the protagonist is to ensure that the limit average of an infinite sequence of numeric weights is nonnegative. In energy games, the objective is to ensure that the running sum of weights is always nonnegative. Generalized mean-payoff and energy games replace individual weights by tuples, and the limit average (resp. running sum) of each coordinate must be (resp. remain) nonnegative. These games have applications in the synthesis of resource-bounded processes with multiple resources. We prove the finite-memory determinacy of generalized energy games and show the inter-reducibility of generalized mean-payoff and energy games for finite-memory strategies. We also improve the computational complexity for solving both classes of games with finite-memory strategies: while the previously best known upper bound was EXPSPACE, and no lower bound was known, we give an optimal coNP-complete bound. For memoryless strategies, we show that the problem of deciding the existence of a winning strategy for the protagonist is NP-complete.

preprint2010arXiv

GIST: A Solver for Probabilistic Games

Gist is a tool that (a) solves the qualitative analysis problem of turn-based probabilistic games with ω-regular objectives; and (b) synthesizes reasonable environment assumptions for synthesis of unrealizable specifications. Our tool provides the first and efficient implementations of several reduction-based techniques to solve turn-based probabilistic games, and uses the analysis of turn-based probabilistic games for synthesizing environment assumptions for unrealizable specifications.

preprint2010arXiv

Mean-payoff Automaton Expressions

Quantitative languages are an extension of boolean languages that assign to each word a real number. Mean-payoff automata are finite automata with numerical weights on transitions that assign to each infinite path the long-run average of the transition weights. When the mode of branching of the automaton is deterministic, nondeterministic, or alternating, the corresponding class of quantitative languages is not robust as it is not closed under the pointwise operations of max, min, sum, and numerical complement. Nondeterministic and alternating mean-payoff automata are not decidable either, as the quantitative generalization of the problems of universality and language inclusion is undecidable. We introduce a new class of quantitative languages, defined by mean-payoff automaton expressions, which is robust and decidable: it is closed under the four pointwise operations, and we show that all decision problems are decidable for this class. Mean-payoff automaton expressions subsume deterministic mean-payoff automata, and we show that they have expressive power incomparable to nondeterministic and alternating mean-payoff automata. We also present for the first time an algorithm to compute distance between two quantitative languages, and in our case the quantitative languages are given as mean-payoff automaton expressions.

preprint2010arXiv

Qualitative Analysis of Partially-observable Markov Decision Processes

We study observation-based strategies for partially-observable Markov decision processes (POMDPs) with omega-regular objectives. An observation-based strategy relies on partial information about the history of a play, namely, on the past sequence of observations. We consider the qualitative analysis problem: given a POMDP with an omega-regular objective, whether there is an observation-based strategy to achieve the objective with probability~1 (almost-sure winning), or with positive probability (positive winning). Our main results are twofold. First, we present a complete picture of the computational complexity of the qualitative analysis of POMDP s with parity objectives (a canonical form to express omega-regular objectives) and its subclasses. Our contribution consists in establishing several upper and lower bounds that were not known in literature. Second, we present optimal bounds (matching upper and lower bounds) on the memory required by pure and randomized observation-based strategies for the qualitative analysis of POMDP s with parity objectives and its subclasses.

preprint2010arXiv

Synthesis of AMBA AHB from Formal Specification

The standard procedure for hardware design consists of describing circuit in a hardware description language at logic level followed by extensive verification and logic-synthesis. However, this process consumes significant time and needs a lot of effort. An alternative is to use formal specification language as a high-level hardware description language and synthesize hardware from formal specification. Bloem et.al. gave formal specifications and synthesize the AMBA AHB Arbiter. Our contributions are as follows:(1) We present more complete and compact formal specifications for the AMBA AHB Arbiter, and obtain significant (order of magnitude) improvement in synthesis results (both with respect to time and the number of gates of the synthesize circuit); (2) we present formal specification and synthesize to generate compact circuits for the remaining two components of the AMBA AHB protocol, namely, the AMBA AHB Master and AMBA AHB Slave; and (3) from the lessons learnt we present few principles for writing formal specifications for efficient hardware synthesis. Thus with intelligently written complete formal specifications we are able to automatically synthesize an important and widely used industrial protocol.

preprint2009arXiv

Qualitative Logics and Equivalences for Probabilistic Systems

We investigate logics and equivalence relations that capture the qualitative behavior of Markov Decision Processes (MDPs). We present Qualitative Randomized CTL (QRCTL): formulas of this logic can express the fact that certain temporal properties hold over all paths, or with probability 0 or 1, but they do not distinguish among intermediate probability values. We present a symbolic, polynomial time model-checking algorithm for QRCTL on MDPs. The logic QRCTL induces an equivalence relation over states of an MDP that we call qualitative equivalence: informally, two states are qualitatively equivalent if the sets of formulas that hold with probability 0 or 1 at the two states are the same. We show that for finite alternating MDPs, where nondeterministic and probabilistic choices occur in different states, qualitative equivalence coincides with alternating bisimulation, and can thus be computed via efficient partition-refinement algorithms. On the other hand, in non-alternating MDPs the equivalence relations cannot be computed via partition-refinement algorithms, but rather, they require non-local computation. Finally, we consider QRCTL*, that extends QRCTL with nested temporal operators in the same manner in which CTL* extends CTL. We show that QRCTL and QRCTL* induce the same qualitative equivalence on alternating MDPs, while on non-alternating MDPs, the equivalence arising from QRCTL* can be strictly finer. We also provide a full characterization of the relation between qualitative equivalence, bisimulation, and alternating bisimulation, according to whether the MDPs are finite, and to whether their transition relations are finitely-branching.

preprint2007arXiv

Algorithms for Omega-Regular Games with Imperfect Information

We study observation-based strategies for two-player turn-based games on graphs with omega-regular objectives. An observation-based strategy relies on imperfect information about the history of a play, namely, on the past sequence of observations. Such games occur in the synthesis of a controller that does not see the private state of the plant. Our main results are twofold. First, we give a fixed-point algorithm for computing the set of states from which a player can win with a deterministic observation-based strategy for any omega-regular objective. The fixed point is computed in the lattice of antichains of state sets. This algorithm has the advantages of being directed by the objective and of avoiding an explicit subset construction on the game graph. Second, we give an algorithm for computing the set of states from which a player can win with probability 1 with a randomized observation-based strategy for a Buechi objective. This set is of interest because in the absence of perfect information, randomized strategies are more powerful than deterministic ones. We show that our algorithms are optimal by proving matching lower bounds.

Krishnendu Chatterjee

What is connected

Connect this record

See the researcher in context

Building this map preview

108 published item(s)

Automated Approach for Solving Infinite-state Polynomial Reachability Games

Genotype specificity and spatial arrangement govern the direction and magnitude of selection in variable environments

Multi-Environment POMDPs with Finite-Horizon Objectives

Evolutionary instability of selfish learning in repeated games

Learning Stabilizing Policies in Stochastic Control Systems

On Satisficing in Quantitative Games

Approximating Values of Generalized-Reachability Stochastic Games

Concentration-Bound Analysis for Probabilistic Programs and Probabilistic Recurrence Relations

Faster Algorithms for Quantitative Analysis of Markov Chains and Markov Decision Processes with Small Treewidth

Inductive Reachability Witnesses

Infection dynamics of COVID-19 virus under lockdown and reopening

Multi-dimensional Long-Run Average Problems for Vector Addition Systems with States

Optimal and Perfectly Parallel Algorithms for On-demand Data-flow Analysis

Polynomial Invariant Generation for Non-deterministic Recursive Programs

Reinforcement Learning of Risk-Constrained Policies in Markov Decision Processes

Simplified Game of Life: Algorithms and Complexity

Stochastic Games with Lexicographic Reachability-Safety Objectives

The Moran process on 2-chromatic graphs

Limits on amplifiers of natural selection under death-Birth updating

Faster Algorithms for Weighted Recursive State Machines

Computation Tree Logic for Synchronization Properties

Conditionally Optimal Algorithms for Generalized Büchi Games

Mean-Payoff Pushdown Games

Model and Objective Separation with Conditional Lower Bounds: Disjunction is Harder than Conjunction

Perfect-Information Stochastic Games with Generalized Mean-Payoff Objectives

Robust Draws in Balanced Knockout Tournaments

Stochastic Invariants for Probabilistic Termination

Stochastic Shortest Path with Energy Constraints in POMDPs

Termination Analysis of Probabilistic Programs through Positivstellensatz's

A Symbolic SAT-based Algorithm for Almost-sure Reachability with Small Strategies in POMDPs

Algorithmic Analysis of Qualitative and Quantitative Termination Problems for Affine Probabilistic Programs

Algorithms for Algebraic Path Properties in Concurrent Systems of Constant Treewidth Components

Counterexample Explanation by Learning Small Strategies in Markov Decision Processes

Faster Algorithms for Quantitative Verification in Constant Treewidth Graphs

MultiGain: A controller synthesis tool for MDPs with multiple mean-payoff objectives

Nested Weighted Automata

Quantitative Timed Simulation Functions and Refinement Metrics for Timed Systems (Full Version)

Strategy Complexity of Concurrent Stochastic Games with Safety and Reachability Objectives

Temporal Logic Control for Stochastic Linear Systems using Abstraction Refinement of Probabilistic Games

Verification of Markov Decision Processes using Learning Algorithms

A Framework for Automated Competitive Analysis of On-line Scheduling of Firm-Deadline Tasks

Assume-Guarantee Synthesis for Concurrent Reactive Programs with Partial Information

Automatic Generation of Alternative Starting Positions for Simple Traditional Board Games

Average Case Analysis of the Classical Algorithm for Markov Decision Processes with Büchi Objectives

CEGAR for Qualitative Analysis of Probabilistic Systems

Faster Algorithms for Algebraic Path Properties in RSMs with Constant Treewidth

Games with a Weak Adversary

Generalized Risk-Aversion in Stochastic Multi-Armed Bandits

Looking at Mean-Payoff and Total-Payoff through Windows

Markov Decision Processes with Multiple Long-run Average Objectives

Optimal Cost Almost-sure Reachability in POMDPs

POMDPs under Probabilistic Semantics

Probabilistic Opacity for Markov Decision Processes

Proceedings 3rd Workshop on Synthesis

Qualitative Analysis of Concurrent Mean-payoff Games

Randomness for Free

Strategy Synthesis for Multi-dimensional Quantitative Objectives

Symbolic Algorithms for Qualitative Analysis of Markov Decision Processes with Büchi Objectives

The Complexity of Ergodic Mean-payoff Games

The Complexity of Partial-observation Stochastic Parity Games With Finite-memory Strategies

The Value 1 Problem Under Finite-memory Strategies for Concurrent Mean-payoff Games

Approximating the minimum cycle mean

Automata with Generalized Rabin Pairs for Probabilistic Model Checking and LTL Synthesis

Better Quality in Synthesis through Quantitative Objectives

Bounded Rationality in Concurrent Parity Games

Doomsday Equilibria for Omega-Regular Games

Faster Algorithms for Markov Decision Processes with Low Treewidth

Forgiver triumphs in alternating Prisoner's Dilemma

Hyperplane Separation Technique for Multidimensional Mean-Payoff Games

Infinite-state games with finitary conditions

Obligation Blackwell Games and p-Automata

POMDPs under Probabilistic Semantics

Trading Performance for Stability in Markov Decision Processes

TTP: Tool for Tumor Progression