Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
20works
0followers
15topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

20 published item(s)

preprint2026arXiv

Automated Approach for Solving Infinite-state Polynomial Reachability Games

Reachability games are two-player games played on a graph, where the objective of $\texttt{REACH}$ player is to reach the target set whereas the objective of $\texttt{SAFE}$ player is to stay away from the target set. Reachability games have important applications in artificial intelligence and reactive synthesis, and many of these applications give rise to infinite-state reachability games. In this paper, we study turn-based reachability games on infinite-state graphs defined over valuations of a finite set of real variables. We consider the problem of determining the existence of and computing a winning strategy for $\texttt{REACH}$ player. Our contributions are twofold. First, we propose ranking certificates for reachability games, a sound and complete proof rule for proving that $\texttt{REACH}$ player has a winning strategy from the specified initial state. Second, we consider polynomial reachability games, where transitions and objectives are described by polynomial constraints over real variables, and propose a fully automated algorithm for computing a winning strategy for $\texttt{REACH}$ player together with a formal correctness witness in the form of a ranking certificate. The algorithm is sound, semi-complete, and runs in sub-exponential time. Our experiments demonstrate the ability of our method to solve challenging examples from the literature that were out of the reach of existing methods. Specifically, for the classical Cinderella-Stepmother game, we are able to compute an optimal winning strategy for an arbitrary precision parameter for the first time.

preprint2026arXiv

Genotype specificity and spatial arrangement govern the direction and magnitude of selection in variable environments

Spatial environmental variation can either amplify or suppress the fixation of beneficial mutants in structured populations, yet the interplay of ecological factors and spatial structure in determining which outcome occurs remains theoretically unresolved. Here, we develop a unified framework for selection on lattice graphs with environmental heterogeneity, in which mutant and resident fitness depend on the local environmental state. Across three common classes of genotype-environment interactions and a wide range of spatial arrangements of environmental states, we identify two governing principles. Genotype specificity determines the direction of the effect: heterogeneity amplifies selection when it modulates resident fitness, but suppresses selection when it modulates mutant fitness, with genotype-symmetric modulation producing weaker amplification. Spatial arrangement determines the magnitude: intermixed versus clustered environments tune the strength of amplification or suppression without reversing the direction of the effect. Together, these principles reconcile disparate theoretical results and provide predictive criteria for adaptation in heterogeneous landscapes, from microbial communities to somatic evolution and cancer.

preprint2026arXiv

Multi-Environment POMDPs with Finite-Horizon Objectives

Partially Observable Markov Decision Processes (POMDPs) are systems in which one agent interacts with a stochastic environment, and receives only partial information about the current state. In a multi-environment POMDP (MEPOMDP), the initial state is unknown, and assumed to be adversarially chosen. In this work we focus on computing the optimal value and policy in MEPOMDPs with finite-horizon objectives. That problem is known to be PSPACE-complete in POMDPs. Our main results are as follows: (1) we establish that it is also PSPACE-complete in the more general setting of MEPOMDPs; (2) we present a practical algorithm and evaluate it on classical benchmarks, significantly outperforming the only previously known algorithm.

preprint2022arXiv

Evolutionary instability of selfish learning in repeated games

Across many domains of interaction, both natural and artificial, individuals use past experience to shape future behaviors. The results of such learning processes depend on what individuals wish to maximize. A natural objective is one's own success. However, when two such "selfish" learners interact with each other, the outcome can be detrimental to both, especially when there are conflicts of interest. Here, we explore how a learner can align incentives with a selfish opponent. Moreover, we consider the dynamics that arise when learning rules themselves are subject to evolutionary pressure. By combining extensive simulations and analytical techniques, we demonstrate that selfish learning is unstable in most classical two-player repeated games. If evolution operates on the level of long-run payoffs, selection instead favors learning rules that incorporate social (other-regarding) preferences. To further corroborate these results, we analyze data from a repeated prisoner's dilemma experiment. We find that selfish learning is insufficient to explain human behavior when there is a trade-off between payoff maximization and fairness.

preprint2022arXiv

Learning Stabilizing Policies in Stochastic Control Systems

In this work, we address the problem of learning provably stable neural network policies for stochastic control systems. While recent work has demonstrated the feasibility of certifying given policies using martingale theory, the problem of how to learn such policies is little explored. Here, we study the effectiveness of jointly learning a policy together with a martingale certificate that proves its stability using a single learning algorithm. We observe that the joint optimization problem becomes easily stuck in local minima when starting from a randomly initialized policy. Our results suggest that some form of pre-training of the policy is required for the joint optimization to repair and verify the policy successfully.

preprint2021arXiv

On Satisficing in Quantitative Games

Several problems in planning and reactive synthesis can be reduced to the analysis of two-player quantitative graph games. {\em Optimization} is one form of analysis. We argue that in many cases it may be better to replace the optimization problem with the {\em satisficing problem}, where instead of searching for optimal solutions, the goal is to search for solutions that adhere to a given threshold bound. This work defines and investigates the satisficing problem on a two-player graph game with the discounted-sum cost model. We show that while the satisficing problem can be solved using numerical methods just like the optimization problem, this approach does not render compelling benefits over optimization. When the discount factor is, however, an integer, we present another approach to satisficing, which is purely based on automata methods. We show that this approach is algorithmically more performant -- both theoretically and empirically -- and demonstrates the broader applicability of satisficing overoptimization.

preprint2020arXiv

Approximating Values of Generalized-Reachability Stochastic Games

Simple stochastic games are turn-based 2.5-player games with a reachability objective. The basic question asks whether one player can ensure reaching a given target with at least a given probability. A natural extension is games with a conjunction of such conditions as objective. Despite a plethora of recent results on the analysis of systems with multiple objectives, the decidability of this basic problem remains open. In this paper, we present an algorithm approximating the Pareto frontier of the achievable values to a given precision. Moreover, it is an anytime algorithm, meaning it can be stopped at any time returning the current approximation and its error bound.

preprint2020arXiv

Concentration-Bound Analysis for Probabilistic Programs and Probabilistic Recurrence Relations

Analyzing probabilistic programs and randomized algorithms are classical problems in computer science. The first basic problem in the analysis of stochastic processes is to consider the expectation or mean, and another basic problem is to consider concentration bounds, i.e. showing that large deviations from the mean have small probability. Similarly, in the context of probabilistic programs and randomized algorithms, the analysis of expected termination time/running time and their concentration bounds are fundamental problems.In this work, we focus on concentration bounds for probabilistic programs and probabilistic recurrences of randomized algorithms. For probabilistic programs, the basic technique to achieve concentration bounds is to consider martingales and apply the classical Azuma's inequality. For probabilistic recurrences of randomized algorithms, Karp's classical "cookbook" method, which is similar to the master theorem for recurrences, is the standard approach to obtain concentration bounds. In this work, we propose a novel approach for deriving concentration bounds for probabilistic programs and probabilistic recurrence relations through the synthesis of exponential supermartingales. For probabilistic programs, we present algorithms for synthesis of such supermartingales in several cases. We also show that our approach can derive better concentration bounds than simply applying the classical Azuma's inequality over various probabilistic programs considered in the literature. For probabilistic recurrences, our approach can derive tighter bounds than the Karp's well-established methods on classical algorithms. Moreover, we show that our approach could derive bounds comparable to the optimal bound for quicksort, proposed by McDiarmid and Hayward. We also present a prototype implementation that can automatically infer these bounds

preprint2020arXiv

Faster Algorithms for Quantitative Analysis of Markov Chains and Markov Decision Processes with Small Treewidth

Discrete-time Markov Chains (MCs) and Markov Decision Processes (MDPs) are two standard formalisms in system analysis. Their main associated quantitative objectives are hitting probabilities, discounted sum, and mean payoff. Although there are many techniques for computing these objectives in general MCs/MDPs, they have not been thoroughly studied in terms of parameterized algorithms, particularly when treewidth is used as the parameter. This is in sharp contrast to qualitative objectives for MCs, MDPs and graph games, for which treewidth-based algorithms yield significant complexity improvements. In this work, we show that treewidth can also be used to obtain faster algorithms for the quantitative problems. For an MC with $n$ states and $m$ transitions, we show that each of the classical quantitative objectives can be computed in $O((n+m)\cdot t^2)$ time, given a tree decomposition of the MC that has width $t$. Our results also imply a bound of $O(κ\cdot (n+m)\cdot t^2)$ for each objective on MDPs, where $κ$ is the number of strategy-iteration refinements required for the given input and objective. Finally, we make an experimental evaluation of our new algorithms on low-treewidth MCs and MDPs obtained from the DaCapo benchmark suite. Our experimental results show that on MCs and MDPs with small treewidth, our algorithms outperform existing well-established methods by one or more orders of magnitude.

preprint2020arXiv

Inductive Reachability Witnesses

In this work, we consider the fundamental problem of reachability analysis over imperative programs with real variables. The reachability property requires that a program can reach certain target states during its execution. Previous works that tackle reachability analysis are either unable to handle programs consisting of general loops (e.g. symbolic execution), or lack completeness guarantees (e.g. abstract interpretation), or are not automated (e.g. incorrectness logic/reverse Hoare logic). In contrast, we propose a novel approach for reachability analysis that can handle general programs, is (semi-)complete, and can be entirely automated for a wide family of programs. Our approach extends techniques from both invariant generation and ranking-function synthesis to reachability analysis through the notion of (Universal) Inductive Reachability Witnesses (IRWs/UIRWs). While traditional invariant generation uses over-approximations of reachable states, we consider the natural dual problem of under-approximating the set of program states that can reach a target state. We then apply an argument similar to ranking functions to ensure that all states in our under-approximation can indeed reach the target set in finitely many steps.

preprint2020arXiv

Infection dynamics of COVID-19 virus under lockdown and reopening

Motivated by COVID-19, we develop and analyze a simple stochastic model for a disease spread in human population. We track how the number of infected and critically ill people develops over time in order to estimate the demand that is imposed on the hospital system. To keep this demand under control, we consider a class of simple policies for slowing down and reopening the society and we compare their efficiency in mitigating the spread of the virus from several different points of view. We find that in order to avoid overwhelming of the hospital system, a policy must impose a harsh lockdown or it must react swiftly (or both). While reacting swiftly is universally beneficial, being harsh pays off only when the country is patient about reopening and when the neighboring countries coordinate their mitigation efforts. Our work highlights the importance of acting decisively when closing down and the importance of patience and coordination between neighboring countries when reopening.

preprint2020arXiv

Multi-dimensional Long-Run Average Problems for Vector Addition Systems with States

A vector addition system with states (VASS) consists of a finite set of states and counters. A transition changes the current state to the next state, and every counter is either incremented, or decremented, or left unchanged. A state and value for each counter is a configuration; and a computation is an infinite sequence of configurations with transitions between successive configurations. A probabilistic VASS consists of a VASS along with a probability distribution over the transitions for each state. Qualitative properties such as state and configuration reachability have been widely studied for VASS. In this work we consider multi-dimensional long-run average objectives for VASS and probabilistic VASS. For a counter, the cost of a configuration is the value of the counter; and the long-run average value of a computation for the counter is the long-run average of the costs of the configurations in the computation. The multi-dimensional long-run average problem given a VASS and a threshold value for each counter, asks whether there is a computation such that for each counter the long-run average value for the counter does not exceed the respective threshold. For probabilistic VASS, instead of the existence of a computation, we consider whether the expected long-run average value for each counter does not exceed the respective threshold. Our main results are as follows: we show that the multi-dimensional long-run average problem (a) is NP-complete for integer-valued VASS; (b) is undecidable for natural-valued VASS (i.e., nonnegative counters); and (c) can be solved in polynomial time for probabilistic integer-valued VASS, and probabilistic natural-valued VASS when all computations are non-terminating.

preprint2020arXiv

Optimal and Perfectly Parallel Algorithms for On-demand Data-flow Analysis

Interprocedural data-flow analyses form an expressive and useful paradigm of numerous static analysis applications, such as live variables analysis, alias analysis and null pointers analysis. The most widely-used framework for interprocedural data-flow analysis is IFDS, which encompasses distributive data-flow functions over a finite domain. On-demand data-flow analyses restrict the focus of the analysis on specific program locations and data facts. This setting provides a natural split between (i) an offline (or preprocessing) phase, where the program is partially analyzed and analysis summaries are created, and (ii) an online (or query) phase, where analysis queries arrive on demand and the summaries are used to speed up answering queries. In this work, we consider on-demand IFDS analyses where the queries concern program locations of the same procedure (aka same-context queries). We exploit the fact that flow graphs of programs have low treewidth to develop faster algorithms that are space and time optimal for many common data-flow analyses, in both the preprocessing and the query phase. We also use treewidth to develop query solutions that are embarrassingly parallelizable, i.e. the total work for answering each query is split to a number of threads such that each thread performs only a constant amount of work. Finally, we implement a static analyzer based on our algorithms, and perform a series of on-demand analysis experiments on standard benchmarks. Our experimental results show a drastic speed-up of the queries after only a lightweight preprocessing phase, which significantly outperforms existing techniques.

preprint2020arXiv

Polynomial Invariant Generation for Non-deterministic Recursive Programs

We consider the classical problem of invariant generation for programs with polynomial assignments and focus on synthesizing invariants that are a conjunction of strict polynomial inequalities. We present a sound and semi-complete method based on positivstellensaetze, i.e. theorems in semi-algebraic geometry that characterize positive polynomials over a semi-algebraic set. To the best of our knowledge, this is the first invariant generation method to provide completeness guarantees for invariants consisting of polynomial inequalities. Moreover, on the theoretical side, the worst-case complexity of our approach is subexponential, whereas the worst-case complexity of the previously-known complete method (Colon et al, CAV 2003), which could only handle linear invariants, is exponential. On the practical side, we reduce the invariant generation problem to quadratic programming (QCLP), which is a classical optimization problem with many industrial solvers. Finally, we demonstrate the applicability of our approach by providing experimental results on several academic benchmarks.

preprint2020arXiv

Reinforcement Learning of Risk-Constrained Policies in Markov Decision Processes

Markov decision processes (MDPs) are the defacto frame-work for sequential decision making in the presence ofstochastic uncertainty. A classical optimization criterion forMDPs is to maximize the expected discounted-sum pay-off, which ignores low probability catastrophic events withhighly negative impact on the system. On the other hand,risk-averse policies require the probability of undesirableevents to be below a given threshold, but they do not accountfor optimization of the expected payoff. We consider MDPswith discounted-sum payoff with failure states which repre-sent catastrophic outcomes. The objective of risk-constrainedplanning is to maximize the expected discounted-sum payoffamong risk-averse policies that ensure the probability to en-counter a failure state is below a desired threshold. Our maincontribution is an efficient risk-constrained planning algo-rithm that combines UCT-like search with a predictor learnedthrough interaction with the MDP (in the style of AlphaZero)and with a risk-constrained action selection via linear pro-gramming. We demonstrate the effectiveness of our approachwith experiments on classical MDPs from the literature, in-cluding benchmarks with an order of 10^6 states.

preprint2020arXiv

Simplified Game of Life: Algorithms and Complexity

Game of Life is a simple and elegant model to study dynamical system over networks. The model consists of a graph where every vertex has one of two types, namely, dead or alive. A configuration is a mapping of the vertices to the types. An update rule describes how the type of a vertex is updated given the types of its neighbors. In every round, all vertices are updated synchronously, which leads to a configuration update. While in general, Game of Life allows a broad range of update rules, we focus on two simple families of update rules, namely, underpopulation and overpopulation, that model several interesting dynamics studied in the literature. In both settings, a dead vertex requires at least a desired number of live neighbors to become alive. For underpopulation (resp., overpopulation), a live vertex requires at least (resp. at most) a desired number of live neighbors to remain alive. We study the basic computation problems, e.g., configuration reachability, for these two families of rules. For underpopulation rules, we show that these problems can be solved in polynomial time, whereas for overpopulation rules they are PSPACE-complete.

preprint2020arXiv

Stochastic Games with Lexicographic Reachability-Safety Objectives

We study turn-based stochastic zero-sum games with lexicographic preferences over reachability and safety objectives. Stochastic games are standard models in control, verification, and synthesis of stochastic reactive systems that exhibit both randomness as well as angelic and demonic non-determinism. Lexicographic order allows to consider multiple objectives with a strict preference order over the satisfaction of the objectives. To the best of our knowledge, stochastic games with lexicographic objectives have not been studied before. We establish determinacy of such games and present strategy and computational complexity results. For strategy complexity, we show that lexicographically optimal strategies exist that are deterministic and memory is only required to remember the already satisfied and violated objectives. For a constant number of objectives, we show that the relevant decision problem is in NP $\cap$ coNP, matching the current known bound for single objectives; and in general the decision problem is PSPACE-hard and can be solved in NEXPTIME $\cap$ coNEXPTIME. We present an algorithm that computes the lexicographically optimal strategies via a reduction to computation of optimal strategies in a sequence of single-objectives games. We have implemented our algorithm and report experimental results on various case studies.

preprint2020arXiv

The Moran process on 2-chromatic graphs

Resources are rarely distributed uniformly within a population. Heterogeneity in the concentration of a drug, the quality of breeding sites, or wealth can all affect evolutionary dynamics. In this study, we represent a collection of properties affecting the fitness at a given location using a color. A green node is rich in resources while a red node is poorer. More colors can represent a broader spectrum of resource qualities. For a population evolving according to the birth-death Moran model, the first question we address is which structures, identified by graph connectivity and graph coloring, are evolutionarily equivalent. We prove that all properly two-colored, undirected, regular graphs are evolutionarily equivalent (where "properly colored" means that no two neighbors have the same color). We then compare the effects of background heterogeneity on properly two-colored graphs to those with alternative schemes in which the colors are permuted. Finally, we discuss dynamic coloring as a model for spatiotemporal resource fluctuations, and we illustrate that random dynamic colorings often diminish the effects of background heterogeneity relative to a proper two-coloring.

preprint2019arXiv

Limits on amplifiers of natural selection under death-Birth updating

The fixation probability of a single mutant invading a population of residents is among the most widely-studied quantities in evolutionary dynamics. Amplifiers of natural selection are population structures that increase the fixation probability of advantageous mutants, compared to well-mixed populations. Extensive studies have shown that many amplifiers exist for the Birth-death Moran process, some of them substantially increasing the fixation probability or even guaranteeing fixation in the limit of large population size. On the other hand, no amplifiers are known for the death-Birth Moran process, and computer-assisted exhaustive searches have failed to discover amplification. In this work we resolve this disparity, by showing that any amplification under death-Birth updating is necessarily \emph{bounded} and \emph{transient}. Our boundedness result states that even if a population structure does amplify selection, the resulting fixation probability is close to that of the well-mixed population. Our transience result states that for any population structure there exists a threshold $r^*$ such that the population structure ceases to amplify selection if the mutant fitness advantage $r$ is larger than $r^\star$. Finally, we also extend the above results to $δ$-death-Birth updating, which is a combination of Birth-death and death-Birth updating. On the positive side, we identify population structures that maintain amplification for a wide range of values $r$ and $δ$. These results demonstrate that amplification of natural selection depends on the specific mechanisms of the evolutionary process.

preprint2017arXiv

Faster Algorithms for Weighted Recursive State Machines

Pushdown systems (PDSs) and recursive state machines (RSMs), which are linearly equivalent, are standard models for interprocedural analysis. Yet RSMs are more convenient as they (a) explicitly model function calls and returns, and (b) specify many natural parameters for algorithmic analysis, e.g., the number of entries and exits. We consider a general framework where RSM transitions are labeled from a semiring and path properties are algebraic with semiring operations, which can model, e.g., interprocedural reachability and dataflow analysis problems. Our main contributions are new algorithms for several fundamental problems. As compared to a direct translation of RSMs to PDSs and the best-known existing bounds of PDSs, our analysis algorithm improves the complexity for finite-height semirings (that subsumes reachability and standard dataflow properties). We further consider the problem of extracting distance values from the representation structures computed by our algorithm, and give efficient algorithms that distinguish the complexity of a one-time preprocessing from the complexity of each individual query. Another advantage of our algorithm is that our improvements carry over to the concurrent setting, where we improve the best-known complexity for the context-bounded analysis of concurrent RSMs. Finally, we provide a prototype implementation that gives a significant speed-up on several benchmarks from the SLAM/SDV project.