Source author record

Ashutosh Trivedi

Ashutosh Trivedi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Logic in Computer Science Formal Languages and Automata Theory Computer Science and Game Theory Machine Learning Systems and Control Artificial Intelligence eess.SY Software Engineering Cryptography and Security Computation and Language Computational Complexity cs.CY Data Structures and Algorithms math.OC Performance Robotics

Catalog footprint

What is connected

32works

16topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference

We introduce KV-Fold, a simple, training-free long-context inference protocol that treats the key-value (KV) cache as the accumulator in a left fold over sequence chunks. At each step, the model processes the next chunk conditioned on the accumulated cache, appends the newly produced keys and values, and passes the enlarged cache forward; the same one-step update is applied repeatedly, analogous to foldl in functional programming. Building on the KV cache concatenation primitive introduced for latent multi-agent communication, we repurpose it as a chunk-to-chunk recurrence for long-context inference. When processing chunk t, the model attends to the KV cache carried from earlier chunks as a prefix, reusing its internal state across segments without modifying or retraining the model. Despite its simplicity, the induced recurrence is stable: per-step drift rises briefly and then saturates into a flat plateau that persists across deep chains. This plateau is insensitive to a 10,000x change in numerical precision, robust across chunk sizes, and consistent across model families. At the task level, KV-Fold preserves exact information over long distances. On a needle-in-a-haystack benchmark, it achieves 100% exact-match retrieval across 152 trials spanning contexts from 16K to 128K tokens and chain depths up to 511 on Llama-3.1-8B, while remaining within the memory limits of a single 40GB GPU. Compared to streaming methods, which trade fidelity for bounded memory, KV-Fold maintains long-range retrieval while operating as a sequence of tractable forward passes. Overall, our results show that frozen pretrained transformers already support a stable form of KV-cache recurrence, providing a practical route to long-context inference without architectural changes or training.

preprint2026arXiv

On the Robustness of Fairness Practices: A Causal Framework for Systematic Evaluation

Machine learning (ML) algorithms are increasingly deployed to make critical decisions in socioeconomic applications such as finance, criminal justice, and autonomous driving. However, due to their data-driven and pattern-seeking nature, ML algorithms may develop decision logic that disproportionately distributes opportunities, benefits, resources, or information among different population groups, potentially harming marginalized communities. In response to such fairness concerns, the software engineering and ML communities have made significant efforts to establish the best practices for creating fair ML software. These include fairness interventions for training ML models, such as including sensitive features, selecting non-sensitive attributes, and applying bias mitigators. But how reliably can software professionals tasked with developing data-driven systems depend on these recommendations? And how well do these practices generalize in the presence of faulty labels, missing data, or distribution shifts? These questions form the core theme of this paper.

preprint2022arXiv

Alternating Good-for-MDP Automata

When omega-regular objectives were first proposed in model-free reinforcement learning (RL) for controlling MDPs, deterministic Rabin automata were used in an attempt to provide a direct translation from their transitions to scalar values. While these translations failed, it has turned out that it is possible to repair them by using good-for-MDPs (GFM) Büchi automata instead. These are nondeterministic Büchi automata with a restricted type of nondeterminism, albeit not as restricted as in good-for-games automata. Indeed, deterministic Rabin automata have a pretty straightforward translation to such GFM automata, which is bi-linear in the number of states and pairs. Interestingly, the same cannot be said for deterministic Streett automata: a translation to nondeterministic Rabin or Büchi automata comes at an exponential cost, even without requiring the target automaton to be good-for-MDPs. Do we have to pay more than that to obtain a good-for-MDP automaton? The surprising answer is that we have to pay significantly less when we instead expand the good-for-MDP property to alternating automata: like the nondeterministic GFM automata obtained from deterministic Rabin automata, the alternating good-for-MDP automata we produce from deterministic Streett automata are bi-linear in the the size of the deterministic automaton and its index, and can therefore be exponentially more succinct than minimal nondeterministic Büchi automata.

preprint2022arXiv

Compositional Reinforcement Learning for Discrete-Time Stochastic Control Systems

We propose a compositional approach to synthesize policies for networks of continuous-space stochastic control systems with unknown dynamics using model-free reinforcement learning (RL). The approach is based on implicitly abstracting each subsystem in the network with a finite Markov decision process with unknown transition probabilities, synthesizing a strategy for each abstract model in an assume-guarantee fashion using RL, and then mapping the results back over the original network with approximate optimality guarantees. We provide lower bounds on the satisfaction probability of the overall network based on those over individual subsystems. A key contribution is to leverage the convergence results for adversarial RL (minimax Q-learning) on finite stochastic arenas to provide control strategies maximizing the probability of satisfaction over the network of continuous-space systems. We consider finite-horizon properties expressed in the syntactically co-safe fragment of linear temporal logic. These properties can readily be converted into automata-based reward functions, providing scalar reward signals suitable for RL. Since such reward functions are often sparse, we supply a potential-based reward shaping technique to accelerate learning by producing dense rewards. The effectiveness of the proposed approaches is demonstrated via two physical benchmarks including regulation of a room temperature network and control of a road traffic network.

preprint2022arXiv

Controller Synthesis for Omega-Regular and Steady-State Specifications

Given a Markov decision process (MDP) and a linear-time ($ω$-regular or LTL) specification, the controller synthesis problem aims to compute the optimal policy that satisfies the specification. More recently, problems that reason over the asymptotic behavior of systems have been proposed through the lens of steady-state planning. This entails finding a control policy for an MDP such that the Markov chain induced by the solution policy satisfies a given set of constraints on its steady-state distribution. This paper studies a generalization of the controller synthesis problem for a linear-time specification under steady-state constraints on the asymptotic behavior. We present an algorithm to find a deterministic policy satisfying $ω$-regular and steady-state constraints by characterizing the solutions as an integer linear program, and experimentally evaluate our approach.

preprint2022arXiv

Fairness-aware Configuration of Machine Learning Libraries

This paper investigates the parameter space of machine learning (ML) algorithms in aggravating or mitigating fairness bugs. Data-driven software is increasingly applied in social-critical applications where ensuring fairness is of paramount importance. The existing approaches focus on addressing fairness bugs by either modifying the input dataset or modifying the learning algorithms. On the other hand, the selection of hyperparameters, which provide finer controls of ML algorithms, may enable a less intrusive approach to influence the fairness. Can hyperparameters amplify or suppress discrimination present in the input dataset? How can we help programmers in detecting, understanding, and exploiting the role of hyperparameters to improve the fairness? We design three search-based software testing algorithms to uncover the precision-fairness frontier of the hyperparameter space. We complement these algorithms with statistical debugging to explain the role of these parameters in improving fairness. We implement the proposed approaches in the tool Parfait-ML (PARameter FAIrness Testing for ML Libraries) and show its effectiveness and utility over five mature ML algorithms as used in six social-critical applications. In these applications, our approach successfully identified hyperparameters that significantly improve (vis-a-vis the state-of-the-art techniques) the fairness without sacrificing precision. Surprisingly, for some algorithms (e.g., random forest), our approach showed that certain configuration of hyperparameters (e.g., restricting the search space of attributes) can amplify biases across applications. Upon further investigation, we found intuitive explanations of these phenomena, and the results corroborate similar observations from the literature.

preprint2022arXiv

Inferring Probabilistic Reward Machines from Non-Markovian Reward Processes for Reinforcement Learning

The success of reinforcement learning in typical settings is predicated on Markovian assumptions on the reward signal by which an agent learns optimal policies. In recent years, the use of reward machines has relaxed this assumption by enabling a structured representation of non-Markovian rewards. In particular, such representations can be used to augment the state space of the underlying decision process, thereby facilitating non-Markovian reinforcement learning. However, these reward machines cannot capture the semantics of stochastic reward signals. In this paper, we make progress on this front by introducing probabilistic reward machines (PRMs) as a representation of non-Markovian stochastic rewards. We present an algorithm to learn PRMs from the underlying decision process and prove results around its correctness and convergence.

preprint2022arXiv

Optimal Repair For Omega-regular Properties

This paper presents an optimization based framework to automate system repair against omega-regular properties. In the proposed formalization of optimal repair, the systems are represented as Kripke structures, the properties as $ω$-regular languages, and the repair space as repair machines -- weighted omega-regular transducers equipped with Büchi conditions -- that rewrite strings and associate a cost sequence to these rewritings. To translate the resulting cost-sequences to easily interpretable payoffs, we consider several aggregator functions to map cost sequences to numbers -- including limit superior, supremum, discounted-sum, and average-sum -- to define quantitative cost semantics. The problem of optimal repair, then, is to determine whether traces from a given system can be rewritten to satisfy an $ω$-regular property when the allowed cost is bounded by a given threshold. We also consider the dual challenge of impair verification that assumes that the rewritings are resolved adversarially under some given cost restriction, and asks to decide if all traces of the system satisfy the specification irrespective of the rewritings. With a negative result to the impair verification problem, we study the problem of designing a minimal mask of the Kripke structure such that the resulting traces satisfy the specifications despite the threshold-bounded impairment. We dub this problem as the mask synthesis problem. This paper presents automata-theoretic solutions to repair synthesis, impair verification, and mask synthesis problem for limit superior, supremum, discounted-sum, and average-sum cost semantics.

preprint2022arXiv

Recursive Reinforcement Learning

Recursion is the fundamental paradigm to finitely describe potentially infinite objects. As state-of-the-art reinforcement learning (RL) algorithms cannot directly reason about recursion, they must rely on the practitioner's ingenuity in designing a suitable "flat" representation of the environment. The resulting manual feature constructions and approximations are cumbersome and error-prone; their lack of transparency hampers scalability. To overcome these challenges, we develop RL algorithms capable of computing optimal policies in environments described as a collection of Markov decision processes (MDPs) that can recursively invoke one another. Each constituent MDP is characterized by several entry and exit points that correspond to input and output values of these invocations. These recursive MDPs (or RMDPs) are expressively equivalent to probabilistic pushdown systems (with call-stack playing the role of the pushdown stack), and can model probabilistic programs with recursive procedural calls. We introduce Recursive Q-learning -- a model-free RL algorithm for RMDPs -- and prove that it converges for finite, single-exit and deterministic multi-exit RMDPs under mild assumptions.

preprint2022arXiv

Secure-by-Construction Synthesis of Cyber-Physical Systems

Correct-by-construction synthesis is a cornerstone of the confluence of formal methods and control theory towards designing safety-critical systems. Instead of following the time-tested, albeit laborious (re)design-verify-validate loop, correct-by-construction methodology advocates the use of continual refinements of formal requirements -- connected by chains of formal proofs -- to build a system that assures the correctness by design. A remarkable progress has been made in scaling the scope of applicability of correct-by-construction synthesis -- with a focus on cyber-physical systems that tie discrete-event control with continuous environment -- to enlarge control systems by combining symbolic approaches with principled state-space reduction techniques. Unfortunately, in the security-critical control systems, the security properties are verified ex post facto the design process in a way that undermines the correct-by-construction paradigm. We posit that, to truly realize the dream of correct-by-construction synthesis for security-critical systems, security considerations must take center-stage with the safety considerations. Moreover, catalyzed by the recent progress on the opacity sub-classes of security properties and the notion of hyperproperties capable of combining security with safety properties, we believe that the time is ripe for the research community to holistically target the challenge of secure-by-construction synthesis. This paper details our vision by highlighting the recent progress and open challenges that may serve as bricks for providing a solid foundation for secure-by-construction synthesis of cyber-physical systems.

preprint2020arXiv

Adding Negative Prices to Priced Timed Games

Priced timed games (PTGs) are two-player zero-sum games played on the infinite graph of configurations of priced timed automata where two players take turns to choose transitions in order to optimize cost to reach target states. Bouyer et al. and Alur, Bernadsky, and Madhusudan independently proposed algorithms to solve PTGs with nonnegative prices under certain divergence restriction over prices. Brihaye, Bruyere, and Raskin later provided a justification for such a restriction by showing the undecidability of the optimal strategy synthesis problem in the absence of this divergence restriction. This problem for PTGs with one clock has long been conjectured to be in polynomial time, however the current best known algorithm, by Hansen, Ibsen-Jensen, and Miltersen, is exponential. We extend this picture by studying PTGs with both negative and positive prices. We refine the undecidability results for optimal strategy synthesis problem, and show undecidability for several variants of optimal reachability cost objectives including reachability cost, time-bounded reachability cost, and repeated reachability cost objectives. We also identify a subclass with bi-valued price-rates and give a pseudo-polynomial algorithm to partially answer the conjecture on the complexity of one-clock PTGs.

preprint2020arXiv

Data-Driven Debugging for Functional Side Channels

Information leaks through side channels are a pervasive problem, even in security-critical applications. Functional side channels arise when an attacker knows that a secret value of a server stays fixed for a certain time. Then, the attacker can observe the server executions on a sequence of different public inputs, each paired with the same secret input. Thus for each secret, the attacker observes a function from public inputs to execution time, for instance, and she can compare these functions for different secrets. First, we introduce a notion of noninterference for functional side channels. We focus on the case of noisy observations, where we demonstrate with examples that there is a practical functional side channel in programs that would be deemed information-leak-free or be underestimated using the standard definition. Second, we develop a framework and techniques for debugging programs for functional side channels. We extend evolutionary fuzzing techniques to generate inputs that exploit functional dependencies of response times on public inputs. We adapt existing results and algorithms in functional data analysis to model the functions and discover the existence of side channels. We use a functional extension of standard decision tree learning to pinpoint the code fragments causing a side channel if there is one. We empirically evaluate the performance of our tool FUCHSIA on a series of micro-benchmarks and realistic Java programs. On the set of benchmarks, we show that FUCHSIA outperforms the state-of-the-art techniques in detecting side channel classes. On the realistic programs, we show the scalability of FUCHSIA in analyzing functional side channels in Java programs with thousands of methods. Also, we show the usefulness of FUCHSIA in finding side channels including a zero-day vulnerability in OpenJDK and another vulnerability in Jetty that was since fixed by the developers.

preprint2020arXiv

Detecting and Understanding Real-World Differential Performance Bugs in Machine Learning Libraries

Programming errors that degrade the performance of systems are widespread, yet there is little tool support for analyzing these bugs. We present a method based on differential performance analysis---we find inputs for which the performance varies widely, despite having the same size. To ensure that the differences in the performance are robust (i.e. hold also for large inputs), we compare the performance of not only single inputs, but of classes of inputs, where each class has similar inputs parameterized by their size. Thus, each class is represented by a performance function from the input size to performance. Importantly, we also provide an explanation for why the performance differs in a form that can be readily used to fix a performance bug. The two main phases in our method are discovery with fuzzing and explanation with decision tree classifiers, each of which is supported by clustering. First, we propose an evolutionary fuzzing algorithm to generate inputs. For this fuzzing task, the unique challenge is that we not only need the input class with the worst performance, but rather a set of classes exhibiting differential performance. We use clustering to merge similar input classes which significantly improves the efficiency of our fuzzer. Second, we explain the differential performance in terms of program inputs and internals. We adapt discriminant learning approaches with clustering and decision trees to localize suspicious code regions. We applied our techniques to a set of applications. On a set of micro-benchmarks, we show that our approach outperforms state-of-the-art fuzzers in finding inputs to characterize the differential performance. On a set of case-studies, we discover and explain multiple performance bugs in popular machine learning frameworks. Four of these bugs, reported first in this paper, have since been fixed by the developers.

preprint2020arXiv

Formal Controller Synthesis for Continuous-Space MDPs via Model-Free Reinforcement Learning

A novel reinforcement learning scheme to synthesize policies for continuous-space Markov decision processes (MDPs) is proposed. This scheme enables one to apply model-free, off-the-shelf reinforcement learning algorithms for finite MDPs to compute optimal strategies for the corresponding continuous-space MDPs without explicitly constructing the finite-state abstraction. The proposed approach is based on abstracting the system with a finite MDP (without constructing it explicitly) with unknown transition probabilities, synthesizing strategies over the abstract MDP, and then mapping the results back over the concrete continuous-space MDP with approximate optimality guarantees. The properties of interest for the system belong to a fragment of linear temporal logic, known as syntactically co-safe linear temporal logic (scLTL), and the synthesis requirement is to maximize the probability of satisfaction within a given bounded time horizon. A key contribution of the paper is to leverage the classical convergence results for reinforcement learning on finite MDPs and provide control strategies maximizing the probability of satisfaction over unknown, continuous-space MDPs while providing probabilistic closeness guarantees. Automata-based reward functions are often sparse; we present a novel potential-based reward shaping technique to produce dense rewards to speed up learning. The effectiveness of the proposed approach is demonstrated by applying it to three physical benchmarks concerning the regulation of a room's temperature, control of a road traffic cell, and of a 7-dimensional nonlinear model of a BMW 320i car.

preprint2016arXiv

Almost-Sure Reachability in Stochastic Multi-Mode System

A constant-rate multi-mode system is a hybrid system that can switch freely among a finite set of modes, and whose dynamics is specified by a finite number of real-valued variables with mode-dependent constant rates. We introduce and study a stochastic extension of a constant-rate multi-mode system where the dynamics is specified by mode-dependent compactly supported probability distributions over a set of constant rate vectors. Given a tolerance $\varepsilon > 0$, the almost-sure reachability problem for stochastic multi-mode systems is to decide the existence of a control strategy that steers the system almost-surely from an arbitrary start state to an $\varepsilon$-neighborhood of an arbitrary target state while staying inside a pre-specified safety set. We prove a necessary and sufficient condition to decide almost-sure reachability and, using this condition, we show that almost-sure reachability can be decided in polynomial time. Our algorithm can be used as a path-following algorithm in combination with any off-the-shelf path-planning algorithm to make a robot or an autonomous vehicle with noisy low-level controllers follow a given path with arbitrary precision.

preprint2016arXiv

Expected Reachability-Time Games

Probabilistic timed automata are a suitable formalism to model systems with real-time, nondeterministic and probabilistic behaviour. We study two-player zero-sum games on such automata where the objective of the game is specified as the expected time to reach a target. The two players---called player Min and player Max---compete by proposing timed moves simultaneously and the move with a shorter delay is performed. The first player attempts to minimise the given objective while the second tries to maximise the objective. We observe that these games are not determined, and study decision problems related to computing the upper and lower values, showing that the problems are decidable and lie in the complexity class NEXPTIME $\cap$ co-NEXPTIME.

preprint2016arXiv

FO-definable transformations of infinite strings

The theory of regular and aperiodic transformations of finite strings has recently received a lot of interest. These classes can be equivalently defined using logic (Monadic second-order logic and first-order logic), two-way machines (regular two-way and aperiodic two-way transducers), and one-way register machines (regular streaming string and aperiodic streaming string transducers). These classes are known to be closed under operations such as sequential composition and regular (star-free) choice; and problems such as functional equivalence and type checking, are decidable for these classes. On the other hand, for infinite strings these results are only known for $ω$-regular transformations: Alur, Filiot, and Trivedi studied transformations of infinite strings and introduced an extension of streaming string transducers over $ω$-strings and showed that they capture monadic second-order definable transformations for infinite strings. In this paper we extend their work to recover connection for infinite strings among first-order logic definable transformations, aperiodic two-way transducers, and aperiodic streaming string transducers.

preprint2016arXiv

Mean-Payoff Games on Timed Automata

Mean-payoff games on timed automata are played on the infinite weighted graph of configurations of priced timed automata between two players, Player Min and Player Max, by moving a token along the states of the graph to form an infinite run. The goal of Player Min is to minimize the limit average weight of the run, while the goal of the Player Max is the opposite. Brenguier, Cassez, and Raskin recently studied a variation of these games and showed that mean-payoff games are undecidable for timed automata with five or more clocks. We refine this result by proving the undecidability of mean-payoff games with three clocks. On a positive side, we show the decidability of mean-payoff games on one-clock timed automata with binary price-rates. A key contribution of this paper is the application of dynamic programming based proof techniques applied in the context of average reward optimization on an uncountable state and action space.

preprint2016arXiv

On Nonlinear Prices in Timed Automata

Priced timed automata provide a natural model for quantitative analysis of real-time systems and have been successfully applied in various scheduling and planning problems. The optimal reachability problem for linearly-priced timed automata is known to be PSPACE-complete. In this paper we investigate priced timed automata with more general prices and show that in the most general setting the optimal reachability problem is undecidable. We adapt and implement the construction of Audemard, Cimatti, Kornilowicz, and Sebastiani for non-linear priced timed automata using state-of-the-art theorem prover Z3 and present some preliminary results.

preprint2016arXiv

Proceedings of the The First Workshop on Verification and Validation of Cyber-Physical Systems

The first International Workshop on Verification and Validation of Cyber-Physical Systems (V2CPS-16) was held in conjunction with the 12th International Conference on integration of Formal Methods (iFM 2016) in Reykjavik, Iceland. The purpose of V2CPS-16 was to bring together researchers and experts of the fields of formal verification and cyber-physical systems (CPS) to cover the theme of this workshop, namely a wide spectrum of verification and validation methods including (but not limited to) control, simulation, formal methods, etc. A CPS is an integration of networked computational and physical processes with meaningful inter-effects; the former monitors, controls, and affects the latter, while the latter also impacts the former. CPSs have applications in a wide-range of systems spanning robotics, transportation, communication, infrastructure, energy, and manufacturing. Many safety-critical systems such as chemical processes, medical devices, aircraft flight control, and automotive systems, are indeed CPS. The advanced capabilities of CPS require complex software and synthesis algorithms, which are hard to verify. In fact, many problems in this area are undecidable. Thus, a major step is to find particular abstractions of such systems which might be algorithmically verifiable regarding specific properties of such systems, describing the partial/overall behaviors of CPSs.

preprint2016arXiv

Stochastic Timed Games Revisited

Stochastic timed games (STGs), introduced by Bouyer and Forejt, naturally generalize both continuous-time Markov chains and timed automata by providing a partition of the locations between those controlled by two players (Player Box and Player Diamond) with competing objectives and those governed by stochastic laws. Depending on the number of players---$2$, $1$, or $0$---subclasses of stochastic timed games are often classified as $2\frac{1}{2}$-player, $1\frac{1}{2}$-player, and $\frac{1}{2}$-player games where the $\frac{1}{2}$ symbolizes the presence of the stochastic "nature" player. For STGs with reachability objectives it is known that $1\frac{1}{2}$-player one-clock STGs are decidable for qualitative objectives, and that $2\frac{1}{2}$-player three-clock STGs are undecidable for quantitative reachability objectives. This paper further refines the gap in this decidability spectrum. We show that quantitative reachability objectives are already undecidable for $1\frac{1}{2}$ player four-clock STGs, and even under the time-bounded restriction for $2\frac{1}{2}$-player five-clock STGs. We also obtain a class of $1\frac{1}{2}$, $2\frac{1}{2}$ player STGs for which the quantitative reachability problem is decidable.

preprint2015arXiv

Hybrid Automata for Formal Modeling and Verification of Cyber-Physical Systems

The presence of a tight integration between the discrete control (the "cyber") and the analog environment (the "physical")---via sensors and actuators over wired or wireless communication networks---is the defining feature of cyber-physical systems. Hence, the functional correctness of a cyber- physical system is crucially dependent not only on the dynamics of the analog physical environment, but also on the decisions taken by the discrete control that alter the dynamics of the environment. The framework of Hybrid automata---introduced by Alur, Courcoubetis, Henzinger, and Ho---provides a formal modeling and specification environment to analyze the interaction between the discrete and continuous parts of a cyber-physical system. Hybrid automata can be considered as generalizations of finite state automata augmented with a finite set of real-valued variables whose dynamics in each state is governed by a system of ordinary differential equations. Moreover, the discrete transitions of hybrid automata are guarded by constraints over the values of these real-valued variables, and enable discontinuous jumps in the evolution of these variables. Considering the richness of the dynamics in a hybrid automaton, it is perhaps not surprising that the fundamental verification questions, like reachability and schedulability, for the general model are undecidable. In this article we present a review of hybrid automata as modeling and verification framework for cyber-physical systems, and survey some of the key results related to practical verification questions related to hybrid automata.

preprint2015arXiv

Incentive Stackelberg Mean-payoff Games

We introduce and study incentive equilibria for multi-player meanpayoff games. Incentive equilibria generalise well-studied solution concepts such as Nash equilibria and leader equilibria (also known as Stackelberg equilibria). Recall that a strategy profile is a Nash equilibrium if no player can improve his payoff by changing his strategy unilaterally. In the setting of incentive and leader equilibria, there is a distinguished player called the leader who can assign strategies to all other players, referred to as her followers. A strategy profile is a leader strategy profile if no player, except for the leader, can improve his payoff by changing his strategy unilaterally, and a leader equilibrium is a leader strategy profile with a maximal return for the leader. In the proposed case of incentive equilibria, the leader can additionally influence the behaviour of her followers by transferring parts of her payoff to her followers. The ability to incentivise her followers provides the leader with more freedom in selecting strategy profiles, and we show that this can indeed improve the payoff for the leader in such games. The key fundamental result of the paper is the existence of incentive equilibria in mean-payoff games. We further show that the decision problem related to constructing incentive equilibria is NP-complete. On a positive note, we show that, when the number of players is fixed, the complexity of the problem falls in the same class as two-player mean-payoff games. We also present an implementation of the proposed algorithms, and discuss experimental results that demonstrate the feasibility of the analysis of medium sized games.

preprint2015arXiv

Revisiting Robustness in Priced Timed Games

Priced timed games are optimal-cost reachability games played between two players---the controller and the environment---by moving a token along the edges of infinite graphs of configurations of priced timed automata. The goal of the controller is to reach a given set of target locations as cheaply as possible, while the goal of the environment is the opposite. Priced timed games are known to be undecidable for timed automata with $3$ or more clocks, while they are known to be decidable for automata with $1$ clock. In an attempt to recover decidability for priced timed games Bouyer, Markey, and Sankur studied robust priced timed games where the environment has the power to slightly perturb delays proposed by the controller. Unfortunately, however, they showed that the natural problem of deciding the existence of optimal limit-strategy---optimal strategy of the controller where the perturbations tend to vanish in the limit---is undecidable with $10$ or more clocks. In this paper we revisit this problem and improve our understanding of the decidability of these games. We show that the limit-strategy problem is already undecidable for a subclass of robust priced timed games with $5$ or more clocks. On a positive side, we show the decidability of the existence of almost optimal strategies for the same subclass of one-clock robust priced timed games by adapting a classical construction by Bouyer at al. for one-clock priced timed games.

preprint2015arXiv

Skolem Functions for Factored Formulas

Given a propositional formula F(x,y), a Skolem function for x is a function Ψ(y), such that substituting Ψ(y) for x in F gives a formula semantically equivalent to \exists F. Automatically generating Skolem functions is of significant interest in several applications including certified QBF solving, finding strategies of players in games, synthesising circuits and bit-vector programs from specifications, disjunctive decomposition of sequential circuits etc. In many such applications, F is given as a conjunction of factors, each of which depends on a small subset of variables. Existing algorithms for Skolem function generation ignore any such factored form and treat F as a monolithic function. This presents scalability hurdles in medium to large problem instances. In this paper, we argue that exploiting the factored form of F can give significant performance improvements in practice when computing Skolem functions. We present a new CEGAR style algorithm for generating Skolem functions from factored propositional formulas. In contrast to earlier work, our algorithm neither requires a proof of QBF satisfiability nor uses composition of monolithic conjunctions of factors. We show experimentally that our algorithm generates smaller Skolem functions and outperforms state-of-the-art approaches on several large benchmarks.

preprint2015arXiv

Symmetric Strategy Improvement

Symmetry is inherent in the definition of most of the two-player zero-sum games, including parity, mean-payoff, and discounted-payoff games. It is therefore quite surprising that no symmetric analysis techniques for these games exist. We develop a novel symmetric strategy improvement algorithm where, in each iteration, the strategies of both players are improved simultaneously. We show that symmetric strategy improvement defies Friedmann's traps, which shook the belief in the potential of classic strategy improvement to be polynomial.

preprint2014arXiv

Bounded-Rate Multi-Mode Systems Based Motion Planning

Bounded-rate multi-mode systems are hybrid systems that can switch among a finite set of modes. Its dynamics is specified by a finite number of real-valued variables with mode-dependent rates that can vary within given bounded sets. Given an arbitrary piecewise linear trajectory, we study the problem of following the trajectory with arbitrary precision, using motion primitives given as bounded-rate multi-mode systems. We give an algorithm to solve the problem and show that the problem is co-NP complete. We further prove that the problem can be solved in polynomial time for multi-mode systems with fixed dimension. We study the problem with dwell-time requirement and show the decidability of the problem under certain positivity restriction on the rate vectors. Finally, we show that introducing structure to the multi-mode systems leads to undecidability, even when using only a single clock variable.

preprint2014arXiv

First-order definable string transformations

The connection between languages defined by computational models and logic for languages is well-studied. Monadic second-order logic and finite automata are shown to closely correspond to each-other for the languages of strings, trees, and partial-orders. Similar connections are shown for first-order logic and finite automata with certain aperiodicity restriction. Courcelle in 1994 proposed a way to use logic to define functions over structures where the output structure is defined using logical formulas interpreted over the input structure. Engelfriet and Hoogeboom discovered the corresponding "automata connection" by showing that two-way generalised sequential machines capture the class of monadic-second order definable transformations. Alur and Cerny further refined the result by proposing a one-way deterministic transducer model with string variables---called the streaming string transducers---to capture the same class of transformations. In this paper we establish a transducer-logic correspondence for Courcelle's first-order definable string transformations. We propose a new notion of transition monoid for streaming string transducers that involves structural properties of both underlying input automata and variable dependencies. By putting an aperiodicity restriction on the transition monoids, we define a class of streaming string transducers that captures exactly the class of first-order definable transformations.

preprint2014arXiv

On The Reachability Problem for Recursive Hybrid Automata with One and Two Players

Motivated by the success of bounded model checking framework for finite state machines, Ouaknine and Worrell proposed a time-bounded theory of real-time verification by claiming that restriction to bounded-time recovers decidability for several key decision problem related to real-time verification. In support of this theory, the list of undecidable problems recently shown decidable under time-bounded restriction is rather impressive: language inclusion for timed automata, emptiness problem for alternating timed automata, and emptiness problem for rectangular hybrid automata. The objective of our study was to recover decidability for general recursive timed automata---and perhaps for recursive hybrid automata---under time-bounded restriction in order to provide an appealing verification framework for powerful modeling environments such as Stateflow/Simulink. Unfortunately, however, we answer this question in negative by showing that time-bounded reachability problem stays undecidable for recursive timed automata with five or more clocks. While the bad news continues even when one considers well-behaved subclasses of recursive hybrid automata, we recover decidability by considering recursive hybrid automata with bounded context using a pass-by-reference mechanism, or by restricting the number of variables to two, with rates in $\{0,1\}$.

preprint2013arXiv

Safe Schedulability of Bounded-Rate Multi-Mode Systems

Bounded-rate multi-mode systems (BMMS) are hybrid systems that can switch freely among a finite set of modes, and whose dynamics is specified by a finite number of real-valued variables with mode-dependent rates that can vary within given bounded sets. The schedulability problem for BMMS is defined as an infinite-round game between two players---the scheduler and the environment---where in each round the scheduler proposes a time and a mode while the environment chooses an allowable rate for that mode, and the state of the system changes linearly in the direction of the rate vector. The goal of the scheduler is to keep the state of the system within a pre-specified safe set using a non-Zeno schedule, while the goal of the environment is the opposite. Green scheduling under uncertainty is a paradigmatic example of BMMS where a winning strategy of the scheduler corresponds to a robust energy-optimal policy. We present an algorithm to decide whether the scheduler has a winning strategy from an arbitrary starting state, and give an algorithm to compute such a winning strategy, if it exists. We show that the schedulability problem for BMMS is co-NP complete in general, but for two variables it is in PTIME. We also study the discrete schedulability problem where the environment has only finitely many choices of rate vectors in each mode and the scheduler can make decisions only at multiples of a given clock period, and show it to be EXPTIME-complete.

preprint2009arXiv

Average-Time Games on Timed Automata

An average-time game is played on the infinite graph of configurations of a finite timed automaton. The two players, Min and Max, construct an infinite run of the automaton by taking turns to perform a timed transition. Player Min wants to minimise the average time per transition and player Max wants to maximise it. A solution of average-time games is presented using a reduction to average-price game on a finite graph. A direct consequence is an elementary proof of determinacy for average-time games. This complements our results for reachability-time games and partially solves a problem posed by Bouyer et al., to design an algorithm for solving average-price games on priced timed automata. The paper also establishes the exact computational complexity of solving average-time games: the problem is EXPTIME-complete for timed automata with at least two clocks.

preprint2009arXiv

Reachability-time games on timed automata

In a reachability-time game, players Min and Max choose moves so that the time to reach a final state in a timed automaton is minimised or maximised, respectively. Asarin and Maler showed decidability of reachability-time games on strongly non-Zeno timed automata using a value iteration algorithm. This paper complements their work by providing a strategy improvement algorithm for the problem. It also generalizes their decidability result because the proposed strategy improvement algorithm solves reachability-time games on all timed automata. The exact computational complexity of solving reachability-time games is also established: the problem is EXPTIME-complete for timed automata with at least two clocks.

Ashutosh Trivedi

What is connected

Connect this record

See the researcher in context

Building this map preview

32 published item(s)

KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference

On the Robustness of Fairness Practices: A Causal Framework for Systematic Evaluation

Alternating Good-for-MDP Automata

Compositional Reinforcement Learning for Discrete-Time Stochastic Control Systems

Controller Synthesis for Omega-Regular and Steady-State Specifications

Fairness-aware Configuration of Machine Learning Libraries

Inferring Probabilistic Reward Machines from Non-Markovian Reward Processes for Reinforcement Learning

Optimal Repair For Omega-regular Properties

Recursive Reinforcement Learning

Secure-by-Construction Synthesis of Cyber-Physical Systems

Adding Negative Prices to Priced Timed Games

Data-Driven Debugging for Functional Side Channels

Detecting and Understanding Real-World Differential Performance Bugs in Machine Learning Libraries

Formal Controller Synthesis for Continuous-Space MDPs via Model-Free Reinforcement Learning

Almost-Sure Reachability in Stochastic Multi-Mode System

Expected Reachability-Time Games

FO-definable transformations of infinite strings

Mean-Payoff Games on Timed Automata

On Nonlinear Prices in Timed Automata

Proceedings of the The First Workshop on Verification and Validation of Cyber-Physical Systems

Stochastic Timed Games Revisited

Hybrid Automata for Formal Modeling and Verification of Cyber-Physical Systems

Incentive Stackelberg Mean-payoff Games

Revisiting Robustness in Priced Timed Games

Skolem Functions for Factored Formulas

Symmetric Strategy Improvement

Bounded-Rate Multi-Mode Systems Based Motion Planning

First-order definable string transformations

On The Reachability Problem for Recursive Hybrid Automata with One and Two Players

Safe Schedulability of Bounded-Rate Multi-Mode Systems

Average-Time Games on Timed Automata

Reachability-time games on timed automata