Source author record

Lillian J. Ratliff

Lillian J. Ratliff appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Machine Learning; math.OC; Computer Science and Game Theory; Systems and Control; eess.SY; Applications; Cryptography and Security; cs.CY; math.DS

Catalog footprint

What is connected

17 works

9 topics

4 close collaborators

preprint, 2026, arXiv

Adaptive Calibration in Non-Stationary Environments

Making calibrated online predictions is a central challenge in modern AI systems. Much of the existing literature focuses on fully adversarial environments where outcomes may be arbitrary, leading to conservative algorithms that can perform suboptimally in more benign settings, such as when outcomes are nearly stationary. This gap raises a natural question: can we design online prediction algorithms whose calibration error automatically adapts to the degree of non-stationarity in the environment, smoothly interpolating between i.i.d. and adversarial regimes? We answer this question in the affirmative and develop a suite of algorithms that achieve adaptive calibration guarantees under multiple calibration measures. Specifically, with $T$ being the number of rounds and $C\in[0,T]$ being an unknown non-stationary measure defined as the minimal $\ell_1$ deviation of the mean outcomes, our algorithms attain $\widetilde{O}(\sqrt{T}+(TC)^{\frac{1}{3}})$ for $\ell_1$ calibration error and $\widetilde{O}((1+C)^{\frac{1}{3}})$ for both $\ell_2$ and pseudo KL calibration error. These bounds match the optimal rates in the stationary case ($C=0$) and recover known guarantees in the fully adversarial regime ($C=T$). Our approach builds on and extends prior work [Hu et al., 2026, Luo et al., 2025], introducing an epoch-based scheduling together with a novel non-uniform partition of the prediction space that allocates finer resolution near the underlying ground truth.
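As a minimal sketch of the quantity being bounded, the snippet below computes one common form of the $\ell_1$ calibration error for binary outcomes: group the rounds by predicted value and sum the absolute accumulated residuals. The abstract does not spell out its definition, so this form, and the function name `l1_calibration_error`, are assumptions made for illustration only.

```python
from collections import defaultdict


def l1_calibration_error(predictions, outcomes):
    """Sum over distinct predicted values p of |sum_{t: p_t = p} (p_t - y_t)|.

    Assumed (common) definition; not taken from the paper itself.
    """
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have the same length")
    residual = defaultdict(float)
    for p, y in zip(predictions, outcomes):
        # Accumulate the signed gap between forecast and outcome per forecast value.
        residual[p] += p - y
    return sum(abs(r) for r in residual.values())


# A forecaster that always says 0.25 when one in four outcomes is 1.
calibrated = l1_calibration_error([0.25] * 4, [1, 0, 0, 0])
# A forecaster that always says 1.0 when the outcome is always 0.
miscalibrated = l1_calibration_error([1.0, 1.0], [0, 0])
```

Under this definition the first forecaster has zero error because its residuals cancel within the single prediction bucket, while the second accumulates one unit of error per round.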

preprint, 2022, arXiv

Adaptive Constraint Satisfaction for Markov Decision Process Congestion Games: Application to Transportation Networks

Under the Markov decision process (MDP) congestion game framework, we study the problem of enforcing population distribution constraints on a population of players with stochastic dynamics and coupled congestion costs. Existing research demonstrates that the constraints on the players' population distribution can be satisfied by enforcing tolls. However, computing the minimum toll value for constraint satisfaction requires accurate modeling of the players' congestion costs. Motivated by settings where an accurate congestion cost model is unavailable (e.g., transportation networks), we consider an MDP congestion game with unknown congestion costs. We assume that a constraint-enforcing authority can repeatedly enforce tolls on a population of players that converges to an $\epsilon$-optimal population distribution for any given toll. We then construct a myopic update algorithm to compute the minimum toll value while ensuring that the constraints are satisfied on average. We analyze how the players' sub-optimal responses to tolls impact the rates of convergence towards the minimum toll value and constraint satisfaction. Finally, we construct a congestion game model for Uber drivers in Manhattan, New York City (NYC) using data from the Taxi and Limousine Commission (TLC) to illustrate how to efficiently reduce congestion while minimizing the impact on driver earnings.

preprint, 2022, arXiv

Decision-Dependent Risk Minimization in Geometrically Decaying Dynamic Environments

This paper studies the problem of expected loss minimization given a data distribution that is dependent on the decision-maker's action and evolves dynamically in time according to a geometric decay process. Novel algorithms for both the information setting in which the decision-maker has a first order gradient oracle and the setting in which they have simply a loss function oracle are introduced. The algorithms operate on the same underlying principle: the decision-maker repeatedly deploys a fixed decision over the length of an epoch, thereby allowing the dynamically changing environment to sufficiently mix before updating the decision. The iteration complexity in each of the settings is shown to match existing rates for first and zero order stochastic gradient methods up to logarithmic factors. The algorithms are evaluated on a "semi-synthetic" example using real world data from the SFpark dynamic pricing pilot study; it is shown that the announced prices result in an improvement for the institution's objective (target occupancy), while achieving an overall reduction in parking rates.

preprint, 2022, arXiv

General sum stochastic games with networked information flows

Inspired by applications such as supply chain management, epidemics, and social networks, we formulate a stochastic game model that addresses three key features common across these domains: 1) network-structured player interactions, 2) pair-wise mixed cooperation and competition among players, and 3) limited global information toward individual decision-making. In combination, these features pose significant challenges for black box approaches taken by deep learning-based multi-agent reinforcement learning (MARL) algorithms and deserve more detailed analysis. We formulate a networked stochastic game with pair-wise general sum objectives and asymmetrical information structure, and empirically explore the effects of information availability on the outcomes of different MARL paradigms such as individual learning and centralized learning decentralized execution.

preprint, 2022, arXiv

Multiplayer Performative Prediction: Learning in Decision-Dependent Games

Learning problems commonly exhibit an interesting feedback mechanism wherein the population data reacts to competing decision makers' actions. This paper formulates a new game theoretic framework for this phenomenon, called "multi-player performative prediction". We focus on two distinct solution concepts, namely (i) performatively stable equilibria and (ii) Nash equilibria of the game. The latter equilibria are arguably more informative, but can be found efficiently only when the game is monotone. We show that under mild assumptions, the performatively stable equilibria can be found efficiently by a variety of algorithms, including repeated retraining and the repeated (stochastic) gradient method. We then establish transparent sufficient conditions for strong monotonicity of the game and use them to develop algorithms for finding Nash equilibria. We investigate derivative free methods and adaptive gradient algorithms wherein each player alternates between learning a parametric description of their distribution and gradient steps on the empirical risk. Synthetic and semi-synthetic numerical experiments illustrate the results.
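Repeated retraining can be illustrated on a toy instance. The sketch below assumes, purely for illustration, that each player minimizes a squared loss against data whose mean depends linearly on the joint decision; the names `base`, `influence`, and `repeated_retraining` are hypothetical and do not come from the paper. Under that assumption each player's retrained decision is simply the mean of the distribution induced by the previous joint decision, and a fixed point of the iteration is a performatively stable equilibrium of the toy game.

```python
def repeated_retraining(base, influence, x0, steps=200):
    """Iterate x_i <- base[i] + sum_j influence[i][j] * x[j].

    Toy model (an assumption, not the paper's setting): player i's data has
    mean base[i] + sum_j influence[i][j] * x[j], and player i minimizes the
    expected squared distance between its decision and that data.
    """
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        # Each player refits against the distribution induced by the
        # previous joint decision; under squared loss the fit is the mean.
        x = [base[i] + sum(influence[i][j] * x[j] for j in range(n))
             for i in range(n)]
    return x


base = [1.0, 2.0]
# Small entries keep the update map a contraction, so the iteration converges.
influence = [[0.1, 0.2], [0.3, 0.1]]
x_stable = repeated_retraining(base, influence, x0=[0.0, 0.0])
```

The weak dependence of the data on the decisions is what makes the iteration converge here; with large `influence` entries the same loop can diverge.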

preprint, 2021, arXiv

Stability of Gradient Learning Dynamics in Continuous Games: Vector Action Spaces

Towards characterizing the optimization landscape of games, this paper analyzes the stability of gradient-based dynamics near fixed points of two-player continuous games. We introduce the quadratic numerical range as a method to characterize the spectrum of game dynamics and prove the robustness of equilibria to variations in learning rates. By decomposing the game Jacobian into symmetric and skew-symmetric components, we assess the contribution of a vector field's potential and rotational components to the stability of differential Nash equilibria. Our results show that in zero-sum games, all Nash equilibria are stable and robust; in potential games, all stable points are Nash equilibria. For general-sum games, we provide a sufficient condition for instability. We conclude with a numerical example in which learning with timescale separation results in faster convergence.
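The symmetric/skew-symmetric decomposition mentioned above is the standard identity $J = \tfrac{1}{2}(J + J^\top) + \tfrac{1}{2}(J - J^\top)$. The following minimal sketch applies it to a small matrix given as nested lists; the function name `symmetric_skew_split` and the example matrix are illustrative assumptions, not taken from the paper.

```python
def symmetric_skew_split(J):
    """Return (S, A) with J = S + A, S symmetric and A skew-symmetric."""
    n = len(J)
    # S = (J + J^T) / 2 captures the potential-like part of the vector field.
    S = [[(J[i][j] + J[j][i]) / 2 for j in range(n)] for i in range(n)]
    # A = (J - J^T) / 2 captures the rotational part.
    A = [[(J[i][j] - J[j][i]) / 2 for j in range(n)] for i in range(n)]
    return S, A


# Hypothetical 2x2 game Jacobian used only to exercise the function.
J = [[1.0, 2.0], [-4.0, 3.0]]
S, A = symmetric_skew_split(J)
```

For this example `S` is `[[1.0, -1.0], [-1.0, 3.0]]` and `A` is `[[0.0, 3.0], [-3.0, 0.0]]`, and the two add back to `J` entry by entry.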

preprint, 2020, arXiv

Constrained Upper Confidence Reinforcement Learning

Constrained Markov Decision Processes are a class of stochastic decision problems in which the decision maker must select a policy that satisfies auxiliary cost constraints. This paper extends upper confidence reinforcement learning for settings in which the reward function and the constraints, described by cost functions, are unknown a priori but the transition kernel is known. Such a setting is well-motivated by a number of applications including exploration of unknown, potentially unsafe, environments. We present an algorithm C-UCRL and show that it achieves sub-linear regret ($O(T^{\frac{3}{4}}\sqrt{\log(T/\delta)})$) with respect to the reward while satisfying the constraints even while learning with probability $1-\delta$. Illustrative examples are provided.

preprint, 2020, arXiv

On Gradient-Based Learning in Continuous Games

We formulate a general framework for competitive gradient-based learning that encompasses a wide breadth of multi-agent learning algorithms, and analyze the limiting behavior of competitive gradient-based learning algorithms using dynamical systems theory. For both general-sum and potential games, we characterize a non-negligible subset of the local Nash equilibria that will be avoided if each agent employs a gradient-based learning algorithm. We also shed light on the issue of convergence to non-Nash strategies in general- and zero-sum games, which may have no relevance to the underlying game, and arise solely due to the choice of algorithm. The existence and frequency of such strategies may explain some of the difficulties encountered when using gradient descent in zero-sum games as, e.g., in the training of generative adversarial networks. To reinforce the theoretical contributions, we provide empirical results that highlight the frequency of linear quadratic dynamic games (a benchmark for multi-agent reinforcement learning) that admit global Nash equilibria that are almost surely avoided by policy gradient.

preprint, 2020, arXiv

Safe Reinforcement Learning of Control-Affine Systems with Vertex Networks

This paper focuses on finding reinforcement learning policies for control systems with hard state and action constraints. Despite its success in many domains, reinforcement learning is challenging to apply to problems with hard constraints, especially if both the state variables and actions are constrained. Previous works seeking to ensure constraint satisfaction, or safety, have focused on adding a projection step to a learned policy. Yet, this approach requires solving an optimization problem at every policy execution step, which can lead to significant computational costs. To tackle this problem, this paper proposes a new approach, termed Vertex Networks (VNs), with guarantees on safety during exploration and on learned control policies by incorporating the safety constraints into the policy network architecture. Leveraging the geometric property that all points within a convex set can be represented as the convex combination of its vertices, the proposed algorithm first learns the convex combination weights and then uses these weights along with the pre-calculated vertices to output an action. The output action is guaranteed to be safe by construction. Numerical examples illustrate that the proposed VN algorithm outperforms vanilla reinforcement learning in a variety of benchmark control tasks.
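The geometric idea in this abstract, that an action formed as a convex combination of the vertices of a convex set always lies in that set, can be sketched in a few lines. The snippet below assumes a softmax is used to turn unconstrained scores into convex weights; that choice, the function names, and the square action set are illustrative assumptions rather than details from the paper.

```python
import math


def softmax(logits):
    """Map arbitrary real scores to non-negative weights that sum to one."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def safe_action(logits, vertices):
    """Convex combination of pre-computed vertices, weighted by softmax(logits)."""
    weights = softmax(logits)
    dim = len(vertices[0])
    return [sum(w * v[k] for w, v in zip(weights, vertices)) for k in range(dim)]


# Hypothetical safe set: the square [-1, 1]^2, described by its four vertices.
square = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
center = safe_action([0.0, 0.0, 0.0, 0.0], square)
skewed = safe_action([5.0, -2.0, 0.3, 1.0], square)
```

Whatever scores a policy network produces, the resulting action stays inside the square, so no projection step is needed at execution time in this toy setting.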

preprint, 2016, arXiv

To Observe or Not to Observe: Queuing Game Framework for Urban Parking

We model parking in urban centers as a set of parallel queues and overlay a game theoretic structure that allows us to compare the user-selected (Nash) equilibrium to the socially optimal equilibrium. We model arriving drivers as utility maximizers and consider the game in which observing the queue length is free as well as the game in which drivers must pay to observe the queue length. In both games, drivers must decide between balking and joining. We compare the Nash-induced welfare to the socially optimal welfare. We find that gains to welfare do not require full information penetration; that is, for social welfare to increase, not everyone needs to pay to observe. Through simulation, we explore a more complex scenario where drivers decide, based on the queueing game, whether or not to enter a collection of queues over a network. We examine the occupancy-congestion relationship, an important relationship for determining the impact of parking resources on overall traffic congestion. Our simulated models use parameters informed by real-world data collected by the Seattle Department of Transportation.

preprint, 2015, arXiv

Quantifying the Utility-Privacy Tradeoff in the Smart Grid

The modernization of the electrical grid and the installation of smart meters come with many advantages to control and monitoring. However, in the wrong hands, the data might pose a privacy threat. In this paper, we consider the tradeoff between smart grid operations and the privacy of consumers. We analyze the tradeoff between smart grid operations and how often data is collected by considering a realistic direct-load control example using thermostatically controlled loads, and we give simulation results to show how its performance degrades as the sampling frequency decreases. Additionally, we introduce a new privacy metric, which we call inferential privacy. This privacy metric assumes a strong adversary model, and provides an upper bound on the adversary's ability to infer a private parameter, independent of the algorithm he uses. Combining these two results allows us to directly consider the tradeoff between better load control and consumer privacy.

preprint, 2014, arXiv

Effects of Risk on Privacy Contracts for Demand-Side Management

As smart meters continue to be deployed around the world, collecting unprecedented levels of fine-grained data about consumers, we need to find mechanisms that are fair to both (1) the electric utility that needs the data to improve its operations, and (2) the consumer who has a valuation of privacy but at the same time benefits from sharing consumption data. In this paper, we address this problem by proposing privacy contracts between electric utilities and consumers with the goal of maximizing the social welfare of both. Our mathematical model designs an optimization problem between a population of users that have different valuations on privacy and the costs of operation by the utility. We then show how contracts can change depending on the probability of a privacy breach. This line of research can help inform not only current but also future smart meter collection practices.

preprint, 2014, arXiv

Incentive Design and Utility Learning via Energy Disaggregation

The utility company has many motivations for modifying energy consumption patterns of consumers such as revenue decoupling and demand response programs. We model the utility company–consumer interaction as a principal–agent problem. We present an iterative algorithm for designing incentives while estimating the consumer's utility function. Incentives are designed using the aggregated as well as the disaggregated (device level) consumption data. We simulate the iterative control (incentive design) and estimation (utility learning and disaggregation) process for examples including the design of incentives based on the aggregate consumption data as well as the disaggregated consumption data.

preprint, 2014, arXiv

On the Characterization of Local Nash Equilibria in Continuous Games

We present a unified framework for characterizing local Nash equilibria in continuous games on either infinite-dimensional or finite-dimensional non-convex strategy spaces. We provide intrinsic necessary and sufficient first- and second-order conditions ensuring strategies constitute local Nash equilibria. We term points satisfying the sufficient conditions differential Nash equilibria. Further, we provide a sufficient condition (non-degeneracy) guaranteeing differential Nash equilibria are isolated and show that such equilibria are structurally stable. We present tutorial examples to illustrate our results and highlight degeneracies that can arise in continuous games.

preprint, 2014, arXiv

Privacy and Customer Segmentation in the Smart Grid

In the electricity grid, networked sensors which record and transmit increasingly high-granularity data are being deployed. In such a setting, privacy concerns are a natural consideration. We present an attack model for privacy breaches, and, using results from estimation theory, derive theoretical results ensuring that an adversary will fail to infer private information with a certain probability, independent of the algorithm used. We show utility companies would benefit from less noisy, higher frequency data, as it would improve various smart grid operations such as load prediction. We provide a method to quantify how smart grid operations improve as a function of higher frequency data. In order to obtain the consumer's valuation of privacy, we design a screening mechanism consisting of a menu of contracts to the energy consumer with varying guarantees of privacy. The screening process is a means to segment customers. Finally, we design insurance contracts using the probability of a privacy breach to be offered by third-party insurance companies.

preprint, 2014, arXiv

Social Game for Building Energy Efficiency: Utility Learning, Simulation, and Analysis

We describe a social game that we designed for encouraging energy efficient behavior amongst building occupants with the aim of reducing overall energy consumption in the building. Occupants vote for their desired lighting level and win points which are used in a lottery based on how far their vote is from the maximum setting. We assume that the occupants are utility maximizers and that their utility functions capture the tradeoff between winning points and their comfort level. We model the occupants as non-cooperative agents in a continuous game and we characterize their play using the Nash equilibrium concept. Using occupant voting data, we parameterize their utility functions and use a convex optimization problem to estimate the parameters. We simulate the game defined by the estimated utility functions and show that the estimated model for occupant behavior is a good predictor of their actual behavior. In addition, we show that due to the social game, there is a significant reduction in energy consumption.

preprint, 2013, arXiv

Energy Disaggregation via Adaptive Filtering

The energy disaggregation problem is recovering device level power consumption signals from the aggregate power consumption signal for a building. We show in this paper how the disaggregation problem can be reformulated as an adaptive filtering problem. This gives both a novel disaggregation algorithm and a better theoretical understanding for disaggregation. In particular, we show how the disaggregation problem can be solved online using a filter bank and discuss its optimality.
