Researcher profile

Lillian J. Ratliff

Lillian J. Ratliff contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
9works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

9 published item(s)

preprint2026arXiv

Adaptive Calibration in Non-Stationary Environments

Making calibrated online predictions is a central challenge in modern AI systems. Much of the existing literature focuses on fully adversarial environments where outcomes may be arbitrary, leading to conservative algorithms that can perform suboptimally in more benign settings, such as when outcomes are nearly stationary. This gap raises a natural question: can we design online prediction algorithms whose calibration error automatically adapts to the degree of non-stationarity in the environment, smoothly interpolating between i.i.d. and adversarial regimes? We answer this question in the affirmative and develop a suite of algorithms that achieve adaptive calibration guarantees under multiple calibration measures. Specifically, with $T$ being the number of rounds and $C\in[0,T]$ being an unknown non-stationary measure defined as the minimal $\ell_1$ deviation of the mean outcomes, our algorithms attain $\widetilde{O}(\sqrt{T}+(TC)^{\frac{1}{3}})$ for $\ell_1$ calibration error and $\widetilde{O}((1+C)^{\frac{1}{3}})$ for both $\ell_2$ and pseudo KL calibration error. These bounds match the optimal rates in the stationary case ($C=0$) and recover known guarantees in the fully adversarial regime ($C=T$). Our approach builds on and extends prior work [Hu et al., 2026, Luo et al., 2025], introducing an epoch-based scheduling together with a novel non-uniform partition of the prediction space that allocates finer resolution near the underlying ground truth.

preprint2022arXiv

Adaptive Constraint Satisfaction for Markov Decision Process Congestion Games: Application to Transportation Networks

Under the Markov decision process (MDP) congestion game framework, we study the problem of enforcing population distribution constraints on a population of players with stochastic dynamics and coupled congestion costs. Existing research demonstrates that the constraints on the players' population distribution can be satisfied by enforcing tolls. However, computing the minimum toll value for constraint satisfaction requires accurate modeling of the player's congestion costs. Motivated by settings where an accurate congestion cost model is unavailable (e.g. transportation networks), we consider an MDP congestion game with unknown congestion costs. We assume that a constraint-enforcing authority can repeatedly enforce tolls on a population of players who converges to an $ε$-optimal population distribution for any given toll. We then construct a myopic update algorithm to compute the minimum toll value while ensuring that the constraints are satisfied on average. We analyze how the players' sub-optimal responses to tolls impact the rates of convergence towards the minimum toll value and constraint satisfaction. Finally, we construct a congestion game model for Uber drivers in Manhattan, New York City (NYC) using data from the Taxi and Limousine Commission (TLC) to illustrate how to efficiently reduce congestion while minimizing the impact on driver earnings.

preprint2022arXiv

Decision-Dependent Risk Minimization in Geometrically Decaying Dynamic Environments

This paper studies the problem of expected loss minimization given a data distribution that is dependent on the decision-maker's action and evolves dynamically in time according to a geometric decay process. Novel algorithms for both the information setting in which the decision-maker has a first order gradient oracle and the setting in which they have simply a loss function oracle are introduced. The algorithms operate on the same underlying principle: the decision-maker repeatedly deploys a fixed decision over the length of an epoch, thereby allowing the dynamically changing environment to sufficiently mix before updating the decision. The iteration complexity in each of the settings is shown to match existing rates for first and zero order stochastic gradient methods up to logarithmic factors. The algorithms are evaluated on a "semi-synthetic" example using real world data from the SFpark dynamic pricing pilot study; it is shown that the announced prices result in an improvement for the institution's objective (target occupancy), while achieving an overall reduction in parking rates.

preprint2022arXiv

General sum stochastic games with networked information flows

Inspired by applications such as supply chain management, epidemics, and social networks, we formulate a stochastic game model that addresses three key features common across these domains: 1) network-structured player interactions, 2) pair-wise mixed cooperation and competition among players, and 3) limited global information toward individual decision-making. In combination, these features pose significant challenges for black box approaches taken by deep learning-based multi-agent reinforcement learning (MARL) algorithms and deserve more detailed analysis. We formulate a networked stochastic game with pair-wise general sum objectives and asymmetrical information structure, and empirically explore the effects of information availability on the outcomes of different MARL paradigms such as individual learning and centralized learning decentralized execution.

preprint2022arXiv

Multiplayer Performative Prediction: Learning in Decision-Dependent Games

Learning problems commonly exhibit an interesting feedback mechanism wherein the population data reacts to competing decision makers' actions. This paper formulates a new game theoretic framework for this phenomenon, called "multi-player performative prediction". We focus on two distinct solution concepts, namely (i) performatively stable equilibria and (ii) Nash equilibria of the game. The latter equilibria are arguably more informative, but can be found efficiently only when the game is monotone. We show that under mild assumptions, the performatively stable equilibria can be found efficiently by a variety of algorithms, including repeated retraining and the repeated (stochastic) gradient method. We then establish transparent sufficient conditions for strong monotonicity of the game and use them to develop algorithms for finding Nash equilibria. We investigate derivative free methods and adaptive gradient algorithms wherein each player alternates between learning a parametric description of their distribution and gradient steps on the empirical risk. Synthetic and semi-synthetic numerical experiments illustrate the results.

preprint2021arXiv

Stability of Gradient Learning Dynamics in Continuous Games: Vector Action Spaces

Towards characterizing the optimization landscape of games, this paper analyzes the stability of gradient-based dynamics near fixed points of two-player continuous games. We introduce the quadratic numerical range as a method to characterize the spectrum of game dynamics and prove the robustness of equilibria to variations in learning rates. By decomposing the game Jacobian into symmetric and skew-symmetric components, we assess the contribution of a vector field's potential and rotational components to the stability of differential Nash equilibria. Our results show that in zero-sum games, all Nash are stable and robust; in potential games, all stable points are Nash. For general-sum games, we provide a sufficient condition for instability. We conclude with a numerical example in which learning with timescale separation results in faster convergence.

preprint2020arXiv

Constrained Upper Confidence Reinforcement Learning

Constrained Markov Decision Processes are a class of stochastic decision problems in which the decision maker must select a policy that satisfies auxiliary cost constraints. This paper extends upper confidence reinforcement learning for settings in which the reward function and the constraints, described by cost functions, are unknown a priori but the transition kernel is known. Such a setting is well-motivated by a number of applications including exploration of unknown, potentially unsafe, environments. We present an algorithm C-UCRL and show that it achieves sub-linear regret ($ O(T^{\frac{3}{4}}\sqrt{\log(T/δ)})$) with respect to the reward while satisfying the constraints even while learning with probability $1-δ$. Illustrative examples are provided.

preprint2020arXiv

On Gradient-Based Learning in Continuous Games

We formulate a general framework for competitive gradient-based learning that encompasses a wide breadth of multi-agent learning algorithms, and analyze the limiting behavior of competitive gradient-based learning algorithms using dynamical systems theory. For both general-sum and potential games, we characterize a non-negligible subset of the local Nash equilibria that will be avoided if each agent employs a gradient-based learning algorithm. We also shed light on the issue of convergence to non-Nash strategies in general- and zero-sum games, which may have no relevance to the underlying game, and arise solely due to the choice of algorithm. The existence and frequency of such strategies may explain some of the difficulties encountered when using gradient descent in zero-sum games as, e.g., in the training of generative adversarial networks. To reinforce the theoretical contributions, we provide empirical results that highlight the frequency of linear quadratic dynamic games (a benchmark for multi-agent reinforcement learning) that admit global Nash equilibria that are almost surely avoided by policy gradient.

preprint2020arXiv

Safe Reinforcement Learning of Control-Affine Systems with Vertex Networks

This paper focuses on finding reinforcement learning policies for control systems with hard state and action constraints. Despite its success in many domains, reinforcement learning is challenging to apply to problems with hard constraints, especially if both the state variables and actions are constrained. Previous works seeking to ensure constraint satisfaction, or safety, have focused on adding a projection step to a learned policy. Yet, this approach requires solving an optimization problem at every policy execution step, which can lead to significant computational costs. To tackle this problem, this paper proposes a new approach, termed Vertex Networks (VNs), with guarantees on safety during exploration and on learned control policies by incorporating the safety constraints into the policy network architecture. Leveraging the geometric property that all points within a convex set can be represented as the convex combination of its vertices, the proposed algorithm first learns the convex combination weights and then uses these weights along with the pre-calculated vertices to output an action. The output action is guaranteed to be safe by construction. Numerical examples illustrate that the proposed VN algorithm outperforms vanilla reinforcement learning in a variety of benchmark control tasks.