Source author record

Alexandre Bayen

Alexandre Bayen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Systems and Control, eess.SY, Artificial Intelligence, Machine Learning, math.OC, Multiagent Systems, Data Structures and Algorithms, Computational Complexity, Computer Science and Game Theory, cs.CY, math.AP, Numerical Analysis, physics.soc-ph

Catalog footprint

What is connected

16 works

13 topics

4 close collaborators


preprint, 2026, arXiv

Towards Automated Air Traffic Safety Assessment Around Non-Towered Airports Using Large Language Models

We investigate frameworks for post-flight safety analysis at non-towered airports using large language models (LLMs). Non-towered airports rely on the Common Traffic Advisory Frequency (CTAF) for air traffic coordination and experience frequent near mid-air collisions due to the pilot self-announcement communication protocol. We propose a general vision-language model (VLM) approach to analyze the transcribed CTAF radio communications in natural language, METeorological Aerodrome Report (METAR) weather data, Automatic Dependent Surveillance-Broadcast (ADS-B) flight trajectories, and Visual Flight Rules sectional charts of the airfield. We provide a preliminary study at Half Moon Bay Airport, with a qualitative real-world case study and a quantitative evaluation using a new synthetic dataset of communications and weather modalities. We qualitatively evaluate our framework on real flight data using Gemini 2.5 Pro, demonstrating accurate identification of a right-of-way violation. The synthetic dataset is derived from real examples and includes a 12-category hazard taxonomy, and is used to benchmark three open-source (Qwen 2.5-7B, Mistral-7B, Gemma-2-9B) and three closed-source (GPT-4o, GPT-5.4, Claude Sonnet 4.6) LLMs on the subset of inputs related to CTAF and METAR. Even limited to CTAF and METAR inputs and open-source LLMs, instances of our framework typically achieve a macro F1 score above 0.85 on a binary nominal/danger classification task. Future work includes a quantitative evaluation across all modalities and a larger number of real-world examples. Taken together, our results suggest that VLM analysis of safety at non-towered airports may be a valuable future capability.
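For readers unfamiliar with the metric named in this abstract, the following is a minimal sketch of how a macro F1 score can be computed for a binary nominal/danger task. The labels and predictions below are made-up toy values, not data from the paper, and the function names are hypothetical.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, classes=("nominal", "danger")):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy example (illustrative values only).
y_true = ["danger", "danger", "nominal", "nominal"]
y_pred = ["danger", "nominal", "nominal", "nominal"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

Because macro averaging weights both classes equally, a classifier that misses rare "danger" cases is penalized even when "nominal" cases dominate the dataset.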

preprint, 2022, arXiv

A Hierarchical MPC Approach to Car-Following via Linearly Constrained Quadratic Programming

Single-lane car-following is a fundamental task in autonomous driving. A desirable car-following controller should keep a reasonable range of distances to the preceding vehicle and do so as smoothly as possible. To achieve this, numerous control methods have been proposed: some only rely on local sensing; others also make use of non-local downstream observations. While local methods are capable of attenuating high-frequency velocity oscillation and are economical to compute, non-local methods can dampen a wider spectrum of oscillatory traffic but incur a larger cost in computing. In this article, we design a novel non-local tri-layer MPC controller that is capable of smoothing a wide range of oscillatory traffic and is amenable to real-time applications. At the core of the controller design are 1) an accessible prediction method based on ETA estimation and 2) a robust, light-weight optimization procedure, designed specifically for handling various headway constraints. Numerical simulations suggest that the proposed controller can simultaneously maintain a variable headway while driving with modest acceleration and is robust to imperfect traffic predictions.

preprint, 2022, arXiv

A rigorous multi-population multi-lane hybrid traffic model and its mean-field limit for dissipation of waves via autonomous vehicles

In this paper, a multi-lane multi-population microscopic model, which exhibits stop-and-go waves, is proposed to simulate traffic on a ring-road. Vehicles are divided between human-driven and autonomous vehicles (AV). Control strategies are designed with the ultimate goal of using a small number of AVs (less than 5% penetration rate) to represent Lagrangian control actuators that can smooth the multilane traffic flow and dissipate the stop-and-go waves. This in turn may reduce fuel consumption and emissions. The lane-changing mechanism is based on three components that we treat as parameters in the model: safety, incentive and cool-down time. The choice of these parameters in the lane-change mechanism is critical to modeling traffic accurately, because different parameter values can lead to drastically different traffic behaviors. In particular, the number of lane-changes and the speed variance are highly affected by the choice of parameters. Despite this modeling issue, when using sufficiently simple and robust controllers for AVs, the stabilization of uniform flow steady-state is effective for any realistic value of the parameters, and ultimately bypasses the observed modeling issue. Our approach is based on accurate and rigorous mathematical models, which allow a limit procedure that is termed, in gas dynamic terminology, mean-field. In simple terms, by increasing the human-driven population to infinity, a system of coupled ordinary and partial differential equations is obtained. Moreover, control problems also pass to the limit, allowing the design to be tackled at different scales.

preprint, 2022, arXiv

Composing MPC with LQR and Neural Network for Amortized Efficiency and Stable Control

Model predictive control (MPC) is a powerful control method that handles dynamical systems with constraints. However, solving MPC iteratively in real time, i.e., implicit MPC, remains a computational challenge. To address this, common solutions include explicit MPC and function approximation. Both methods, whenever applicable, may improve the computational efficiency of the implicit MPC by several orders of magnitude. Nevertheless, explicit MPC often requires expensive pre-computation and does not easily apply to higher-dimensional problems. Meanwhile, function approximation, although it scales better with dimension, still requires pre-training on a large dataset and generally cannot be guaranteed to find an accurate surrogate policy, the failure of which often leads to closed-loop instability. To address these issues, we propose a triple-mode hybrid control scheme, named Memory-Augmented MPC, by combining a linear quadratic regulator, a neural network, and an MPC. From its standard form, we further derive two variants of this hybrid control scheme: one customized for chaotic systems and the other for slow systems. The proposed scheme does not require pre-computation and can improve the amortized running time of the composed MPC with a well-trained neural network. In addition, the scheme maintains closed-loop stability with any neural network of proper input and output dimensions, alleviating the need for certifying optimality of the neural network in safety-critical applications.

preprint, 2022, arXiv

Limitations and Improvements of the Intelligent Driver Model (IDM)

This contribution analyzes the widely used and well-known "intelligent driver model" (IDM for short), which is a second-order car-following model governed by a system of ordinary differential equations. Although this model was intensively studied in recent years for properly capturing traffic phenomena and driver braking behavior, a rigorous study of the well-posedness has, to our knowledge, never been performed. First it is shown that, for a specific class of initial data, the vehicles' velocities become negative or even diverge to $-\infty$ in finite time, both undesirable properties for a car-following model. Various modifications of the IDM are then proposed in order to avoid such ill-posedness. The theoretical remediation of the model, rather than post facto by ad-hoc modification of code implementations, allows a more sound numerical implementation and preservation of the model features. Indeed, to avoid inconsistencies and ensure dynamics close to those of the original model, one may need to inspect and clean large input data, which may result in practically impossible scenarios for large-scale simulations. Although well-posedness issues occur only for specific initial data, this may happen frequently when different traffic scenarios are analyzed, especially in the presence of lane-changing, on-ramps and other network components, as is the case for most commonly used micro-simulators. On the other hand, it is shown that well-posedness can be guaranteed by straightforward improvements, such as those obtained by slightly changing the acceleration to prevent the velocity from becoming negative.
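The negative-velocity behavior described in this abstract can be reproduced with a small numerical sketch. The code below uses the commonly cited form of the IDM acceleration with illustrative parameter values (all assumptions, not taken from the paper), a parked leader, and explicit Euler integration. The `clamp` option is a simple illustrative projection onto non-negative speeds; it is not claimed to be one of the modifications proposed by the authors.

```python
import math


def idm_accel(v, gap, dv, v0=30.0, T=1.5, a=1.0, b=1.5, s0=2.0, delta=4):
    """Commonly cited IDM acceleration; dv = v - v_leader is the approach rate.

    Parameter values are illustrative assumptions.
    """
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a * b))  # desired gap
    return a * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


def simulate(v_init, gap_init, steps=20, dt=0.1, clamp=False):
    """Follow a parked leader with explicit Euler steps; returns the speed history."""
    v, gap = v_init, gap_init
    history = []
    for _ in range(steps):
        acc = idm_accel(v, gap, dv=v)   # leader speed is zero, so dv = v
        v_next = v + dt * acc
        if clamp:
            v_next = max(v_next, 0.0)   # hypothetical fix: never reverse
        gap -= dt * v                   # gap shrinks while the follower advances
        v = v_next
        history.append(v)
    return history


# Follower at rest, closer to the leader than the minimum gap s0.
raw = simulate(v_init=0.0, gap_init=1.0)
fixed = simulate(v_init=0.0, gap_init=1.0, clamp=True)
print(raw[0], min(fixed))
```

Starting at rest with a gap smaller than `s0`, the unmodified model produces a negative acceleration and the vehicle begins to move backwards, while the clamped variant keeps the speed at zero.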

preprint, 2021, arXiv

Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design

A wide range of reinforcement learning (RL) problems - including robustness, transfer learning, unsupervised RL, and emergent complexity - require specifying a distribution of tasks or environments in which a policy will be trained. However, creating a useful distribution of environments is error prone, and takes a significant amount of developer time and effort. We propose Unsupervised Environment Design (UED) as an alternative paradigm, where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments. Existing approaches to automatically generating environments suffer from common failure modes: domain randomization cannot generate structure or adapt the difficulty of the environment to the agent's learning progress, and minimax adversarial training leads to worst-case environments that are often unsolvable. To generate structured, solvable environments for our protagonist agent, we introduce a second, antagonist agent that is allied with the environment-generating adversary. The adversary is motivated to generate environments which maximize regret, defined as the difference between the protagonist and antagonist agent's return. We call our technique Protagonist Antagonist Induced Regret Environment Design (PAIRED). Our experiments demonstrate that PAIRED produces a natural curriculum of increasingly complex environments, and PAIRED agents achieve higher zero-shot transfer performance when tested in highly novel environments.

preprint, 2021, arXiv

Learning Generalizable Multi-Lane Mixed-Autonomy Behaviors in Single Lane Representations of Traffic

Reinforcement learning techniques can provide substantial insights into the desired behaviors of future autonomous driving systems. By optimizing for societal metrics of traffic such as increased throughput and reduced energy consumption, such methods can derive maneuvers that, if adopted by even a small portion of vehicles, may significantly improve the state of traffic for all vehicles involved. These methods, however, are hindered in practice by the difficulty of designing efficient and accurate models of traffic, as well as the challenges associated with optimizing for the behaviors of dozens of interacting agents. In response to these challenges, this paper tackles the problem of learning generalizable traffic control strategies in simple representations of vehicle driving dynamics. In particular, we look to mixed-autonomy ring roads as depictions of instabilities that result in the formation of congestion. Within this problem, we design a curriculum learning paradigm that exploits the natural extendability of the network to effectively learn behaviors that reduce congestion over long horizons. Next, we study the implications of modeling lane changing on the transferability of policies. Our findings suggest that introducing lane change behaviors that even approximately match trends in more complex systems can significantly improve the generalizability of subsequent learned models to more accurate multi-lane models of traffic.

preprint, 2021, arXiv

Parallel Network Flow Allocation in Repeated Routing Games via LQR Optimal Control

In this article, we study the repeated routing game problem on a parallel network with affine latency functions on each edge. We cast the game setup in an LQR control theoretic framework, leveraging the Rosenthal potential formulation. We use control techniques to analyze the convergence of the game dynamics with specific cases that lend themselves to optimal control. We design proper dynamics parameters so that the conservation of flow is guaranteed. We provide an algorithmic solution for the general optimal control setup using a multiparametric quadratic programming approach (explicit MPC). Finally, we illustrate with numerics the impact of varying system parameters on the solutions.

preprint, 2021, arXiv

Reinforcement Learning versus PDE Backstepping and PI Control for Congested Freeway Traffic

We develop reinforcement learning (RL) boundary controllers to mitigate stop-and-go traffic congestion on a freeway segment. The traffic dynamics of the freeway segment are governed by a macroscopic Aw-Rascle-Zhang (ARZ) model, consisting of $2\times 2$ quasi-linear partial differential equations (PDEs) for traffic density and velocity. Boundary stabilization of the linearized ARZ PDE model has been solved by PDE backstepping, guaranteeing spatial $L^2$ norm regulation of the traffic state to uniform density and velocity and ensuring that traffic oscillations are suppressed. Collocated Proportional (P) and Proportional-Integral (PI) controllers also provide stability guarantees under certain restricted conditions, and are always applicable as model-free control options through gain tuning by trial and error, or by model-free optimization. Although these approaches are mathematically elegant, the stabilization result only holds locally and is usually affected by the change of model parameters. Therefore, we reformulate the PDE boundary control problem as an RL problem that pursues stabilization without knowing the system dynamics, simply by observing the state values. Proximal policy optimization, a neural network-based policy gradient algorithm, is employed to obtain RL controllers by interacting with a numerical simulator of the ARZ PDE. Being stabilization-inspired, the RL state-feedback boundary controllers are compared and evaluated against the rigorously stabilizing controllers in two cases: (i) in a system with perfect knowledge of the traffic flow dynamics, and then (ii) in one with only partial knowledge. We obtain RL controllers that nearly recover the performance of the backstepping, P, and PI controllers with perfect knowledge and outperform them in some cases with partial knowledge.

preprint, 2020, arXiv

On the Approximability of Time Disjoint Walks

We introduce the combinatorial optimization problem Time Disjoint Walks (TDW), which has applications in collision-free routing of discrete objects (e.g., autonomous vehicles) over a network. This problem takes as input a digraph $G$ with positive integer arc lengths, and $k$ pairs of vertices that each represent a trip demand from a source to a destination. The goal is to find a walk and delay for each demand so that no two trips occupy the same vertex at the same time, and so that a min-max or min-sum objective over the trip durations is realized. We focus here on the min-sum variant of Time Disjoint Walks, although most of our results carry over to the min-max case. We restrict our study to various subclasses of DAGs, and observe that there is a sharp complexity boundary between Time Disjoint Walks on oriented stars and on oriented stars with the central vertex replaced by a path. In particular, we present a poly-time algorithm for min-sum and min-max TDW on the former, but show that min-sum TDW on the latter is NP-hard. Our main hardness result is that for DAGs with max degree $Δ\leq3$, min-sum Time Disjoint Walks is APX-hard. We present a natural approximation algorithm for the same class, and provide a tight analysis. In particular, we prove that it achieves an approximation ratio of $Θ(k/\log k)$ on bounded-degree DAGs, and $Θ(k)$ on DAGs and bounded-degree digraphs.

preprint, 2016, arXiv

Minimizing Regret on Reflexive Banach Spaces and Learning Nash Equilibria in Continuous Zero-Sum Games

We study a general version of the adversarial online learning problem. We are given a decision set $\mathcal{X}$ in a reflexive Banach space $X$ and a sequence of reward vectors in the dual space of $X$. At each iteration, we choose an action from $\mathcal{X}$, based on the observed sequence of previous rewards. Our goal is to minimize regret, defined as the gap between the realized reward and the reward of the best fixed action in hindsight. Using results from infinite dimensional convex analysis, we generalize the method of Dual Averaging (or Follow the Regularized Leader) to our setting and obtain general upper bounds on the worst-case regret that subsume a wide range of results from the literature. Under the assumption of uniformly continuous rewards, we obtain explicit anytime regret bounds in a setting where the decision set is the set of probability distributions on a compact metric space $S$ whose Radon-Nikodym derivatives are elements of $L^p(S)$ for some $p > 1$. Importantly, we make no convexity assumptions on either the set $S$ or the reward functions. We also prove a general lower bound on the worst-case regret for any online algorithm. We then apply these results to the problem of learning in repeated continuous two-player zero-sum games, in which players' strategy sets are compact metric spaces. In doing so, we first prove that if both players play a Hannan-consistent strategy, then with probability 1 the empirical distributions of play weakly converge to the set of Nash equilibria of the game. We then show that, under mild assumptions, Dual Averaging on the (infinite-dimensional) space of probability distributions indeed achieves Hannan-consistency. Finally, we illustrate our results through numerical examples.
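As a rough illustration of the regret notion used in this abstract, the sketch below runs Dual Averaging with an entropic regularizer on a finite action set, which reduces to the well-known exponential weights (Hedge) update. This is only the finite-dimensional special case, not the infinite-dimensional setting treated in the paper; the reward sequence, step size, and function name are illustrative assumptions.

```python
import math


def hedge_regret(reward_rounds, eta):
    """Run exponential weights on a finite action set and return the regret.

    Regret is the reward of the best fixed action in hindsight minus the
    (expected) reward collected by the mixed strategies played.
    """
    n = len(reward_rounds[0])
    cumulative = [0.0] * n   # cumulative reward of each action so far
    realized = 0.0
    for rewards in reward_rounds:
        top = max(cumulative)  # shift for numerical stability
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(weights)
        strategy = [w / total for w in weights]
        realized += sum(p * r for p, r in zip(strategy, rewards))
        cumulative = [c + r for c, r in zip(cumulative, rewards)]
    return max(cumulative) - realized


# Illustrative reward sequence with values in [0, 1] over three actions.
rounds = [[1.0, 0.0, 0.5] if t % 3 else [0.0, 1.0, 0.5] for t in range(300)]
eta = math.sqrt(8 * math.log(3) / len(rounds))
regret = hedge_regret(rounds, eta)
print(regret)
```

With rewards in [0, 1] and this step size, the classical analysis of exponential weights bounds the regret by sqrt(T ln(n) / 2), so it grows sublinearly in the number of rounds T.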

preprint, 2014, arXiv

Anatomy of a Crash

Transportation networks constitute a critical infrastructure enabling the transfers of passengers and goods, with a significant impact on the economy at different scales. Transportation modes, whether air, road or rail, are coupled and interdependent. The frequent occurrence of perturbations on one or several modes disrupts passengers' entire journeys, directly and through ripple effects. The present paper provides a case report of the Asiana crash at San Francisco International Airport on July 6th 2013 and its repercussions on the multimodal transportation network. It studies the resulting propagation of disturbances on the transportation infrastructure in the United States. The perturbation takes different forms and varies in scale and time frame: cancellations and delays snowball in the airspace, highway traffic near the airport is impacted by congestion in previously never congested locations, and transit passenger demand exhibits unusual traffic peaks between airports in the Bay Area. This paper, through a case study, aims at stressing the importance of further data-driven research on interdependent infrastructure networks for increased resilience. The end goal is to form the basis for optimization models behind providing more reliable passenger door-to-door journeys.

preprint, 2014, arXiv

Building-in-Briefcase (BiB)

A building's environment has profound influence on occupant comfort and health. Continuous monitoring of building occupancy and environment is essential to fault detection, intelligent control, and building commissioning. Though many solutions for environmental measuring based on wireless sensor networks exist, they are not easily accessible to households and building owners who may lack the time or technical expertise needed to set up a system and get a quick and detailed overview of environmental conditions. Building-in-Briefcase (BiB) is a portable sensor network platform that is trivially easy to deploy in any building environment. Once the sensors are distributed, the environmental data is collected and communicated to the BiB router via TCP/IP protocol and WiFi technology, which then forwards the data to the central database securely over the internet through a 3G radio. The user, with minimal effort, can access the aggregated data and visualize the trends in real time on the BiB web portal. Paramount to the adoption and continued operation of an indoor sensing platform is battery lifetime. This design has achieved a multi-year lifespan by careful selection of components, an efficient binary communications protocol and data compression. Our BiB sensor is capable of collecting a rich set of environmental parameters, and is expandable to measure others, such as CO2. This paper describes the power characteristics of BiB sensors and their occupancy estimation and activity recognition functionality. Our vision is large-scale deployment of BiB in thousands of buildings, which would provide ample research opportunities and opportunities to identify ways to improve the building environment and energy efficiency.

preprint, 2014, arXiv

Computing the log-determinant of symmetric, diagonally dominant matrices in near-linear time

We present new algorithms for computing the log-determinant of symmetric, diagonally dominant matrices. Existing algorithms run with cubic complexity with respect to the size of the matrix in the worst case. Our algorithm computes an approximation of the log-determinant in time near-linear with respect to the number of non-zero entries and with high probability. This algorithm builds upon the ultra-sparsifiers introduced by Spielman and Teng for Laplacian matrices and ultimately uses their refined versions introduced by Koutis, Miller and Peng in the context of solving linear systems. We also present simpler algorithms that compute upper and lower bounds and that may be of more immediate practical interest.
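For context on the cubic-time baseline mentioned in this abstract, here is a minimal sketch of an exact log-determinant computed through a dense Cholesky factorization. This is the standard textbook approach, not the near-linear algorithm of the paper, and the example matrix is an illustrative assumption.

```python
import math


def cholesky_logdet(matrix):
    """Exact log-determinant of a symmetric positive definite matrix.

    Factor A = L L^T, then log det(A) = 2 * sum(log L[i][i]).
    Cost is cubic in the matrix size.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return 2.0 * sum(math.log(lower[i][i]) for i in range(n))


# Small symmetric, strictly diagonally dominant example (determinant is 8).
example = [
    [2.0, -1.0, 0.0],
    [-1.0, 3.0, -1.0],
    [0.0, -1.0, 2.0],
]
value = cholesky_logdet(example)
print(value)
```

A symmetric matrix with positive diagonal that is strictly diagonally dominant is positive definite, so the factorization succeeds on the example above.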

preprint, 2013, arXiv

Arriving on time: estimating travel time distributions on large-scale road networks

Most optimal routing problems focus on minimizing travel time or distance traveled. Oftentimes, a more useful objective is to maximize the probability of on-time arrival, which requires statistical distributions of travel times, rather than just mean values. We propose a method to estimate travel time distributions on large-scale road networks, using probe vehicle data collected from GPS. We present a framework that works with large input of data, and scales linearly with the size of the network. Leveraging the planar topology of the graph, the method efficiently computes the time correlations between neighboring streets. First, raw probe vehicle traces are compressed into pairs of travel times and number of stops for each traversed road segment using a 'stop-and-go' algorithm developed for this work. The compressed data is then used as input for training a path travel time model, which couples a Markov model along with a Gaussian Markov random field. Finally, scalable inference algorithms are developed for obtaining path travel time distributions from the composite MM-GMRF model. We illustrate the accuracy and scalability of our model on a 505,000 road link network spanning the San Francisco Bay Area.

preprint, 2012, arXiv

The path inference filter: model-based low-latency map matching of probe vehicle data

We consider the problem of reconstructing vehicle trajectories from sparse sequences of GPS points, for which the sampling interval is between 10 seconds and 2 minutes. We introduce a new class of algorithms, collectively called the path inference filter (PIF), that maps GPS data in real time, for a variety of trade-offs and scenarios, and with a high throughput. Numerous prior approaches in map-matching can be shown to be special cases of the path inference filter presented in this article. We present an efficient procedure for automatically training the filter on new data, with or without ground truth observations. The framework is evaluated on a large San Francisco taxi dataset and is shown to improve upon the current state of the art. This filter also provides insights about driving patterns of drivers. The path inference filter has been deployed at an industrial scale inside the Mobile Millennium traffic information system, and is used to map fleets of data in San Francisco, Sacramento, Stockholm and Porto.
