Source author record

Alexandre M. Bayen

Alexandre M. Bayen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning math.OC Robotics Systems and Control Artificial Intelligence Computer Science and Game Theory eess.SY Multiagent Systems Cryptography and Security cs.CY Distributed, Parallel, and Cluster Computing Human-Computer Interaction math.NA Methodology Numerical Analysis physics.data-an physics.soc-ph Software Engineering

Catalog footprint

What is connected

14works

18topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Supervised and Unsupervised Neural Network Solver for First Order Hyperbolic Nonlinear PDEs

We present a neural network-based method for learning scalar hyperbolic conservation laws. Our method replaces the traditional numerical flux in finite volume schemes with a trainable neural network while preserving the conservative structure of the scheme. The model can be trained both in a supervised setting with efficiently generated synthetic data or in an unsupervised manner, leveraging the weak formulation of the partial differential equation. We provide theoretical results that our model can perform arbitrarily well, and provide associated upper bounds on neural network size. Extensive experiments demonstrate that our method often outperforms efficient schemes such as Godunov's scheme, WENO, and Discontinuous Galerkin for comparable computational budgets. Finally, we demonstrate the effectiveness of our method on a traffic prediction task, leveraging field experimental highway data from the Berkeley DeepDrive drone dataset.

preprint2022arXiv

Adaptive Coordination Offsets for Signalized Arterial Intersections using Deep Reinforcement Learning

Coordinating intersections in arterial networks is critical to the performance of urban transportation systems. Deep reinforcement learning (RL) has gained traction in traffic control research along with data-driven approaches for traffic control systems. To date, proposed deep RL-based traffic schemes control phase activation or duration. Yet, such approaches may bypass low volume links for several cycles in order to optimize the network-level traffic flow. Here, we propose a deep RL framework that dynamically adjusts offsets based on traffic states and preserves the planned phase timings and order derived from model-based methods. This framework allows us to improve arterial coordination while maintaining phase order and timing predictability. Using a validated and calibrated traffic model, we trained the policy of a deep RL agent that aims to reduce travel delays in the network. We evaluated the resulting policy by comparing its performance against the phase offsets deployed along a segment of Huntington Drive in the city of Arcadia. The resulting policy dynamically readjusts phase offsets in response to changes in traffic demand. Simulation results show that the proposed deep RL agent outperformed the baseline on average, effectively reducing delay time by 13.21% in the AM Scenario, 2.42% in the Noon scenario, and 6.2% in the PM scenario when offsets are adjusted in 15-minute intervals. Finally, we also show the robustness of our agent to extreme traffic conditions, such as demand surges in off-peak hours and localized traffic incidents

preprint2022arXiv

Learning energy-efficient driving behaviors by imitating experts

The rise of vehicle automation has generated significant interest in the potential role of future automated vehicles (AVs). In particular, in highly dense traffic settings, AVs are expected to serve as congestion-dampeners, mitigating the presence of instabilities that arise from various sources. However, in many applications, such maneuvers rely heavily on non-local sensing or coordination by interacting AVs, thereby rendering their adaptation to real-world settings a particularly difficult challenge. To address this challenge, this paper examines the role of imitation learning in bridging the gap between such control strategies and realistic limitations in communication and sensing. Treating one such controller as an "expert", we demonstrate that imitation learning can succeed in deriving policies that, if adopted by 5% of vehicles, may boost the energy-efficiency of networks with varying traffic conditions by 15% using only local observations. Results and code are available online at https://sites.google.com/view/il-traffic/home.

preprint2022arXiv

Unified Automatic Control of Vehicular Systems with Reinforcement Learning

Emerging vehicular systems with increasing proportions of automated components present opportunities for optimal control to mitigate congestion and increase efficiency. There has been a recent interest in applying deep reinforcement learning (DRL) to these nonlinear dynamical systems for the automatic design of effective control strategies. Despite conceptual advantages of DRL being model-free, studies typically nonetheless rely on training setups that are painstakingly specialized to specific vehicular systems. This is a key challenge to efficient analysis of diverse vehicular and mobility systems. To this end, this article contributes a streamlined methodology for vehicular microsimulation and discovers high performance control strategies with minimal manual design. A variable-agent, multi-task approach is presented for optimization of vehicular Partially Observed Markov Decision Processes. The methodology is experimentally validated on mixed autonomy traffic systems, where fractions of vehicles are automated; empirical improvement, typically 15-60% over a human driving baseline, is observed in all configurations of six diverse open or closed traffic systems. The study reveals numerous emergent behaviors resembling wave mitigation, traffic signaling, and ramp metering. Finally, the emergent behaviors are analyzed to produce interpretable control strategies, which are validated against the learned control strategies.

preprint2020arXiv

BISTRO: Berkeley Integrated System for Transportation Optimization

This article introduces BISTRO, a new open source transportation planning decision support system that uses an agent-based simulation and optimization approach to anticipate and develop adaptive plans for possible technological disruptions and growth scenarios. The new framework was evaluated in the context of a machine learning competition hosted within Uber Technologies, Inc., in which over 400 engineers and data scientists participated. For the purposes of this competition, a benchmark model, based on the city of Sioux Falls, South Dakota, was adapted to the BISTRO framework. An important finding of this study was that in spite of rigorous analysis and testing done prior to the competition, the two top-scoring teams discovered an unbounded region of the search space, rendering the solutions largely uninterpretable for the purposes of decision-support. On the other hand, a follow-on study aimed to fix the objective function, served to demonstrate BISTRO's utility as a human-in-the-loop cyberphysical system: one that uses scenario-based optimization algorithms as a feedback mechanism to assist urban planners with iteratively refining objective function and constraints specification on intervention strategies such that the portfolio of transportation intervention strategy alternatives eventually chosen achieves high-level regional planning goals developed through participatory stakeholder engagement practices.

preprint2016arXiv

Differential Privacy of Populations in Routing Games

As our ground transportation infrastructure modernizes, the large amount of data being measured, transmitted, and stored motivates an analysis of the privacy aspect of these emerging cyber-physical technologies. In this paper, we consider privacy in the routing game, where the origins and destinations of drivers are considered private. This is motivated by the fact that this spatiotemporal information can easily be used as the basis for inferences for a person's activities. More specifically, we consider the differential privacy of the mapping from the amount of flow for each origin-destination pair to the traffic flow measurements on each link of a traffic network. We use a stochastic online learning framework for the population dynamics, which is known to converge to the Nash equilibrium of the routing game. We analyze the sensitivity of this process and provide theoretical guarantees on the convergence rates as well as differential privacy values for these models. We confirm these with simulations on a small example.

preprint2016arXiv

Scalable Linear Causal Inference for Irregularly Sampled Time Series with Long Range Dependencies

Linear causal analysis is central to a wide range of important application spanning finance, the physical sciences, and engineering. Much of the existing literature in linear causal analysis operates in the time domain. Unfortunately, the direct application of time domain linear causal analysis to many real-world time series presents three critical challenges: irregular temporal sampling, long range dependencies, and scale. Moreover, real-world data is often collected at irregular time intervals across vast arrays of decentralized sensors and with long range dependencies which make naive time domain correlation estimators spurious. In this paper we present a frequency domain based estimation framework which naturally handles irregularly sampled data and long range dependencies while enabled memory and communication efficient distributed processing of time series data. By operating in the frequency domain we eliminate the need to interpolate and help mitigate the effects of long range dependencies. We implement and evaluate our new work-flow in the distributed setting using Apache Spark and demonstrate on both Monte Carlo simulations and high-frequency financial trading that we can accurately recover causal structure at scale.

preprint2015arXiv

Embarrassingly Parallel Time Series Analysis for Large Scale Weak Memory Systems

Second order stationary models in time series analysis are based on the analysis of essential statistics whose computations follow a common pattern. In particular, with a map-reduce nomenclature, most of these operations can be modeled as mapping a kernel that only depends on short windows of consecutive data and reducing the results produced by each computation. This computational pattern stems from the ergodicity of the model under consideration and is often referred to as weak or short memory when it comes to data indexed with respect to time. In the following we will show how studying weak memory systems can be done in a scalable manner thanks to a framework relying on specifically designed overlapping distributed data structures that enable fragmentation and replication of the data across many machines as well as parallelism in computations. This scheme has been implemented for Apache Spark but is certainly not system specific. Indeed we prove it is also adapted to leveraging high bandwidth fragmented memory blocks on GPUs.

preprint2014arXiv

A Necessary and Sufficient Condition for the Existence of Potential Functions for Heterogeneous Routing Games

We study a heterogeneous routing game in which vehicles might belong to more than one type. The type determines the cost of traveling along an edge as a function of the flow of various types of vehicles over that edge. We relax the assumptions needed for the existence of a Nash equilibrium in this heterogeneous routing game. We extend the available results to present necessary and sufficient conditions for the existence of a potential function. We characterize a set of tolls that guarantee the existence of a potential function when only two types of users are participating in the game. We present an upper bound for the price of anarchy (i.e., the worst-case ratio of the social cost calculated for a Nash equilibrium over the social cost for a socially optimal flow) for the case in which only two types of players are participating in a game with affine edge cost functions. A heterogeneous routing game with vehicle platooning incentives is used as an example throughout the article to clarify the concepts and to validate the results.

preprint2014arXiv

Environmental Sensing by Wearable Device for Indoor Activity and Location Estimation

We present results from a set of experiments in this pilot study to investigate the causal influence of user activity on various environmental parameters monitored by occupant carried multi-purpose sensors. Hypotheses with respect to each type of measurements are verified, including temperature, humidity, and light level collected during eight typical activities: sitting in lab / cubicle, indoor walking / running, resting after physical activity, climbing stairs, taking elevators, and outdoor walking. Our main contribution is the development of features for activity and location recognition based on environmental measurements, which exploit location- and activity-specific characteristics and capture the trends resulted from the underlying physiological process. The features are statistically shown to have good separability and are also information-rich. Fusing environmental sensing together with acceleration is shown to achieve classification accuracy as high as 99.13%. For building applications, this study motivates a sensor fusion paradigm for learning individualized activity, location, and environmental preferences for energy management and user comfort.

preprint2014arXiv

Learning Nash Equilibria in Congestion Games

We study the repeated congestion game, in which multiple populations of players share resources, and make, at each iteration, a decentralized decision on which resources to utilize. We investigate the following question: given a model of how individual players update their strategies, does the resulting dynamics of strategy profiles converge to the set of Nash equilibria of the one-shot game? We consider in particular a model in which players update their strategies using algorithms with sublinear discounted regret. We show that the resulting sequence of strategy profiles converges to the set of Nash equilibria in the sense of Cesàro means. However, strong convergence is not guaranteed in general. We show that strong convergence can be guaranteed for a class of algorithms with a vanishing upper bound on discounted regret, and which satisfy an additional condition. We call such algorithms AREP algorithms, for Approximate REPlicator, as they can be interpreted as a discrete-time approximation of the replicator equation, which models the continuous-time evolution of population strategies, and which is known to converge for the class of congestion games. In particular, we show that the discounted Hedge algorithm belongs to the AREP class, which guarantees its strong convergence.

preprint2014arXiv

Modeling and Estimation of the Humans' Effect on the CO2 Dynamics Inside a Conference Room

We develop a data-driven, {\em Partial Differential Equation-Ordinary Differential Equation} (PDE-ODE) model that describes the response of the {\em Carbon Dioxide} (\cotwon) dynamics inside a conference room, due to the presence of humans, or of a user-controlled exogenous source of \cotwon. We conduct two controlled experiments in order to develop and tune a model whose output matches the measured output concentration of \cotwo inside the room, when known inputs are applied to the model. In the first experiment, a controlled amount of \cotwo gas is released inside the room from a regulated supply, and in the second, a known number of humans produce a certain amount of \cotwo inside the room. For the estimation of the exogenous inputs, we design an observer, based on our model, using measurements of \cotwo concentrations at two locations inside the room. Parameter identifiers are also designed, based on our model, for the online estimation of the parameters of the model. We perform several simulation studies for the illustration of our designs.

preprint2012arXiv

Large Scale Estimation in Cyberphysical Systems using Streaming Data: a Case Study with Smartphone Traces

Controlling and analyzing cyberphysical and robotics systems is increasingly becoming a Big Data challenge. Pushing this data to, and processing in the cloud is more efficient than on-board processing. However, current cloud-based solutions are not suitable for the latency requirements of these applications. We present a new concept, Discretized Streams or D-Streams, that enables massively scalable computations on streaming data with latencies as short as a second. We experiment with an implementation of D-Streams on top of the Spark computing framework. We demonstrate the usefulness of this concept with a novel algorithm to estimate vehicular traffic in urban networks. Our online EM algorithm can estimate traffic on a very large city network (the San Francisco Bay Area) by processing tens of thousands of observations per second, with a latency of a few seconds.

preprint2012arXiv

Understanding Road Usage Patterns in Urban Areas

In this paper, we combine the most complete record of daily mobility, based on large-scale mobile phone data, with detailed Geographic Information System (GIS) data, uncovering previously hidden patterns in urban road usage. We find that the major usage of each road segment can be traced to its own - surprisingly few - driver sources. Based on this finding we propose a network of road usage by defining a bipartite network framework, demonstrating that in contrast to traditional approaches, which define road importance solely by topological measures, the role of a road segment depends on both: its betweeness and its degree in the road usage network. Moreover, our ability to pinpoint the few driver sources contributing to the major traffic flow allows us to create a strategy that achieves a significant reduction of the travel time across the entire road system, compared to a benchmark approach.

Alexandre M. Bayen

What is connected

Connect this record

See the researcher in context

Building this map preview

14 published item(s)

Supervised and Unsupervised Neural Network Solver for First Order Hyperbolic Nonlinear PDEs

Adaptive Coordination Offsets for Signalized Arterial Intersections using Deep Reinforcement Learning

Learning energy-efficient driving behaviors by imitating experts

Unified Automatic Control of Vehicular Systems with Reinforcement Learning

BISTRO: Berkeley Integrated System for Transportation Optimization

Differential Privacy of Populations in Routing Games

Scalable Linear Causal Inference for Irregularly Sampled Time Series with Long Range Dependencies

Embarrassingly Parallel Time Series Analysis for Large Scale Weak Memory Systems

A Necessary and Sufficient Condition for the Existence of Potential Functions for Heterogeneous Routing Games

Environmental Sensing by Wearable Device for Indoor Activity and Location Estimation

Learning Nash Equilibria in Congestion Games

Modeling and Estimation of the Humans' Effect on the CO2 Dynamics Inside a Conference Room

Large Scale Estimation in Cyberphysical Systems using Streaming Data: a Case Study with Smartphone Traces

Understanding Road Usage Patterns in Urban Areas