Researcher profile

Mahnoosh Alizadeh

Mahnoosh Alizadeh contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
14works
0followers
11topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

14 published item(s)

preprint2022arXiv

An Online Scheduling Algorithm for a Community Energy Storage System

In this paper, we consider a community energy storage (CES) system that is shared by various electricity consumers who want to charge and discharge the CES throughout a given time span. We study the problem facing the manager of such a CES who must schedule the charging, discharging, and capacity reservations for numerous users. Moreover, we consider the case where requests to charge/discharge the CES arrive in an online fashion and the CES manager must immediately allocate charging power and energy capacity to fulfill the request or reject the request altogether. The objective of the CES manager is to maximize the total value gained by all of the users of the CES while accounting for the operational constraints of the CES. We discuss an algorithm titled \textsc{CommunityEnergyScheduling} that acts as a pricing mechanism based on online primal-dual optimization as a solution to the CES manager's problem. The online algorithm estimates the dual variables (prices) in real-time to allow for requests to be allocated or rejected immediately as they arrive. Furthermore, the proposed method promotes charging and discharging cancellations to reduce the CES's usage at popular times and is able to handle the inherent stochastic nature of the requests to charge/discharge stemming from randomness in users' net load patterns and weather uncertainties. Additionally, we are able to show that the algorithm is able to handle any adversarially chosen request sequence and will always yield total welfare within a factor of 1/a of the offline optimal welfare.

preprint2022arXiv

Collaborative Multi-agent Stochastic Linear Bandits

We study a collaborative multi-agent stochastic linear bandit setting, where $N$ agents that form a network communicate locally to minimize their overall regret. In this setting, each agent has its own linear bandit problem (its own reward parameter) and the goal is to select the best global action w.r.t. the average of their reward parameters. At each round, each agent proposes an action, and one action is randomly selected and played as the network action. All the agents observe the corresponding rewards of the played actions and use an accelerated consensus procedure to compute an estimate of the average of the rewards obtained by all the agents. We propose a distributed upper confidence bound (UCB) algorithm and prove a high probability bound on its $T$-round regret in which we include a linear growth of regret associated with each communication round. Our regret bound is of order $\mathcal{O}\Big(\sqrt{\frac{T}{N \log(1/|λ_2|)}}\cdot (\log T)^2\Big)$, where $λ_2$ is the second largest (in absolute value) eigenvalue of the communication matrix.

preprint2022arXiv

Feature and Parameter Selection in Stochastic Linear Bandits

We study two model selection settings in stochastic linear bandits (LB). In the first setting, which we refer to as feature selection, the expected reward of the LB problem is in the linear span of at least one of $M$ feature maps (models). In the second setting, the reward parameter of the LB problem is arbitrarily selected from $M$ models represented as (possibly) overlapping balls in $\mathbb R^d$. However, the agent only has access to misspecified models, i.e.,~estimates of the centers and radii of the balls. We refer to this setting as parameter selection. For each setting, we develop and analyze a computationally efficient algorithm that is based on a reduction from bandits to full-information problems. This allows us to obtain regret bounds that are not worse (up to a $\sqrt{\log M}$ factor) than the case where the true model is known. This is the best-reported dependence on the number of models $M$ in these settings. Finally, we empirically show the effectiveness of our algorithms using synthetic and real-world experiments.

preprint2022arXiv

Multi-Environment Meta-Learning in Stochastic Linear Bandits

In this work we investigate meta-learning (or learning-to-learn) approaches in multi-task linear stochastic bandit problems that can originate from multiple environments. Inspired by the work of [1] on meta-learning in a sequence of linear bandit problems whose parameters are sampled from a single distribution (i.e., a single environment), here we consider the feasibility of meta-learning when task parameters are drawn from a mixture distribution instead. For this problem, we propose a regularized version of the OFUL algorithm that, when trained on tasks with labeled environments, achieves low regret on a new task without requiring knowledge of the environment from which the new task originates. Specifically, our regret bound for the new algorithm captures the effect of environment misclassification and highlights the benefits over learning each task separately or meta-learning without recognition of the distinct mixture components.

preprint2022arXiv

Real-Time Electric Vehicle Smart Charging at Workplaces: A Real-World Case Study

We study a real-time smart charging algorithm for electric vehicles (EVs) at a workplace parking lot in order to minimize electricity cost from time-of-use electricity rates and demand charges while ensuring that the owners of the EVs receive adequate levels of charge. Notably, due to real-world constraints, our algorithm is agnostic to both the state-of-charge and the departure time of the EVs and uses scenario generation to account for each EV's unknown future departure time as well as certainty equivalent control to account for the unknown EV arrivals in the future. Real-world charging data from a Google campus in California allows us to build realistic models of charging demand for each day of the week. We then compare various results from our smart charging algorithm to the status quo for a two week period at a Google parking location.

preprint2022arXiv

Robust Distributed Optimization With Randomly Corrupted Gradients

In this paper, we propose a first-order distributed optimization algorithm that is provably robust to Byzantine failures-arbitrary and potentially adversarial behavior, where all the participating agents are prone to failure. We model each agent's state over time as a two-state Markov chain that indicates Byzantine or trustworthy behaviors at different time instants. We set no restrictions on the maximum number of Byzantine agents at any given time. We design our method based on three layers of defense: 1) temporal robust aggregation, 2) spatial robust aggregation, and 3) gradient normalization. We study two settings for stochastic optimization, namely Sample Average Approximation and Stochastic Approximation. We provide convergence guarantees of our method for strongly convex and smooth non-convex cost functions.

preprint2022arXiv

Safe Dual Gradient Method for Network Utility Maximization Problems

In this paper, we introduce a novel first-order dual gradient algorithm for solving network utility maximization problems that arise in resource allocation schemes over networks with safety-critical constraints. Inspired by applications where customers' demand can only be affected through posted prices and real-time two-way communication with customers is not available, we require an algorithm to generate \textit{safe prices}. This means that at no iteration should the realized demand in response to the posted prices violate the safety constraints of the network. Thus, in contrast to existing first-order methods, our algorithm, called the safe dual gradient method (SDGM), is guaranteed to produce feasible primal iterates at all iterations. We ensure primal feasibility by 1) adding a diminishing safety margin to the constraints, and 2) using a sign-based dual update method with different step sizes for plus and minus directions. In addition, we prove that the primal iterates produced by the SDGM achieve a sublinear static regret of ${\cal O}(\sqrt{T})$.

preprint2022arXiv

Strategic investments in multi-stage General Lotto games

In adversarial interactions, one is often required to make strategic decisions over multiple periods of time, wherein decisions made earlier impact a player's competitive standing as well as how choices are made in later stages. In this paper, we study such scenarios in the context of General Lotto games, which models the competitive allocation of resources over multiple battlefields between two players. We propose a two-stage formulation where one of the players has reserved resources that can be strategically pre-allocated across the battlefields in the first stage. The pre-allocation then becomes binding and is revealed to the other player. In the second stage, the players engage by simultaneously allocating their real-time resources against each other. The main contribution in this paper provides complete characterizations of equilibrium payoffs in the two-stage game, revealing the interplay between performance and the amount of resources expended in each stage of the game. We find that real-time resources are at least twice as effective as pre-allocated resources. We then determine the player's optimal investment when there are linear costs associated with purchasing each type of resource before play begins, and there is a limited monetary budget.

preprint2020arXiv

Constrained Thompson Sampling for Real-Time Electricity Pricing with Grid Reliability Constraints

We consider the problem of an aggregator attempting to learn customers' load flexibility models while implementing a load shaping program by means of broadcasting daily dispatch signals. We adopt a multi-armed bandit formulation to account for the stochastic and unknown nature of customers' responses to dispatch signals. We propose a constrained Thompson sampling heuristic, Con-TS-RTP, that accounts for various possible aggregator objectives (e.g., to reduce demand at peak hours, integrate more intermittent renewable generation, track a desired daily load profile, etc) and takes into account the operational constraints of a distribution system to avoid potential grid failures as a result of uncertainty in the customers' response. We provide a discussion on the regret bounds for our algorithm as well as a discussion on the operational reliability of the distribution system's constraints being upheld throughout the learning process.

preprint2020arXiv

Mobility-Aware Electric Vehicle Fast Charging Load Models with Geographical Price Variations

We study the traffic patterns as well as the charging patterns of a population of cost-minimizing EV owners traveling and charging within a transportation network equipped with fast charging stations (FCSs). Specifically, we study how the charging network operator (CNO) can influence where EV users charge in order to optimize the utilization of fast charging stations. These charging decisions of private EV owners affect aggregate congestion at stations (i.e., waiting time) as well as the aggregate EV charging load across the network. In this work, we capture the resulting equilibrium wait times and electricity load through a so-called \textit{traffic and charge assignment problem} (TCAP) in a fast charging station network. Our formulation allows us to: 1) Study the expected station wait times as well as the probability distribution of aggregate charging load of EVs in a FCS network in a mobility-aware fashion (an aspect unique to our work), while accounting for heterogeneities in users' travel patterns, energy demands, and geographically variant electricity prices. 2) Analytically characterize the special threshold-based structure that determines how EV owners choose where to charge their vehicle at equilibrium, in response to the FCS's charging price structure, users' energy demands, and users' mobility patterns. 3) Provide a convex optimization problem formulation to identify the network's unique equilibrium. Furthermore, we illustrate how to induce a socially optimal charging behavior by deriving the socially optimal plug-in fees and electricity prices at the charging stations.

preprint2020arXiv

Regret Bounds for Safe Gaussian Process Bandit Optimization

Many applications require a learner to make sequential decisions given uncertainty regarding both the system's payoff function and safety constraints. In safety-critical systems, it is paramount that the learner's actions do not violate the safety constraints at any stage of the learning process. In this paper, we study a stochastic bandit optimization problem where the unknown payoff and constraint functions are sampled from Gaussian Processes (GPs) first considered in [Srinivas et al., 2010]. We develop a safe variant of GP-UCB called SGP-UCB, with necessary modifications to respect safety constraints at every round. The algorithm has two distinct phases. The first phase seeks to estimate the set of safe actions in the decision set, while the second phase follows the GP-UCB decision rule. Our main contribution is to derive the first sub-linear regret bounds for this problem. We numerically compare SGP-UCB against existing safe Bayesian GP optimization algorithms.

preprint2020arXiv

Safe Linear Thompson Sampling with Side Information

The design and performance analysis of bandit algorithms in the presence of stage-wise safety or reliability constraints has recently garnered significant interest. In this work, we consider the linear stochastic bandit problem under additional \textit{linear safety constraints} that need to be satisfied at each round. We provide a new safe algorithm based on linear Thompson Sampling (TS) for this problem and show a frequentist regret of order $\mathcal{O} (d^{3/2}\log^{1/2}d \cdot T^{1/2}\log^{3/2}T)$, which remarkably matches the results provided by (Abeille et al., 2017) for the standard linear TS algorithm in the absence of safety constraints. We compare the performance of our algorithm with UCB-based safe algorithms and highlight how the inherently randomized nature of TS leads to a superior performance in expanding the set of safe actions the algorithm has access to at each round.

preprint2013arXiv

Real-Time Power Balancing via Decentralized Coordinated Home Energy Scheduling

It is anticipated that an uncoordinated operation of individual home energy management (HEM) systems in a neighborhood would have a rebound effect on the aggregate demand profile. To address this issue, this paper proposes a coordinated home energy management (CoHEM) architecture in which distributed HEM units collaborate with each other in order to keep the demand and supply balanced in their neighborhood. Assuming the energy requests by customers are random in time, we formulate the proposed CoHEM design as a multi-stage stochastic optimization problem. We propose novel models to describe the deferrable appliance load (e.g., Plug-in (Hybrid) Electric Vehicles (PHEV)), and apply approximation and decomposition techniques to handle the considered design problem in a decentralized fashion. The developed decentralized CoHEM algorithm allow the customers to locally compute their scheduling solutions using domestic user information and with message exchange between their neighbors only. Extensive simulation results demonstrate that the proposed CoHEM architecture can effectively improve real-time power balancing. Extensions to joint power procurement and real-time CoHEM scheduling are also presented.

preprint2012arXiv

Coordinated Home Energy Management for Real-Time Power Balancing

This paper proposes a coordinated home energy management system (HEMS) architecture where the distributed residential units cooperate with each other to achieve real-time power balancing. The economic benefits for the retailer and incentives for the customers to participate in the proposed coordinated HEMS program are given. We formulate the coordinated HEMS design problem as a dynamic programming (DP) and use approximate DP approaches to efficiently handle the design problem. A distributed implementation algorithm based on the convex optimization based dual decomposition technique is also presented. Our focus in the current paper is on the deferrable appliances, such as Plug-in (Hybrid) Electric Vehicles (PHEV), in view of their higher impact on the grid stability. Simulation results shows that the proposed coordinated HEMS architecture can efficiently improve the real-time power balancing.