Researcher profile

Esmaeil Seraj

Esmaeil Seraj contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
11topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2022arXiv

Iterated Reasoning with Mutual Information in Cooperative and Byzantine Decentralized Teaming

Information sharing is key in building team cognition and enables coordination and cooperation. High-performing human teams also benefit from acting strategically with hierarchical levels of iterated communication and rationalizability, meaning a human agent can reason about the actions of their teammates in their decision-making. Yet, the majority of prior work in Multi-Agent Reinforcement Learning (MARL) does not support iterated rationalizability and only encourage inter-agent communication, resulting in a suboptimal equilibrium cooperation strategy. In this work, we show that reformulating an agent's policy to be conditional on the policies of its neighboring teammates inherently maximizes Mutual Information (MI) lower-bound when optimizing under Policy Gradient (PG). Building on the idea of decision-making under bounded rationality and cognitive hierarchy theory, we show that our modified PG approach not only maximizes local agent rewards but also implicitly reasons about MI between agents without the need for any explicit ad-hoc regularization terms. Our approach, InfoPG, outperforms baselines in learning emergent collaborative behaviors and sets the state-of-the-art in decentralized cooperative MARL tasks. Our experiments validate the utility of InfoPG by achieving higher sample efficiency and significantly larger cumulative reward in several complex cooperative multi-agent domains.

preprint2022arXiv

Multi-UAV Planning for Cooperative Wildfire Coverage and Tracking with Quality-of-Service Guarantees

In recent years, teams of robot and Unmanned Aerial Vehicles (UAVs) have been commissioned by researchers to enable accurate, online wildfire coverage and tracking. While the majority of prior work focuses on the coordination and control of such multi-robot systems, to date, these UAV teams have not been given the ability to reason about a fire's track (i.e., location and propagation dynamics) to provide performance guarantee over a time horizon. Motivated by the problem of aerial wildfire monitoring, we propose a predictive framework which enables cooperation in multi-UAV teams towards collaborative field coverage and fire tracking with probabilistic performance guarantee. Our approach enables UAVs to infer the latent fire propagation dynamics for time-extended coordination in safety-critical conditions. We derive a set of novel, analytical temporal, and tracking-error bounds to enable the UAV-team to distribute their limited resources and cover the entire fire area according to the case-specific estimated states and provide a probabilistic performance guarantee. Our results are not limited to the aerial wildfire monitoring case-study and are generally applicable to problems, such as search-and-rescue, target tracking and border patrol. We evaluate our approach in simulation and provide demonstrations of the proposed framework on a physical multi-robot testbed to account for real robot dynamics and restrictions. Our quantitative evaluations validate the performance of our method accumulating 7.5x and 9.0x smaller tracking-error than state-of-the-art model-based and reinforcement learning benchmarks, respectively.

preprint2022arXiv

Utilizing Human Feedback for Primitive Optimization in Wheelchair Tennis

Agile robotics presents a difficult challenge with robots moving at high speeds requiring precise and low-latency sensing and control. Creating agile motion that accomplishes the task at hand while being safe to execute is a key requirement for agile robots to gain human trust. This requires designing new approaches that are flexible and maintain knowledge over world constraints. In this paper, we consider the problem of building a flexible and adaptive controller for a challenging agile mobile manipulation task of hitting ground strokes on a wheelchair tennis robot. We propose and evaluate an extension to work done on learning striking behaviors using a probabilistic movement primitive (ProMP) framework by (1) demonstrating the safe execution of learned primitives on an agile mobile manipulator setup, and (2) proposing an online primitive refinement procedure that utilizes evaluative feedback from humans on the executed trajectories.

preprint2020arXiv

Coordinated Control of UAVs for Human-Centered Active Sensing of Wildfires

Fighting wildfires is a precarious task, imperiling the lives of engaging firefighters and those who reside in the fire's path. Firefighters need online and dynamic observation of the firefront to anticipate a wildfire's unknown characteristics, such as size, scale, and propagation velocity, and to plan accordingly. In this paper, we propose a distributed control framework to coordinate a team of unmanned aerial vehicles (UAVs) for a human-centered active sensing of wildfires. We develop a dual-criterion objective function based on Kalman uncertainty residual propagation and weighted multi-agent consensus protocol, which enables the UAVs to actively infer the wildfire dynamics and parameters, track and monitor the fire transition, and safely manage human firefighters on the ground using acquired information. We evaluate our approach relative to prior work, showing significant improvements by reducing the environment's cumulative uncertainty residual by more than $ 10^2 $ and $ 10^5 $ times in firefront coverage performance to support human-robot teaming for firefighting. We also demonstrate our method on physical robots in a mock firefighting exercise.

preprint2020arXiv

Essential Motor Cortex Signal Processing: an ERP and functional connectivity MATLAB toolbox -- user guide version 2.0

The purpose of this document is to help individuals use the "Essential Motor Cortex Signal Processing MATLAB Toolbox". The toolbox implements various methods for three major aspects of investigating human motor cortex from Neuroscience view point: (1) ERP estimation and quantification, (2) Cortical Functional Connectivity analysis and (3) EMG quantification. The toolbox -- which is distributed under the terms of the GNU GENERAL PUBLIC LICENSE as a set of MATLAB R routines -- can be downloaded directly at the address: http://oset.ir/category.php?dir=Tools or from the public repository on GitHub, at address below: https://github.com/EsiSeraj/ERP Connectivity EMG Analysis The purpose of this toolbox is threefold: 1. Extract the event-related-potential (ERP) from preprocessed cerebral signals (i.e. EEG, MEG, etc.), identify and then quantify the event-related synchronization/desynchronization (ERS/ERD) events. Both time-course dynamics and time-frequency (TF) analyzes are included. 2. Measure, quantify and demonstrate the cortical functional connectivity (CFC) across scalp electrodes. These set of functions can also be applied to various types of cerebral signals (i.e. electric and magnetic). 3. Quantify electromyogram (EMG) recorded from active muscles during performing motor tasks.