Source author record

Pratap Tokekar

Pratap Tokekar appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Robotics, Artificial Intelligence, Machine Learning, Computation and Language, Computer Vision, Multiagent Systems, math.OC, Systems and Control, Discrete Mathematics, eess.SY

Catalog footprint

What is connected

26 works

10 topics

4 close collaborators

preprint 2026 arXiv

DeltaRubric: Generative Multimodal Reward Modeling via Joint Planning and Verification

Aligning Multimodal Large Language Models (MLLMs) requires reliable reward models, yet existing single-step evaluators can suffer from lazy judging, exploiting language priors over fine-grained visual verification. While rubric-based evaluation mitigates these biases in text-only settings, extending it to multimodal tasks is bottlenecked by the complexity of visual reasoning. The critical differences between responses often depend on instance-specific visual details, so robust evaluation requires dynamically synthesizing rubrics that isolate spatial and factual discrepancies. To address this, we introduce DeltaRubric, an approach that reformulates multimodal preference evaluation as a plan-and-execute process within a single MLLM. DeltaRubric operates in two steps: acting first as a Disagreement Planner, the model generates a neutral, instance-specific verification checklist. Transitioning into a Checklist Verifier, it executes these self-generated checks against the image and question to produce the final grounded judgment. We formulate DeltaRubric as a multi-role reinforcement learning problem, jointly optimizing planning and verification capabilities. Validated on Qwen3-VL 4B and 8B Instruct models, DeltaRubric achieves solid empirical gains. For instance, on VL-RewardBench, it improves base model overall accuracy by +22.6 (4B) and +18.8 (8B) points, substantially outperforming standard no-rubric baselines. The results demonstrate that decomposing evaluation into structured, verifiable steps leads to more reliable and generalizable multimodal reward modeling.

preprint 2026 arXiv

Dual-Uncertainty Guided Policy Learning for Multimodal Reasoning

Reinforcement learning with verifiable rewards (RLVR) has advanced reasoning capabilities in multimodal large language models. However, existing methods typically treat visual inputs as deterministic, overlooking the perceptual ambiguity inherent to the visual modality. Consequently, they fail to distinguish whether a model's uncertainty stems from complex reasoning or ambiguous perception, preventing the targeted allocation of exploration or learning signals. To address this gap, we introduce DUPL, a dual-uncertainty guided policy learning approach for multimodal RLVR that quantifies and leverages both perceptual uncertainty (via symmetric KL divergence) and output uncertainty (via policy entropy) to guide policy updates. By establishing an uncertainty-driven feedback loop and employing a dynamic branch prioritization mechanism, DUPL recalibrates the policy advantage to focus learning on states with high perceptual or decisional ambiguity, enabling effective targeted exploration beyond passive data augmentation. Implemented on top of GRPO and evaluated on six multimodal mathematical and general-domain reasoning benchmarks, DUPL improves Qwen2.5-VL 3B and 7B models, achieving accuracy gains of up to 11.2% on visual math tasks and up to 7.1% on general-domain reasoning tasks, while consistently outperforming GRPO. These results demonstrate that dual-uncertainty guided policy learning is an effective and generalizable approach for multimodal RLVR.
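The two uncertainty measures named in this abstract have standard textbook definitions. The sketch below computes them for discrete distributions in plain Python. It is an illustration only: the abstract does not say which distributions DUPL compares or how the terms are scaled, so the function names, the unnormalized sum form of the symmetric KL divergence, and the example inputs are all assumptions.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists."""
    # Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q):
    """Symmetric KL as the plain sum KL(p||q) + KL(q||p) (assumed convention)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

def entropy(p):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Hypothetical output distributions for a clean and a perturbed view of one image.
clean = [0.7, 0.2, 0.1]
perturbed = [0.4, 0.4, 0.2]

perceptual_uncertainty = symmetric_kl(clean, perturbed)  # grows as the views disagree
output_uncertainty = entropy(clean)                      # grows as the policy is less decided
```

Both quantities are zero in the fully certain case (identical distributions, or a one-hot policy), which is what makes them usable as weights on a learning signal.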

preprint 2026 arXiv

Reinforcing Multimodal Reasoning Against Visual Degradation

Reinforcement Learning has significantly advanced the reasoning capabilities of Multimodal Large Language Models (MLLMs), yet the resulting policies remain brittle against real-world visual degradations such as blur, compression artifacts, and low-resolution scans. Prior robustness techniques from vision and deep RL rely on static data augmentation or value-based regularization, neither of which transfers cleanly to critic-free RL fine-tuning of autoregressive MLLMs. Reinforcing reasoning against such corruptions is non-trivial: naively injecting degraded views during rollout induces reward poisoning, where perceptual occlusions trigger hallucinated trajectories and destabilize optimization. We propose ROMA, an RL fine-tuning framework that modifies the optimization dynamics to reinforce reasoning against visual degradation while preserving clean-input performance. A dual-forward-pass strategy uses teacher forcing to evaluate corrupted views against clean-image trajectories, avoiding new rollouts on degraded inputs. For distributional consistency, we apply a token-level surrogate KL penalty against the worst-case augmentation; to prevent policy collapse under regularization, an auxiliary policy gradient loss anchored to clean-image advantages preserves a reliable reward signal; and to avoid systematically incorrect invariance, correctness-conditioned regularization restricts enforcement to successful trajectories. On Qwen3-VL 4B/8B across seven multimodal reasoning benchmarks, our method improves robustness by +2.4% on seen and +2.3% on unseen corruptions over GRPO while matching clean accuracy.

preprint 2025 arXiv

CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems

Multi-modal learning has emerged as a key technique for improving performance across domains such as autonomous driving, robotics, and reasoning. However, in certain scenarios, particularly in resource-constrained environments, some modalities available during training may be absent during inference. While existing frameworks effectively utilize multiple data sources during training and enable inference with reduced modalities, they are primarily designed for single-agent settings. This poses a critical limitation in dynamic environments such as connected autonomous vehicles (CAV), where incomplete data coverage can lead to decision-making blind spots. Conversely, some works explore multi-agent collaboration but without addressing missing modality at test time. To overcome these limitations, we propose Collaborative Auxiliary Modality Learning (CAML), a novel multi-modal multi-agent framework that enables agents to collaborate and share multi-modal data during training, while allowing inference with reduced modalities during testing. Experimental results in collaborative decision-making for CAV in accident-prone scenarios demonstrate that CAML achieves up to a 58.1% improvement in accident detection. Additionally, we validate CAML on real-world aerial-ground robot data for collaborative semantic segmentation, achieving up to a 10.6% improvement in mIoU.

preprint 2022 arXiv

Distributed Attack-Robust Submodular Maximization for Multi-Robot Planning

In this paper, we design algorithms to protect swarm-robotics applications against sensor denial-of-service (DoS) attacks on robots. We focus on applications requiring the robots to jointly select actions, e.g., which trajectory to follow, among a set of available ones. Such applications are central to large-scale robotics, such as multi-robot motion planning for target tracking. However, current attack-robust algorithms are centralized. We propose a general-purpose distributed algorithm towards robust optimization at scale, with local communications only. We name it Distributed Robust Maximization (DRM). DRM takes a divide-and-conquer approach that distributively partitions the problem among cliques of robots. Then, the cliques optimize in parallel, independently of each other. We prove DRM achieves close-to-optimal performance. We demonstrate DRM's performance in both Gazebo and MATLAB simulations, in scenarios of active target tracking with swarms of robots. In the simulations, DRM achieves computational speed-ups, running 1-2 orders of magnitude faster than the centralized algorithms, yet it nearly matches the tracking performance of its centralized counterparts. Since DRM overestimates the number of attacks in each clique, we also introduce an Improved Distributed Robust Maximization (IDRM) algorithm. IDRM infers the number of attacks in each clique less conservatively than DRM by leveraging 3-hop neighboring communications. We verify in simulations that IDRM improves on DRM's performance.

preprint 2022 arXiv

Graph Neural Networks for Decentralized Multi-Robot Submodular Action Selection

The problem of decentralized multi-robot target tracking asks for jointly selecting actions, e.g., motion primitives, for the robots to maximize target tracking performance with local communications. One major challenge for practical implementations is to make target tracking approaches scalable for large-scale problem instances. In this work, we propose a general-purpose learning architecture toward collaborative target tracking at scale, with decentralized communications. In particular, our learning architecture leverages a graph neural network (GNN) to capture local interactions of the robots and learns decentralized decision-making for the robots. We train the learning model by imitating an expert solution and implement the resulting model for decentralized action selection involving local observations and communications only. We demonstrate the performance of our GNN-based learning approach in a scenario of active target tracking with large networks of robots. The simulation results show our approach nearly matches the tracking performance of the expert algorithm, yet runs several orders of magnitude faster with up to 100 robots. Moreover, it slightly outperforms a decentralized greedy algorithm while also running faster (especially with more than 20 robots). The results also show our approach's generalization capability in previously unseen scenarios, e.g., larger environments and larger networks of robots.

preprint 2022 arXiv

On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces

We focus on parameterized policy search for reinforcement learning over continuous action spaces. Typically, one assumes the score function associated with a policy is bounded, which fails to hold even for Gaussian policies. To properly address this issue, one must introduce an exploration tolerance parameter to quantify the region in which it is bounded. Doing so incurs a persistent bias that appears in the attenuation rate of the expected policy gradient norm, which is inversely proportional to the radius of the action space. To mitigate this hidden bias, heavy-tailed policy parameterizations may be used, which exhibit a bounded score function, but doing so can cause instability in algorithmic updates. To address these issues, in this work, we study the convergence of policy gradient algorithms under heavy-tailed parameterizations, which we propose to stabilize with a combination of mirror ascent-type updates and gradient tracking. Our main theoretical contribution is the establishment that this scheme converges with constant step and batch sizes, whereas prior works require these parameters to respectively shrink to null or grow to infinity. Experimentally, this scheme under a heavy-tailed policy parameterization yields improved reward accumulation across a variety of settings as compared with standard benchmarks.

preprint 2022 arXiv

Online Exploration of an Unknown Region of Interest with a Team of Aerial Robots

In this paper, we study the problem of exploring an unknown Region Of Interest (ROI) with a team of aerial robots. The size and shape of the ROI are unknown to the robots. The objective is to find a tour for each robot such that each point in the ROI is visible from the field-of-view of some robot along its tour. In conventional exploration using ground robots, the ROI boundary is typically also treated as an obstacle and robots are naturally constrained to the interior of this ROI. Instead, we study the case where aerial robots are not restricted to flying inside the ROI (and can fly over the boundary of the ROI). We propose a recursive depth-first search-based algorithm that yields a constant competitive ratio for the exploration problem. Our analysis also extends to the case where the ROI is translating, e.g., in the case of marine plumes. In the simpler version of the problem where the ROI is modeled as a 2D grid, the competitive ratio is $\frac{2(S_r+S_p)(R+\lfloor\log{R}\rfloor)}{(S_r-S_p)(1+\lfloor\log{R}\rfloor)}$ where $R$ is the number of robots, and $S_r$ and $S_p$ are the robot speed and the ROI speed, respectively. We also consider a more realistic scenario where the ROI shape is not restricted to grid cells but is an arbitrary shape. We show our algorithm has a competitive ratio of $\frac{2(S_r+S_p)(18R+\lfloor\log{R}\rfloor)}{(S_r-S_p)(1+\lfloor\log{R}\rfloor)}$ under some conditions. We empirically verify our algorithm using simulations as well as a proof-of-concept experiment mapping a 2D ROI using an aerial robot with a downwards-facing camera.
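To get a feel for the two competitive-ratio expressions in this abstract, the following sketch evaluates them numerically. It rests on assumptions the abstract does not state: the logarithm is taken base 2, and the robot is required to be strictly faster than the ROI so that the denominator is positive. The function and parameter names are hypothetical.

```python
def competitive_ratio(num_robots, robot_speed, roi_speed, robot_coeff=1):
    """Evaluate 2(S_r+S_p)(c*R + floor(log R)) / ((S_r-S_p)(1 + floor(log R))).

    robot_coeff is c: 1 for the 2D-grid expression, 18 for the arbitrary-shape one.
    Assumption: log is base 2 (the abstract does not give the base).
    """
    if num_robots < 1:
        raise ValueError("need at least one robot")
    if not 0 <= roi_speed < robot_speed:
        raise ValueError("expression assumes 0 <= S_p < S_r")
    floor_log = num_robots.bit_length() - 1  # exact floor(log2 R) for integer R >= 1
    numerator = 2 * (robot_speed + roi_speed) * (robot_coeff * num_robots + floor_log)
    denominator = (robot_speed - roi_speed) * (1 + floor_log)
    return numerator / denominator

# One robot and a stationary ROI: the grid expression reduces to 2.
single_static = competitive_ratio(1, robot_speed=1.0, roi_speed=0.0)

# Four robots moving twice as fast as the ROI.
team_moving = competitive_ratio(4, robot_speed=2.0, roi_speed=1.0)
```

The factor $(S_r+S_p)/(S_r-S_p)$ grows without bound as the ROI speed approaches the robot speed, so under these expressions the bound is only informative when the robots are clearly faster than the ROI.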

preprint 2022 arXiv

Risk-Aware Path Planning for Ground Vehicles using Occluded Aerial Images

We consider scenarios where a ground vehicle plans its path using data gathered by an aerial vehicle. In the aerial images, navigable areas of the scene may be occluded due to obstacles. Naively planning paths using aerial images may result in longer paths as a conservative planner may try to avoid regions that are occluded. We propose a modular, deep learning-based framework that allows the robot to predict the existence of navigable areas in the occluded regions. Specifically, we use image inpainting methods to fill in parts of the areas that are potentially occluded, which can then be semantically segmented to determine navigability. We use supervised neural networks for both modules. However, these predictions may be incorrect. Therefore, we extract uncertainty in these predictions and use a risk-aware approach that takes these uncertainties into account for path planning. We compare modules in our approach with non-learning-based approaches to show the efficacy of the proposed framework through photo-realistic simulations. The modular pipeline allows further improvement in path planning and deployment in different settings.

preprint 2022 arXiv

Risk-aware Resource Allocation for Multiple UAVs-UGVs Recharging Rendezvous

We study a resource allocation problem for the cooperative aerial-ground vehicle routing application, in which multiple Unmanned Aerial Vehicles (UAVs) with limited battery capacity and multiple Unmanned Ground Vehicles (UGVs) that can also act as mobile recharging stations need to jointly accomplish a mission such as persistently monitoring a set of points. Due to the limited battery capacity of the UAVs, they sometimes have to deviate from their task to rendezvous with the UGVs and get recharged. Each UGV can serve a limited number of UAVs at a time. In contrast to prior work on deterministic multi-robot scheduling, we consider the challenge imposed by the stochasticity of the energy consumption of the UAV. We are interested in finding the optimal recharging schedule of the UAVs such that the travel cost is minimized and the probability that no UAV runs out of charge within the planning horizon is greater than a user-defined tolerance. We formulate this problem, the Risk-aware Recharging Rendezvous Problem (RRRP), as an Integer Linear Program (ILP), in which the matching constraint captures the resource availability constraints and the knapsack constraint captures the success probability constraints. We propose a bicriteria approximation algorithm to solve RRRP. We demonstrate the effectiveness of our formulation and algorithm in the context of one persistent monitoring mission.

preprint 2022 arXiv

Risk-Aware Submodular Optimization for Multi-Robot Coordination

We study the problem of incorporating risk while making combinatorial decisions under uncertainty. We formulate a discrete submodular maximization problem for selecting a set using Conditional-Value-at-Risk (CVaR), a risk metric commonly used in financial analysis. While CVaR has recently been used in optimization of linear cost functions in robotics, we take the first step towards extending this to discrete submodular optimization and provide several positive results. Specifically, we propose the Sequential Greedy Algorithm that provides an approximation guarantee on finding the maxima of the CVaR cost function under a matroidal constraint. The approximation guarantee shows that the solution produced by our algorithm is within a constant factor of the optimal and an additive term that depends on the optimal. Our analysis uses the curvature of the submodular set function, and proves that the algorithm runs in polynomial time. This formulation covers a number of combinatorial optimization problems that appear in robotics. We use two such problems, vehicle assignment under uncertainty for mobility-on-demand and sensor selection with failures for environmental monitoring, as case studies to demonstrate the efficacy of our formulation. In particular, for the mobility-on-demand study, we propose an online triggering assignment algorithm that triggers a new assignment only if it can potentially reduce the waiting time at demand locations. We verify the performance of the Sequential Greedy Algorithm and the online triggering assignment algorithm through simulations.
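For readers unfamiliar with CVaR, the sketch below estimates it from samples of a random utility. It is a simple sorting-based estimator written for illustration and is not the Sequential Greedy Algorithm from the abstract. It assumes the lower-tail convention (the mean of the worst outcomes when larger values are better); the function name and sample values are hypothetical.

```python
import math

def empirical_cvar(samples, alpha):
    """Mean of the worst ceil(alpha * n) samples of a utility (lower-tail CVaR).

    alpha in (0, 1] is the risk level: alpha = 1 recovers the plain mean,
    and a small alpha looks only at the worst outcomes.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(samples)                      # worst (smallest) utilities first
    tail_size = math.ceil(alpha * len(ordered))    # at least one sample
    tail = ordered[:tail_size]
    return sum(tail) / len(tail)

# Hypothetical utilities of one candidate set under four sampled scenarios.
utilities = [4.0, 1.0, 3.0, 2.0]
risk_neutral = empirical_cvar(utilities, alpha=1.0)   # plain average
risk_averse = empirical_cvar(utilities, alpha=0.5)    # average of the worst half
```

Because the estimate never exceeds the sample mean, a decision maker who ranks candidate sets by this value favors sets whose bad scenarios are less bad, even at some cost in average performance.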

preprint 2022 arXiv

Risk-aware UAV-UGV Rendezvous with Chance-Constrained Markov Decision Process

We study a chance-constrained variant of the cooperative aerial-ground vehicle routing problem, in which an Unmanned Aerial Vehicle (UAV) with limited battery capacity and an Unmanned Ground Vehicle (UGV) that can also act as a mobile recharging station need to jointly accomplish a mission such as monitoring a set of points. Due to the limited battery capacity of the UAV, the two vehicles sometimes have to deviate from their task to rendezvous and recharge the UAV. Unlike prior work that has focused on the deterministic case, we address the challenge of stochastic energy consumption of the UAV. We are interested in finding the optimal policy that decides when and where to rendezvous such that the expected travel time of the UAV is minimized and the probability of running out of charge is less than a user-defined tolerance. We formulate this problem as a Chance Constrained Markov Decision Process (CCMDP). To the best knowledge of the authors, this is the first CMDP-based formulation for the UAV-UGV routing problems under power consumption uncertainty. We adopt a Linear Programming (LP) based approach to solve the problem optimally. We demonstrate the effectiveness of our formulation in the context of an Intelligence, Surveillance, and Reconnaissance (ISR) mission.

preprint 2021 arXiv

Failure-Resilient Coverage Maximization with Multiple Robots

The task of maximizing coverage using multiple robots has several applications such as surveillance, exploration, and environmental monitoring. A major challenge of deploying such multi-robot systems in a practical scenario is to ensure resilience against robot failures. A recent work introduced the Resilient Coverage Maximization (RCM) problem where the goal is to maximize a submodular coverage utility when the robots are subject to adversarial attacks or failures. The RCM problem is known to be NP-hard. In this paper, we propose two approximation algorithms for the RCM problem, namely, the Ordered Greedy (OrG) and the Local Search (LS) algorithm. Both algorithms empirically outperform the state-of-the-art solution in terms of accuracy and running time. To demonstrate the effectiveness of our proposed solution, we empirically compare our proposed algorithms with the existing solution and a brute force optimal algorithm. We also perform a case study on the persistent monitoring problem to show the applicability of our proposed algorithms in a practical setting.

preprint 2021 arXiv

Multi-robot Symmetric Rendezvous Search on the Line

We study the Symmetric Rendezvous Search Problem for a multi-robot system. There are $n>2$ robots arbitrarily located on a line. Their goal is to meet somewhere on the line as quickly as possible. The robots do not know the initial location of any of the other robots or their own positions on the line. The symmetric version of the problem requires the robots to execute the same search strategy to achieve rendezvous. Therefore, we solve the problem in an online fashion with a randomized strategy. In this paper, we present a symmetric rendezvous algorithm which achieves a constant competitive ratio for the total distance traveled by the robots. We validate our theoretical results through simulations.

preprint 2020 arXiv

Combining Geometric and Information-Theoretic Approaches for Multi-Robot Exploration

We present an algorithm to explore an orthogonal polygon using a team of $p$ robots. This algorithm combines ideas from information-theoretic exploration algorithms and computational geometry based exploration algorithms. We show that the exploration time of our algorithm is competitive (as a function of $p$) with respect to the offline optimal exploration algorithm. The algorithm is based on a single-robot polygon exploration algorithm, a tree exploration algorithm for higher level planning and a submodular orienteering algorithm for lower level planning. We discuss how this strategy can be adapted to real-world settings to deal with noisy sensors. In addition to theoretical analysis, we investigate the performance of our algorithm through simulations for multiple robots and experiments with a single robot.

preprint 2020 arXiv

Coverage of an Environment Using Energy-Constrained Unmanned Aerial Vehicles

We study the problem of covering an environment using an Unmanned Aerial Vehicle (UAV) with limited battery capacity. We consider a scenario where the UAV can land on an Unmanned Ground Vehicle (UGV) and recharge the onboard battery. The UGV can also recharge the UAV while transporting the UAV to the next take-off site. We present an algorithm to solve a new variant of the area coverage problem that takes into account this symbiotic UAV and UGV system. The input consists of a set of boustrophedon cells -- rectangular strips whose width is equal to the field-of-view of the sensor on the UAV. The goal is to find a coordinated strategy for the UAV and UGV that visits and covers all cells in minimum time, while optimally finding how much to recharge, where to recharge, and when to recharge the battery. This includes flight time for visiting and covering all cells, recharging time, as well as the take-off and landing times. We show how to reduce this problem to a known NP-hard problem, Generalized Traveling Salesperson Problem (GTSP). Given an optimal GTSP solver, our approach finds the optimal coverage paths for the UAV and UGV. Our formulation models multi-rotor UAVs as well as hybrid UAVs that can operate in fixed-wing and Vertical Take-off and Landing modes. We evaluate our algorithm through simulations and proof-of-concept experiments.

preprint 2020 arXiv

Experimental Evaluation of a Pseudo-Doppler Direction-Finding System for Localizing Radio Tags

We present the design of a radio antenna system for obtaining instantaneous bearing measurements towards a radio emitter. Our work is motivated by applications where robots are used for localizing and tracking radio-tagged wildlife. The traditional method is to use directional antennas that need to be rotated in order to find the bearing, which is time-consuming. Instead, we present a low-cost system capable of finding bearing measurements almost instantaneously using an antenna array. This is particularly appealing for wildlife tracking with Unmanned Aerial Systems (UASs), where remaining stationary can be challenging and energy-consuming, in addition to being slow. The proposed system uses existing open source hardware and software systems and leverages principles of pseudo-Doppler direction-finding. The resulting system was tested in an anechoic chamber and in outdoor settings. The outdoor tests with particle filtering show that the resulting system is capable of localizing radio tags to within 5 m, starting with an initial estimate of 200 m x 200 m.

preprint 2020 arXiv

Learning a Spatial Field in Minimum Time with a Team of Robots

We study an informative path-planning problem where the goal is to minimize the time required to learn a spatially varying entity. We use Gaussian Process (GP) regression for learning the underlying field. Our goal is to ensure that the GP posterior variance, which is also the mean square error between the learned and actual fields, is below a predefined value. We study three versions of the problem. In the placement version, the objective is to minimize the number of measurement locations while ensuring that the posterior variance is below a predefined threshold. In the mobile robot version, we seek to minimize the total time required to visit and collect measurements from the measurement locations using a single robot. We also study a multi-robot version where the objective is to minimize the time required by the last robot to return to a common starting location called depot. By exploiting the properties of GP regression, we present constant-factor approximation algorithms. In addition to the theoretical results, we also compare the empirical performance using a real-world dataset, with other baseline strategies.

preprint 2020 arXiv

Multi-Fidelity Reinforcement Learning with Gaussian Processes

We study the problem of Reinforcement Learning (RL) using as few real-world samples as possible. A naive application of RL can be inefficient in large and continuous state spaces. We present two versions of Multi-Fidelity Reinforcement Learning (MFRL), model-based and model-free, that leverage Gaussian Processes (GPs) to learn the optimal policy in a real-world environment. In the MFRL framework, an agent uses multiple simulators of the real environment to perform actions. With increasing fidelity in a simulator chain, the number of samples used in successively higher simulators can be reduced. By incorporating GPs in the MFRL framework, we empirically observe up to $40\%$ reduction in the number of samples for model-based RL and $60\%$ reduction for the model-free version. We examine the performance of our algorithms through simulations and through real-world experiments for navigation with a ground robot.

preprint 2020 arXiv

Recreating Bat Behavior on Quad-rotor UAVs-A Simulation Approach

We develop an effective computer model to simulate sensing environments that consist of natural trees. The simulated environments are random and contain the full geometry of the tree foliage. While this simulated model can be used as a general platform for studying the sensing mechanism of different flying species, our ultimate goal is to build bat-inspired quad-rotor UAVs, that is, UAVs that can recreate a bat's flying behavior (e.g., obstacle avoidance, path planning) in dense vegetation. To this end, we also introduce a foliage echo simulator that can produce simulated echoes by mimicking a bat's biosonar. In our current model, a few realistic model choices or assumptions are made. First, in order to create natural-looking trees, the branching structures of trees are modeled by L-systems, whereas the detailed geometry of branches, sub-branches and leaves is created by randomizing a reference tree in a CAD object file. Additionally, the foliage echo simulator is simplified so that no shading effect is considered. We demonstrate our developed model by simulating real-world scenarios with multiple trees and computing the corresponding impulse responses along a quad-rotor trajectory.

preprint 2020 arXiv

Risk-Aware Planning and Assignment for Ground Vehicles using Uncertain Perception from Aerial Vehicles

We propose a risk-aware framework for multi-robot, multi-demand assignment and planning in unknown environments. Our motivation is disaster response and search-and-rescue scenarios where ground vehicles must reach demand locations as soon as possible. We consider a setting where the terrain information is available only in the form of an aerial, georeferenced image. Deep learning techniques can be used for semantic segmentation of the aerial image to create a cost map for safe ground robot navigation. Such segmentation may still be noisy. Hence, we present a joint planning and perception framework that accounts for the risk introduced due to noisy perception. Our contributions are two-fold: (i) we show how to use Bayesian deep learning techniques to extract risk at the perception level; and (ii) use a risk-theoretical measure, CVaR, for risk-aware planning and assignment. The pipeline is theoretically established, then empirically analyzed through two datasets. We find that accounting for risk at both levels produces quantifiably safer paths and assignments.

preprint 2020 arXiv

Robust Multi-Agent Task Assignment in Failure-Prone and Adversarial Environments

The problem of assigning agents to tasks is a central computational challenge in many multi-agent autonomous systems. However, in the real world, agents are not always perfect and may fail for a number of reasons. A motivating application is where the agents are robots that operate in the physical world and are susceptible to failures. This paper studies the problem of Robust Multi-Agent Task Assignment, which seeks to find an assignment that maximizes overall system performance while accounting for potential failures of the agents. We investigate both stochastic and adversarial failures under this framework. For both cases, we present efficient algorithms that yield optimal or near-optimal results.

preprint 2020 arXiv

Strategies to Inject Spoofed Measurement Data to Mislead Kalman Filter

We study the problem of designing false measurement data that is injected to corrupt and mislead the output of a Kalman filter. Unlike existing works that focus on detection and filtering algorithms for the observer, we study the problem from the attacker's point of view. In our model, the attacker can corrupt the measurements by injecting additive spoofing signals. The attacker seeks to create a separation between the estimate of the Kalman filter with and without spoofed signals. We present a number of results on how to inject spoofing signals while minimizing the magnitude of the injected signals. The resulting strategies are supported by theoretical proofs and evaluated through simulations. We also evaluate the spoofing strategy in the presence of a $\chi^2$ spoof detector. Building on our main result, we present a strategy that is proven to successfully mislead a Kalman filter while ensuring it is not detected.
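The abstract mentions a $\chi^2$ detector without giving its form. A common construction, assumed here, thresholds the normalized innovation squared of the filter. The sketch below implements one step of a scalar Kalman filter with such a test. The random-walk state model, the noise values, and all names are assumptions made for illustration; the threshold 3.841 is the 95% quantile of a $\chi^2$ distribution with one degree of freedom.

```python
def kalman_step_with_chi2(x, p, z, q=0.01, r=1.0, threshold=3.841):
    """One predict/update step of a scalar Kalman filter with an innovation test.

    Assumed model: x_k = x_{k-1} + w (variance q), z_k = x_k + v (variance r).
    Returns (new estimate, new variance, normalized innovation squared, alarm flag).
    """
    # Predict: the random-walk model keeps the mean and inflates the variance.
    x_pred = x
    p_pred = p + q

    # Innovation and its variance under the no-attack hypothesis.
    innovation = z - x_pred
    s = p_pred + r
    nis = innovation * innovation / s   # chi-square with 1 dof if z is genuine
    alarm = nis > threshold

    # Standard Kalman update (applied here regardless of the alarm).
    gain = p_pred / s
    x_new = x_pred + gain * innovation
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new, nis, alarm

# A measurement close to the prediction passes; a large additive spoof trips the test.
_, _, nis_small, alarm_small = kalman_step_with_chi2(0.0, 1.0, z=0.5, q=0.0)
_, _, nis_large, alarm_large = kalman_step_with_chi2(0.0, 1.0, z=10.0, q=0.0)
```

Under this kind of test, an attacker who wants to stay undetected has to keep each injected signal small relative to the innovation variance, which is consistent with the abstract's emphasis on minimizing the magnitude of the injected signals.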

preprint 2016 arXiv

Algorithms for Visibility-Based Monitoring with Robot Teams

We study the problem of planning paths for a team of robots for visually monitoring an environment. Our work is motivated by surveillance and persistent monitoring applications. We are given a set of target points in a polygonal environment that must be monitored using robots with cameras. The goal is to compute paths for all robots such that every target is visible from at least one path. In its general form, this problem is NP-hard as it generalizes the Art Gallery Problem and the Watchman Route Problem. We study two versions: (i) a geometric version in street polygons for which we give a polynomial time 4-approximation algorithm; and (ii) a general version for which we present a practical solution that finds the optimal solution in possibly exponential time. In addition to theoretical proofs, we also present results from simulation studies.

preprint 2016 arXiv

Non-Myopic Target Tracking Strategies for State-Dependent Noise

We study the problem of devising a closed-loop strategy to control the position of a robot that is tracking a possibly moving target. The robot is capable of obtaining noisy measurements of the target's position. The key idea in active target tracking is to choose control laws that drive the robot to measurement locations that will reduce the uncertainty in the target's position. The challenge is that measurement uncertainty often is a function of the (unknown) relative positions of the target and the robot. Consequently, a closed-loop control policy is desired which can map the current estimate of the target's position to an optimal control law for the robot. Our main contribution is to devise a closed-loop control policy for target tracking that plans for a sequence of control actions, instead of acting greedily. We consider scenarios where the noise in measurement is a function of the state of the target. We seek to minimize the maximum uncertainty (trace of the posterior covariance matrix) over all possible measurements. We exploit the structural properties of a Kalman Filter to build a policy tree that is orders of magnitude smaller than naive enumeration while still preserving optimality guarantees. We show how to obtain even more computational savings by relaxing the optimality guarantees. The resulting algorithms are evaluated through simulations.
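The minimax objective can be illustrated with the naive enumeration that the abstract contrasts against. The sketch below is a simplification, not the paper's pruned policy tree. It works in one dimension, uses a hypothetical noise model whose variance grows quadratically with robot-target distance, keeps the target static, and plans an open-loop move sequence rather than a closed-loop policy. In the scalar case the trace of the posterior covariance is just the posterior variance.

```python
from itertools import product

def posterior_var(p, r):
    # Scalar Kalman update: the posterior variance depends only on the
    # prior variance p and the measurement noise variance r.
    return p * r / (p + r)

def noise_var(robot, target):
    # Hypothetical state-dependent noise: worse when farther away.
    return 0.1 + (robot - target) ** 2

def worst_case_var(start, moves, targets, p0=1.0):
    # Largest final variance over all hypothesized target positions.
    worst = 0.0
    for t in targets:
        pos, p = start, p0
        for m in moves:
            pos += m
            p = posterior_var(p, noise_var(pos, t))
        worst = max(worst, p)
    return worst

def plan(start, targets, horizon):
    # Naive enumeration of all 3**horizon move sequences; returns the
    # sequence that minimizes the worst-case final variance.
    return min(product((-1, 0, 1), repeat=horizon),
               key=lambda mv: worst_case_var(start, mv, targets))

best = plan(start=0, targets=[3.0], horizon=3)
```

With a single hypothesized target three cells away, the planner drives straight toward it. The enumeration grows as 3 to the power of the horizon, which is the cost a smaller policy tree is meant to avoid.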

preprint 2016 arXiv

Radiation Search Operations using Scene Understanding with Autonomous UAV and UGV

Autonomously searching for hazardous radiation sources requires the aerial and ground systems to understand the scene they are scouting. In this paper, we present systems, algorithms, and experiments to perform radiation search using unmanned aerial vehicles (UAV) and unmanned ground vehicles (UGV) by employing semantic scene segmentation. The aerial data is used to identify radiological points of interest, generate an orthophoto along with a digital elevation model (DEM) of the scene, and perform semantic segmentation to assign a category (e.g. road, grass) to each pixel in the orthophoto. We perform semantic segmentation by training a model on a dataset of images we collected and annotated, using the model to perform inference on images of the test area unseen by the model, and then refining the results with the DEM to better reason about category predictions at each pixel. We then use all of these outputs to plan a path for a UGV carrying a LiDAR to map the environment and avoid obstacles not present during the flight, and a radiation detector to collect more precise radiation measurements from the ground. The results were favorable in each scenario tested. We also note that our approach is general and has the potential to work for a variety of different sensing tasks.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint

Fields this researcher appears in

Robotics Artificial Intelligence Machine Learning Computation and Language Computer Vision Multiagent Systems math.OC Systems and Control Discrete Mathematics eess.SY

Source provenance

Where this author record came from

arxiv confidence 95%

external id: arxiv:2502.17821:author:4:pratap-tokekar

Imported May 21, 2026. Synced May 21, 2026.

arxiv confidence 95%

external id: arxiv:2605.09269:author:8:pratap-tokekar

Imported May 20, 2026. Synced May 20, 2026.

arxiv confidence 95%

external id: arxiv:2605.09262:author:8:pratap-tokekar

Imported May 20, 2026. Synced May 20, 2026.

5 works

Lifeng Zhou

Researcher

Lifeng Zhou contributes to research discovery and scholarly infrastructure.

Open to collaborate

3 works

Haitao Mi

Researcher

Haitao Mi contributes to research discovery and scholarly infrastructure.

Open to collaborate

3 works

Runpeng Dai

Researcher

Runpeng Dai contributes to research discovery and scholarly infrastructure.

Open to collaborate

3 works

Tong Zheng

Researcher

Tong Zheng contributes to research discovery and scholarly infrastructure.

Open to collaborate

26 published item(s)

DeltaRubric: Generative Multimodal Reward Modeling via Joint Planning and Verification

Dual-Uncertainty Guided Policy Learning for Multimodal Reasoning

Reinforcing Multimodal Reasoning Against Visual Degradation

CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems

Distributed Attack-Robust Submodular Maximization for Multi-Robot Planning

Graph Neural Networks for Decentralized Multi-Robot Submodular Action Selection

On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces

Online Exploration of an Unknown Region of Interest with a Team of Aerial Robots

Risk-Aware Path Planning for Ground Vehicles using Occluded Aerial Images

Risk-aware Resource Allocation for Multiple UAVs-UGVs Recharging Rendezvous

Risk-Aware Submodular Optimization for Multi-Robot Coordination

Risk-aware UAV-UGV Rendezvous with Chance-Constrained Markov Decision Process

Failure-Resilient Coverage Maximization with Multiple Robots

Multi-robot Symmetric Rendezvous Search on the Line

Combining Geometric and Information-Theoretic Approaches for Multi-Robot Exploration

Coverage of an Environment Using Energy-Constrained Unmanned Aerial Vehicles

Experimental Evaluation of a Pseudo-Doppler Direction-Finding System for Localizing Radio Tags

Learning a Spatial Field in Minimum Time with a Team of Robots

Multi-Fidelity Reinforcement Learning with Gaussian Processes

Recreating Bat Behavior on Quad-rotor UAVs-A Simulation Approach

Risk-Aware Planning and Assignment for Ground Vehicles using Uncertain Perception from Aerial Vehicles

Robust Multi-Agent Task Assignment in Failure-Prone and Adversarial Environments

Strategies to Inject Spoofed Measurement Data to Mislead Kalman Filter

Algorithms for Visibility-Based Monitoring with Robot Teams

Non-Myopic Target Tracking Strategies for State-Dependent Noise

Radiation Search Operations using Scene Understanding with Autonomous UAV and UGV