Researcher profile

Mohan M. Trivedi

Mohan M. Trivedi contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2020arXiv

Ground Plane Polling for 6DoF Pose Estimation of Objects on the Road

This paper introduces an approach to produce accurate 3D detection boxes for objects on the ground using single monocular images. We do so by merging 2D visual cues, 3D object dimensions, and ground plane constraints to produce boxes that are robust against small errors and incorrect predictions. First, we train a single-shot convolutional neural network (CNN) that produces multiple visual and geometric cues of interest: 2D bounding boxes, 2D keypoints of interest, coarse object orientations and object dimensions. Subsets of these cues are then used to poll probable ground planes from a pre-computed database of ground planes, to identify the "best fit" plane with highest consensus. Once identified, the "best fit" plane provides enough constraints to successfully construct the desired 3D detection box, without directly predicting the 6DoF pose of the object. The entire ground plane polling (GPP) procedure is constructed as a non-parametrized layer of the CNN that outputs the desired "best fit" plane and the corresponding 3D keypoints, which together define the final 3D bounding box. Doing so allows us to poll thousands of different ground plane configurations without adding considerable overhead, while also creating a single CNN that directly produces the desired output without the need for post processing. We evaluate our method on the 2D detection and orientation estimation benchmark from the challenging KITTI dataset, and provide additional comparisons for 3D metrics of importance. This single-stage, single-pass CNN results in superior localization and orientation estimation compared to more complex and computationally expensive monocular approaches.

preprint2020arXiv

On Assessing Driver Awareness of Situational Criticalities: Multi-modal Bio-sensing and Vision-based Analysis, Evaluations, and Insights

Automobiles for our roadways are increasingly using advanced driver assistance systems. The adoption of such new technologies requires us to develop novel perception systems not only for accurately understanding the situational context of these vehicles, but also to infer the driver's awareness in differentiating between safe and critical situations. This manuscript focuses on the specific problem of inferring driver awareness in the context of attention analysis and hazardous incident activity. Even after the development of wearable and compact multi-modal bio-sensing systems in recent years, their application in driver awareness context has been scarcely explored. The~capability of simultaneously recording different kinds of bio-sensing data in addition to traditionally employed computer vision systems provides exciting opportunities to explore the limitations of these sensor modalities. In this work, we explore the applications of three different bio-sensing modalities namely electroencephalogram (EEG), photoplethysmogram (PPG) and galvanic skin response (GSR) along with a camera-based vision system in driver awareness context. We assess the information from these sensors independently and together using both signal processing- and deep learning-based tools. We show that our methods outperform previously reported studies to classify driver attention and detecting hazardous/non-hazardous situations for short time scales of two seconds. We use EEG and vision data for high resolution temporal classification (two seconds) while additionally also employing PPG and GSR over longer time periods. We evaluate our methods by collecting user data on twelve subjects for two real-world driving datasets among which one is publicly available (KITTI dataset) while the other was collected by us (LISA dataset) with the vehicle ....

preprint2020arXiv

Trajectory Prediction for Autonomous Driving based on Multi-Head Attention with Joint Agent-Map Representation

Predicting the trajectories of surrounding agents is an essential ability for autonomous vehicles navigating through complex traffic scenes. The future trajectories of agents can be inferred using two important cues: the locations and past motion of agents, and the static scene structure. Due to the high variability in scene structure and agent configurations, prior work has employed the attention mechanism, applied separately to the scene and agent configuration to learn the most salient parts of both cues. However, the two cues are tightly linked. The agent configuration can inform what part of the scene is most relevant to prediction. The static scene in turn can help determine the relative influence of agents on each other's motion. Moreover, the distribution of future trajectories is multimodal, with modes corresponding to the agent's intent. The agent's intent also informs what part of the scene and agent configuration is relevant to prediction. We thus propose a novel approach applying multi-head attention by considering a joint representation of the static scene and surrounding agents. We use each attention head to generate a distinct future trajectory to address multimodality of future trajectories. Our model achieves state of the art results on the nuScenes prediction benchmark and generates diverse future trajectories compliant with scene structure and agent configuration.