Source author record

Sudip Das

Sudip Das appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, Machine Learning, physics.flu-dyn, eess.SP

Catalog footprint

What is connected

7 works

4 topics

4 close collaborators


preprint 2022 arXiv

Deep Multi-Task Networks For Occluded Pedestrian Pose Estimation

Most existing work on pedestrian pose estimation does not consider the pose of an occluded pedestrian, because annotations of the occluded parts are not available in the relevant automotive datasets. For example, CityPersons, a well-known dataset for pedestrian detection in automotive scenes, does not provide pose annotations, whereas MS-COCO, a non-automotive dataset, contains human pose annotations. In this work, we propose a multi-task framework to extract pedestrian features through detection and instance segmentation tasks performed separately on these two distributions. Thereafter, an encoder learns pose-specific features using an unsupervised instance-level domain adaptation method for the pedestrian instances from both distributions. The proposed framework improves on state-of-the-art performance in pose estimation, pedestrian detection, and instance segmentation.

preprint 2022 arXiv

Improving self-supervised pretraining models for epileptic seizure detection from EEG data

There is abundant medical data on the internet, most of which is unlabeled. Traditional supervised learning algorithms are often limited by the amount of labeled data, especially in the medical domain, where labeling is costly because it requires manual processing by specialized experts. The labels are also prone to human error and bias, since only a select few expert annotators produce them. These issues are mitigated by self-supervision, where we generate pseudo-labels from unlabeled data using the data itself. This paper presents various self-supervision strategies to enhance the performance of a time-series-based diffusion convolutional recurrent neural network (DCRNN) model. The weights learned in the self-supervised pretraining phase can be transferred to the supervised training phase to boost the model's prediction capability. Our techniques are tested on an extension of the DCRNN model, an RNN with graph diffusion convolutions, which models the spatiotemporal dependencies present in EEG signals. When the learned weights from the pretraining stage are transferred to a DCRNN model to determine whether an EEG time window has a characteristic seizure signal associated with it, our method yields an AUROC score $1.56\%$ higher than the current state-of-the-art models on the TUH EEG seizure corpus.
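The AUROC metric quoted above can be illustrated with a small, self-contained sketch. The abstract does not say how the score was computed; the rank-based (Mann-Whitney) formulation below is one standard way to obtain it, and the function name `auroc` and the toy labels and scores are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np

def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 0/1 ground truth per EEG window (hypothetical input format).
    scores: model confidence that the window contains a seizure.
    Tied scores share their average rank.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")

    # Assign 1-based ranks in ascending score order, averaging over ties.
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(scores.size, dtype=float)
    i = 0
    while i < scores.size:
        j = i
        while j + 1 < scores.size and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0
        i = j + 1

    # U counts the (positive, negative) pairs that are ranked correctly.
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)
```

A score of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of seizure and non-seizure windows.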

preprint 2022 arXiv

Spatio-Contextual Deep Network Based Multimodal Pedestrian Detection For Autonomous Driving

Pedestrian detection is the most critical module of an autonomous driving system. Although a camera is commonly used for this purpose, its image quality degrades severely in low-light nighttime driving scenarios. On the other hand, the quality of a thermal camera image remains unaffected in similar conditions. This paper proposes an end-to-end multimodal fusion model for pedestrian detection using RGB and thermal images. Its novel spatio-contextual deep network architecture is capable of exploiting the multimodal input efficiently. It consists of two distinct deformable ResNeXt-50 encoders for feature extraction from the two modalities. Fusion of these two encoded features takes place inside a multimodal feature embedding module (MuFEm) consisting of several groups, each pairing a Graph Attention Network with a feature fusion unit. The output of the last feature fusion unit of MuFEm is subsequently passed to two CRFs for spatial refinement. Further enhancement of the features is achieved by applying channel-wise attention and extracting contextual information with the help of four RNNs traversing in four different directions. Finally, these feature maps are used by a single-stage decoder to generate the bounding box of each pedestrian and the score map. We have performed extensive experiments with the proposed framework on three publicly available multimodal pedestrian detection benchmark datasets, namely KAIST, CVC-14, and UTokyo. The results on each of them improve on the respective state-of-the-art performance. A short video giving an overview of this work along with its qualitative results can be seen at https://youtu.be/FDJdSifuuCs. Our source code will be released upon publication of the paper.

preprint 2020 arXiv

An End-to-End Framework for Unsupervised Pose Estimation of Occluded Pedestrians

Pose estimation in the wild is a challenging problem, particularly in situations of (i) occlusions of varying degrees and (ii) crowded outdoor scenes. Most existing studies of pose estimation do not report performance in such situations. Moreover, pose annotations for occluded parts of human figures are not provided in any of the relevant standard datasets, which makes it harder to study pose estimation of the entire figure of occluded humans. Well-known pedestrian detection datasets such as CityPersons contain samples of outdoor scenes but do not include pose annotations. Here, we propose a novel multi-task framework for end-to-end training towards estimating the entire pose of pedestrians, including under any kind of occlusion. To train the network for this problem, we make use of a pose estimation dataset, MS-COCO, and employ unsupervised adversarial instance-level domain adaptation for estimating the entire pose of occluded pedestrians. The experimental studies show that the proposed framework outperforms the SOTA results for pose estimation, instance segmentation and pedestrian detection in cases of heavy occlusions (HO) and reasonable + heavy occlusions (R + HO) on the two benchmark datasets.

preprint 2020 arXiv

Parametric experimental studies on the shock related unsteadiness in a hemispherical spiked body at supersonic flow

Experimental studies are carried out to investigate the effects of the geometrical parameters of a drag-reducing spike on a hemispherical forebody in a supersonic freestream of $M_\infty=2.0$ at $0^{\circ}$ angle of attack. The spike length $(l/D=0.5,1.0,1.5,2.0)$, spike stem diameter $(d/D=0.06,0.12,0.18)$, and spike tip shapes are varied, and their influence on the time-averaged and time-resolved flow fields is examined. When $l/D$ increases, a significant reduction in drag ($c_d$) is achieved at $l/D=1.5$, whereas the variation in $d/D$ has only a minor effect. The intensity of the shock-related unsteadiness is reduced with an increase in $d/D$ from $0.06$ to $0.18$, whereas changes in $l/D$ have a negligible effect. The effects of spike tip geometry are studied by replacing the sharp spike tip with a hemispherical one having three different base shapes (vertical base, circular base, and elliptical base). The hemispherical spike tip with a vertical base performs better, reducing both $c_d$ and flow unsteadiness. The dominant spatiotemporal mode arising from the shock-related unsteadiness is represented through modal analysis of time-resolved shadowgraph images, and the findings are consistent with the other measurements.
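For readers unfamiliar with the nondimensional quantities in this abstract, the sketch below shows how $c_d$ and the ratios $l/D$ and $d/D$ are conventionally formed. It assumes a calorically perfect gas with $\gamma = 1.4$ and the forebody frontal area $\pi D^2/4$ as the reference area; the abstract states neither choice, and the function names are hypothetical.

```python
import math

def dynamic_pressure(p_inf, mach, gamma=1.4):
    """Freestream dynamic pressure q = (gamma / 2) * p_inf * M^2 for a perfect gas."""
    return 0.5 * gamma * p_inf * mach ** 2

def drag_coefficient(drag_force, p_inf, mach, diameter, gamma=1.4):
    """c_d = F_D / (q * A), with A taken as the forebody frontal area (an assumption)."""
    area = math.pi * diameter ** 2 / 4.0
    return drag_force / (dynamic_pressure(p_inf, mach, gamma) * area)

def spike_geometry(diameter, l_over_d, d_over_d):
    """Convert the ratios l/D and d/D into a spike length and stem diameter."""
    return l_over_d * diameter, d_over_d * diameter
```

For example, `spike_geometry(0.05, 1.5, 0.12)` gives the spike length and stem diameter for a hypothetical 50 mm forebody at $l/D = 1.5$ and $d/D = 0.12$.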

preprint 2020 arXiv

Scale-Invariant Multi-Oriented Text Detection in Wild Scene Images

Automatic detection of scene text in the wild is a challenging problem, particularly due to the difficulties in handling (i) occlusions of varying percentages, (ii) widely different scales and orientations, and (iii) severe degradations in image quality, among others. In this article, we propose a fully convolutional neural network architecture consisting of a novel Feature Representation Block (FRB) capable of efficient abstraction of information. The proposed network has been trained using curriculum learning with respect to difficulties in image samples and gradual pixel-wise blurring. It is capable of detecting text of different scales and orientations affected by blurring from multiple possible sources, non-uniform illumination, and partial occlusions of varying percentages. The text detection performance of the proposed framework on various benchmark databases, including ICDAR 2015, ICDAR 2017 MLT, COCO-Text and MSRA-TD500, significantly improves on the respective state-of-the-art results. Source code of the proposed architecture will be made available on GitHub.
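The idea of a blur-based curriculum can be sketched as follows. This is not the authors' training procedure: the linear ramp from sharp to blurred images, the Gaussian kernel, and the names `blur_sigma`, `gaussian_kernel` and `blur_image` are all assumptions chosen for illustration, since the abstract only mentions "gradual pixel-wise blurring".

```python
import numpy as np

def blur_sigma(epoch, total_epochs, max_sigma=3.0):
    """Hypothetical schedule: blur strength grows linearly from 0 to max_sigma."""
    if total_epochs <= 1:
        return max_sigma
    frac = min(max(epoch, 0), total_epochs - 1) / (total_epochs - 1)
    return max_sigma * frac

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about three standard deviations."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def blur_image(image, sigma):
    """Separable Gaussian blur of a 2-D grayscale image with edge padding."""
    img = np.asarray(image, dtype=float)
    if sigma <= 0:
        return img.copy()
    kernel = gaussian_kernel(sigma)
    pad = len(kernel) // 2
    padded = np.pad(img, pad, mode="edge")
    # Filter along rows first, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)
```

In a training loop, each sample would be passed through `blur_image(sample, blur_sigma(epoch, total_epochs))` before being fed to the detector, so early epochs see clean images and later epochs see progressively harder ones.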

preprint 2020 arXiv

Shock related unsteadiness of axisymmetric spiked bodies in the supersonic flow

Shock-related unsteadiness over axisymmetric spiked body configurations is experimentally investigated at a freestream supersonic Mach number of 2.0 at 0$^\circ$ angle of attack. Three different forebody configurations mounted with a sharp spike tip, ranging from blunt to streamlined (flat-face, hemispherical, and elliptical), are considered. Steady and unsteady pressure measurements, short-exposure high-speed shadowgraphy, shock footprint analysis from $x-t$ plots, and identification of dominant spatiotemporal modes through modal analysis are carried out to explain the unsteady flow physics. The present investigation tools are validated against the well-known events of 'pulsation' associated with the flat-face case. The hemispherical case is characterized by the formation of a separated free shear layer and associated localized shock oscillations. The cycle of charging and ejection of fluid mass from the recirculation zone, confined between the separated shear layer and the spiked body, is identified to drive the flow unsteadiness. Such an event triggers the out-of-phase motion between the separated and reattachment shocks. In the elliptical case, the overall flow field resembles that of the hemispherical case, except with damped unsteadiness. The value of the cone angle ($λ$) associated with the recirculation region is found to be responsible for the fluctuations from the charging and ejection of fluid mass. Thereby it controls the extent of out-of-phase shock motion. In the elliptical case, $λ$ is observed to be smaller and exhibits a reduction in shock unsteadiness. Based on the gathered results and understanding, the reduction in unsteadiness associated with the aerodisk mounted on the hemispherical forebody is explained via the almost complete elimination of the out-of-phase shock motion.
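The modal analysis of time-resolved shadowgraph images mentioned above can be illustrated with snapshot proper orthogonal decomposition (POD). The abstract does not name the specific modal technique, so POD via the singular value decomposition is an assumption here, and the function names and array layout are hypothetical.

```python
import numpy as np

def pod_modes(snapshots, n_modes=1):
    """Snapshot POD of an image sequence via the SVD.

    snapshots: array of shape (n_frames, height, width), one image per time step.
    Returns the leading spatial modes, their temporal coefficients, and the
    fraction of fluctuation energy each mode carries.
    """
    frames = np.asarray(snapshots, dtype=float)
    n_frames = frames.shape[0]
    data = frames.reshape(n_frames, -1)
    # Remove the time-averaged image so the modes describe the unsteadiness only.
    fluct = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(fluct, full_matrices=False)
    energy = s ** 2 / np.sum(s ** 2)
    spatial = vt[:n_modes].reshape((n_modes,) + frames.shape[1:])
    temporal = u[:, :n_modes] * s[:n_modes]
    return spatial, temporal, energy[:n_modes]

def dominant_frequency(signal, sample_rate):
    """Frequency of the largest spectral peak of a temporal coefficient."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]
```

Applying `dominant_frequency` to the first temporal coefficient, with `sample_rate` set to the camera frame rate, gives the frequency of the most energetic oscillation, which can then be compared against the unsteady pressure spectra.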
