Source author record

Purbayan Kar

Purbayan Kar appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision eess.AS eess.IV Sound

Catalog footprint

What is connected

2works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Leveraging Clinically Relevant Biometric Constraints To Supervise A Deep Learning Model For The Accurate Caliper Placement To Obtain Sonographic Measurements Of The Fetal Brain

Multiple studies have demonstrated that obtaining standardized fetal brain biometry from mid-trimester ultrasonography (USG) examination is key for the reliable assessment of fetal neurodevelopment and the screening of central nervous system (CNS) anomalies. Obtaining these measurements is highly subjective, expertise-driven, and requires years of training experience, limiting quality prenatal care for all pregnant mothers. In this study, we propose a deep learning (DL) approach to compute 3 key fetal brain biometry from the 2D USG images of the transcerebellar plane (TC) through the accurate and automated caliper placement (2 per biometry) by modeling it as a landmark detection problem. We leveraged clinically relevant biometric constraints (relationship between caliper points) and domain-relevant data augmentation to improve the accuracy of a U-Net DL model (trained/tested on: 596 images, 473 subjects/143 images, 143 subjects). We performed multiple experiments demonstrating the effect of the DL backbone, data augmentation, generalizability and benchmarked against a recent state-of-the-art approach through extensive clinical validation (DL vs. 7 experienced clinicians). For all cases, the mean errors in the placement of the individual caliper points and the computed biometry were comparable to error rates among clinicians. The clinical translation of the proposed framework can assist novice users from low-resource settings in the reliable and standardized assessment of fetal brain sonograms.

preprint2022arXiv

M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation

Emotion Recognition in Conversations (ERC) is crucial in developing sympathetic human-machine interaction. In conversational videos, emotion can be present in multiple modalities, i.e., audio, video, and transcript. However, due to the inherent characteristics of these modalities, multi-modal ERC has always been considered a challenging undertaking. Existing ERC research focuses mainly on using text information in a discussion, ignoring the other two modalities. We anticipate that emotion recognition accuracy can be improved by employing a multi-modal approach. Thus, in this study, we propose a Multi-modal Fusion Network (M2FNet) that extracts emotion-relevant features from visual, audio, and text modality. It employs a multi-head attention-based fusion mechanism to combine emotion-rich latent representations of the input data. We introduce a new feature extractor to extract latent features from the audio and visual modality. The proposed feature extractor is trained with a novel adaptive margin-based triplet loss function to learn emotion-relevant features from the audio and visual data. In the domain of ERC, the existing methods perform well on one benchmark dataset but not on others. Our results show that the proposed M2FNet architecture outperforms all other methods in terms of weighted average F1 score on well-known MELD and IEMOCAP datasets and sets a new state-of-the-art performance in ERC.

Purbayan Kar

What is connected

Connect this record

See the researcher in context

Building this map preview

2 published item(s)

Leveraging Clinically Relevant Biometric Constraints To Supervise A Deep Learning Model For The Accurate Caliper Placement To Obtain Sonographic Measurements Of The Fetal Brain

M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation