Researcher profile

Mohit Prabhushankar

Mohit Prabhushankar contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Trust 21 - Emerging
Verification L1
Unclaimed author
10 works
0 followers
6 topics
4 close collaborators

Published work

10 published items

preprint, 2026, arXiv

RADMI: Latent Information Aggregation as a Proxy for Model Uncertainty

Epistemic uncertainty estimation is essential for identifying regions where deep learning system outputs may be unreliable. However, existing approaches require computationally expensive ensemble methods or multiple stochastic forward passes, limiting their scalability to dense prediction tasks like segmentation. We propose Resolution-Aggregated Decoder Mutual Information (RADMI), a single-pass method that estimates prediction uncertainty by measuring mutual information (MI) between consecutive decoder layers in segmentation networks. We observe that elevated inter-layer MI correlates with prediction uncertainty, as the network must integrate conflicting contextual information at ambiguous regions such as class boundaries. Evaluating on a seismic facies segmentation benchmark, RADMI achieves the highest correlation with deep ensemble uncertainty among all single-pass methods, outperforming the next-best baselines by 5.5% in Pearson and 10.7% in Spearman correlation coefficients. Compared to baselines that either lack spatial precision or demand significant computational overhead, RADMI yields sharp, boundary-localized uncertainty maps without architectural modifications. Our results suggest that linear aggregation of normalized information flow provides a principled and efficient proxy for prediction uncertainty in encoder-decoder architectures.

preprint, 2023, arXiv

Forgetful Active Learning with Switch Events: Efficient Sampling for Out-of-Distribution Data

This paper considers deep out-of-distribution active learning. In practice, fully trained neural networks interact randomly with out-of-distribution (OOD) inputs and map aberrant samples randomly within the model representation space. Since data representations are direct manifestations of the training distribution, the data selection process plays a crucial role in outlier robustness. For paradigms such as active learning, this is especially challenging since protocols must not only improve performance on the training distribution most effectively but also render a robust representation space. However, existing strategies base the data selection directly on the data representation of the unlabeled data, which is random for OOD samples by definition. For this purpose, we introduce forgetful active learning with switch events (FALSE), a novel active learning protocol for out-of-distribution active learning. Instead of defining sample importance on the data representation directly, we formulate "informativeness" with learning difficulty during training. Specifically, we approximate how often the network "forgets" unlabeled samples and query the most "forgotten" samples for annotation. We report up to 4.5% accuracy improvements in over 270 experiments, including four commonly used protocols, two OOD benchmarks, one in-distribution benchmark, and three different architectures.
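
The querying idea in this abstract can be illustrated with a minimal sketch. It assumes that a "switch event" is a change in a sample's predicted label between consecutive training epochs, and that the most "forgotten" samples are those with the most such changes; both are readings of the abstract, not the authors' implementation. The names `count_switch_events`, `query_most_forgotten`, and `histories` are hypothetical.

```python
def count_switch_events(pred_history):
    """Count how often the predicted label changes between consecutive epochs."""
    return sum(1 for prev, cur in zip(pred_history, pred_history[1:]) if prev != cur)


def query_most_forgotten(histories, budget):
    """Return the ids of the `budget` unlabeled samples with the most switch events."""
    scores = {sid: count_switch_events(h) for sid, h in histories.items()}
    # Sort by descending switch count; ties are broken by sample id for determinism.
    ranked = sorted(scores, key=lambda sid: (-scores[sid], sid))
    return ranked[:budget]


# Hypothetical per-epoch predictions for three unlabeled samples.
histories = {
    "a": [0, 0, 0, 0],  # prediction never changes
    "b": [1, 2, 1, 2],  # prediction changes at every epoch
    "c": [0, 1, 1, 1],  # prediction changes once
}
selected = query_most_forgotten(histories, budget=2)
```

Because the score depends only on how predictions change during training, it does not rely on the position of a sample in the representation space, which the abstract describes as unreliable for OOD inputs.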

preprint, 2022, arXiv

DECAL: DEployable Clinical Active Learning

Conventional machine learning systems that operate on natural images assume the presence of attributes within the images that lead to some decision. However, decisions in the medical domain result from attributes within medical diagnostic scans and electronic medical records (EMR). Hence, active learning techniques that are developed for natural images are insufficient for handling medical data. We focus on reducing this insufficiency by designing a deployable clinical active learning (DECAL) framework within a bi-modal interface so as to add practicality to the paradigm. Our approach is a "plug-in" method that makes natural-image-based active learning algorithms generalize better and faster. We find that on two medical datasets, three architectures, and five learning strategies, DECAL increases generalization across 20 rounds by approximately 4.81%. DECAL leads to a 5.59% and 7.02% increase in average accuracy as an initialization strategy for optical coherence tomography (OCT) and X-Ray respectively. Our active learning results were achieved using 3000 (5%) and 2000 (38%) samples of OCT and X-Ray data respectively.

preprint, 2022, arXiv

Explanatory Paradigms in Neural Networks

In this article, we present a leap-forward expansion to the study of explainability in neural networks by considering explanations as answers to abstract reasoning-based questions. With P as the prediction from a neural network, these questions are 'Why P?', 'What if not P?', and 'Why P, rather than Q?' for a given contrast prediction Q. The answers to these questions are observed correlations, observed counterfactuals, and observed contrastive explanations respectively. Together, these explanations constitute the abductive reasoning scheme. We term the three explanatory schemes observed explanatory paradigms. The term observed refers to the specific case of post-hoc explainability, when an explanatory technique explains the decision P after a trained neural network has made the decision P. The primary advantage of viewing explanations through the lens of abductive reasoning-based questions is that explanations can be used as reasons while making decisions. The post-hoc field of explainability, which previously only justified decisions, becomes active by being involved in the decision making process and providing limited, but relevant and contextual interventions. The contributions of this article are: (i) realizing explanations as reasoning paradigms, (ii) providing a probabilistic definition of observed explanations and their completeness, (iii) creating a taxonomy for evaluation of explanations, (iv) positioning gradient-based complete explainability's replicability and reproducibility across multiple applications and data modalities, and (v) code repositories, publicly available at https://github.com/olivesgatech/Explanatory-Paradigms.

preprint, 2022, arXiv

Gradient-Based Adversarial and Out-of-Distribution Detection

We propose to utilize gradients for detecting adversarial and out-of-distribution samples. We introduce confounding labels -- labels that differ from normal labels seen during training -- in gradient generation to probe the effective expressivity of neural networks. Gradients depict the amount of change required for a model to properly represent given inputs, providing insight into the representational power of the model established by network architectural properties as well as training data. By introducing a label of different design, we remove the dependency on ground truth labels for gradient generation during inference. We show that our gradient-based approach allows for capturing the anomaly in inputs based on the effective expressivity of the models with no hyperparameter tuning or additional processing, and outperforms state-of-the-art methods for adversarial and out-of-distribution detection.
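
As a rough illustration of gradient generation with a confounding label, the sketch below scores an input by the L2 norm of the loss gradient with respect to the weights of a single linear layer. Several choices are assumptions made only for illustration and are not taken from the abstract: the all-ones confounding label, the per-class binary cross-entropy loss, the tiny linear model, and the names `gradient_norm_score`, `weights`, and `confounding`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient_norm_score(weights, x, confounding_label):
    """L2 norm of the binary cross-entropy gradient w.r.t. a linear layer's weights."""
    sq_sum = 0.0
    for w_row, y_k in zip(weights, confounding_label):
        z_k = sum(w * x_j for w, x_j in zip(w_row, x))  # logit for class k
        err = sigmoid(z_k) - y_k                        # dLoss/dz_k for sigmoid + BCE
        # dLoss/dW[k][j] = err * x_j, accumulated as a squared norm
        sq_sum += sum((err * x_j) ** 2 for x_j in x)
    return math.sqrt(sq_sum)


# Hypothetical two-class linear layer and an all-ones confounding label,
# which differs from any one-hot label seen during training.
weights = [[2.0, 0.0], [0.0, 2.0]]
confounding = [1.0, 1.0]
score_a = gradient_norm_score(weights, [1.0, 0.0], confounding)
score_b = gradient_norm_score(weights, [3.0, 3.0], confounding)
```

No ground-truth label is needed to compute the score, which matches the abstract's point about removing the dependency on labels at inference. How such scores are turned into a detector (for example, by thresholding or by training a separate classifier on them) is not specified here.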

preprint, 2022, arXiv

Volumetric Supervised Contrastive Learning for Seismic Semantic Segmentation

In seismic interpretation, pixel-level labels of various rock structures can be time-consuming and expensive to obtain. As a result, there often exists a non-trivial quantity of unlabeled data that is left unused simply because traditional deep learning methods rely on access to fully labeled volumes. To rectify this problem, contrastive learning approaches have been proposed that use a self-supervised methodology to learn useful representations from unlabeled data. However, traditional contrastive learning approaches are based on assumptions from the domain of natural images that do not make use of seismic context. To incorporate this context within contrastive learning, we propose a novel positive pair selection strategy based on the position of slices within a seismic volume. We show that the learnt representations from our method outperform a state-of-the-art contrastive learning methodology in a semantic segmentation task.
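
A position-based positive pair selection could look like the following sketch. The abstract states only that pairs are chosen by slice position within the volume; the specific rule used here (two slices form a positive pair when their indices differ by at most `max_distance`) and the name `positional_positive_pairs` are assumptions for illustration.

```python
def positional_positive_pairs(num_slices, max_distance):
    """Pair each slice index with later slices at most `max_distance` positions away."""
    pairs = []
    for i in range(num_slices):
        last = min(i + max_distance, num_slices - 1)
        for j in range(i + 1, last + 1):
            pairs.append((i, j))
    return pairs


# Hypothetical volume with five slices; neighbours up to two slices apart are positives.
pairs = positional_positive_pairs(num_slices=5, max_distance=2)
```

The intuition behind such a rule is that nearby slices of a seismic volume tend to show similar structures, so they can serve as positives without the image augmentations commonly used for natural images.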

preprint, 2020, arXiv

Backpropagated Gradient Representations for Anomaly Detection

Learning representations that clearly distinguish between normal and abnormal data is key to the success of anomaly detection. Most existing anomaly detection algorithms use activation representations from forward propagation while not exploiting gradients from backpropagation to characterize data. Gradients capture model updates required to represent data. Anomalies require more drastic model updates to fully represent them compared to normal data. Hence, we propose the utilization of backpropagated gradients as representations to characterize model behavior on anomalies and, consequently, detect such anomalies. We show that the proposed method using gradient-based representations achieves state-of-the-art anomaly detection performance on benchmark image recognition datasets. Also, we highlight the computational efficiency and the simplicity of the proposed method in comparison with other state-of-the-art methods relying on adversarial networks or autoregressive models, which require at least 27 times more model parameters than the proposed method.

preprint, 2020, arXiv

Contrastive Explanations in Neural Networks

Visual explanations are logical arguments based on visual features that justify the predictions made by neural networks. Current modes of visual explanations answer questions of the form 'Why P?'. These 'Why' questions operate under broad contexts, thereby providing answers that are irrelevant in some cases. We propose to constrain these 'Why' questions based on some context Q so that our explanations answer contrastive questions of the form 'Why P, rather than Q?'. In this paper, we formalize the structure of contrastive visual explanations for neural networks. We define contrast based on neural networks and propose a methodology to extract defined contrasts. We then use the extracted contrasts as a plug-in on top of existing 'Why P?' techniques, specifically Grad-CAM. We demonstrate their value in analyzing both networks and data in applications of large-scale recognition, fine-grained recognition, subsurface seismic analysis, and image quality assessment.

preprint, 2020, arXiv

Implicit Saliency in Deep Neural Networks

In this paper, we show that existing recognition and localization deep architectures that have not been exposed to eye tracking data or any saliency datasets are capable of predicting human visual saliency. We term this implicit saliency in deep neural networks. We calculate this implicit saliency using the expectancy-mismatch hypothesis in an unsupervised fashion. Our experiments show that extracting saliency in this fashion provides comparable performance when measured against state-of-the-art supervised algorithms. Additionally, the robustness of our approach exceeds that of those algorithms when we add large noise to the input images. Also, we show that semantic features contribute more than low-level features to human visual saliency detection.

preprint, 2020, arXiv

Novelty Detection Through Model-Based Characterization of Neural Networks

In this paper, we propose a model-based characterization of neural networks to detect novel input types and conditions. Novelty detection is crucial to identify abnormal inputs that can significantly degrade the performance of machine learning algorithms. The majority of existing studies have focused on activation-based representations to detect abnormal inputs, which limits the characterization of abnormality to a data perspective. However, a model perspective can also be informative in terms of the novelties and abnormalities. To articulate the significance of the model perspective in novelty detection, we utilize backpropagated gradients. We conduct a comprehensive analysis to compare the representation capability of gradients with that of activations and show that the gradients outperform the activations in novel class and condition detection. We validate our approach using four image recognition datasets: MNIST, Fashion-MNIST, CIFAR-10, and CURE-TSR. We achieve a significant improvement on all four datasets with an average AUROC of 0.953, 0.918, 0.582, and 0.746, respectively.