Source author record

Saurabh Prasad

Saurabh Prasad appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Computer Vision, Machine Learning, Artificial Intelligence, Cryptography and Security, eess.AS, eess.IV, eess.SP, Human-Computer Interaction, Information Theory, math.IT, Sound

Catalog footprint

What is connected

9 works

11 topics

4 close collaborators

preprint · 2026 · arXiv

BCI-Based Assessment of Ocular Response Time Using Dynamic Time Warping Leveraging an RDWT-Driven Deep Neural Framework

Mild traumatic brain injury (mTBI) is a prevalent condition that remains difficult to diagnose in its early stages. Oculomotor dysfunction is a well-established marker of mTBI, motivating the development of portable tools that capture both eye-movement behavior and underlying neurophysiology. In this work, we present an initial framework that integrates electroencephalogram (EEG) with augmented-reality (AR)-based Vestibular/Ocular Motor Screening (VOMS) tasks to estimate subject-specific ocular response times. Pre-processed EEG signals, obtained through band-pass filtering and average referencing, are analyzed using a Redundant Discrete Wavelet Transform (RDWT)-driven deep neural framework. The RDWT coefficients are subjected to trainable zero-phase convolutional filtering and reconstructed into the time domain via inverse RDWT, followed by channel-wise temporal and spatial filtering using 2D convolution layers and convolutional-LSTM-based decoding. An ablation study demonstrates that wavelet-domain filtering serves as an effective denoising strategy, improving prediction performance. Sliding-window predictions were validated using Pearson correlation (>= 0.5), and Dynamic Time Warping (DTW) was subsequently used to estimate ocular response times. DTW-derived metrics revealed significant inter-subject differences across all VOM tasks, supported by Mann-Whitney U tests. Cross-correlation analysis further revealed task-dependent temporal behaviors: pursuit tasks exhibited reactive tracking, whereas saccades showed anticipatory responses. Overall, the results highlight pursuit tasks as particularly informative for distinguishing timing differences and demonstrate the potential of RDWT-based EEG features combined with DTW metrics for multimodal mTBI assessment.
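The abstract does not say how the DTW alignment is turned into a response time, so the following is only a minimal sketch of one plausible reading: align a stimulus trace with a response trace and take the median index offset along the warping path, restricted to samples where the stimulus is active. The function names, the `active_threshold` parameter, the 100 Hz sample rate, and the toy signals are all assumptions made for illustration; none of them come from the paper.

```python
import math
from statistics import median


def dtw_path(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute-difference cost.

    Returns (total_cost, path), where path is a list of (i, j) index pairs.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end, always stepping to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def estimate_lag_seconds(stimulus, response, sample_rate_hz, active_threshold=0.0):
    """Median (response index - stimulus index) over aligned pairs where
    the stimulus magnitude exceeds active_threshold (hypothetical rule)."""
    _, path = dtw_path(stimulus, response)
    offsets = [j - i for i, j in path if abs(stimulus[i]) > active_threshold]
    if not offsets:
        raise ValueError("stimulus never exceeds active_threshold")
    return median(offsets) / sample_rate_hz


# Toy signals: the response repeats the stimulus bump three samples later.
stimulus = [0] * 5 + [1, 2, 3, 2, 1] + [0] * 10
response = [0] * 8 + [1, 2, 3, 2, 1] + [0] * 7
lag = estimate_lag_seconds(stimulus, response, sample_rate_hz=100)
print(f"estimated lag: {lag:.3f} s")
```

Under this sign convention a positive lag means the response trails the stimulus and a negative lag means it leads, which loosely mirrors the reactive versus anticipatory distinction the abstract draws between pursuit and saccade tasks.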

preprint · 2026 · arXiv

DINO Soars: DINOv3 for Open-Vocabulary Semantic Segmentation of Remote Sensing Imagery

The remote sensing (RS) domain suffers from a lack of densely labeled datasets, which are costly to obtain. Thus, models that can segment RS imagery well without supervised fine-tuning are valuable, but existing solutions fall behind supervised methods. Recently, DINOv3 surpassed SOTA RS foundation models on the GEO-bench segmentation benchmark without pre-training on RS data. Additionally, DINO.txt has enabled open vocabulary semantic segmentation (OVSS) with the DINOv3 backbone. We leverage these developments to form an OVSS model for RS imagery, free of RS-domain fine-tuning. Our model, CAFe-DINO (Cost Aggregation + Feature Upsampling with DINO), exploits the strong OVSS performance of DINOv3 for RS imagery via cost aggregation and training-free upsampling of text-image similarity scores. The robust latent of the DINOv3 backbone eliminates the need for fine-tuning on RS imagery; we instead fine-tune our model on an RS-targeted subset of COCO-Stuff. CAFe-DINO achieves state-of-the-art performance on key RS segmentation datasets, outperforming OVSS methods fine-tuned on RS data. Our code and data are publicly available at https://github.com/rfaulk/DINO_Soars.

preprint · 2022 · arXiv

Attacks as Defenses: Designing Robust Audio CAPTCHAs Using Attacks on Automatic Speech Recognition Systems

Audio CAPTCHAs are supposed to provide a strong defense for online resources; however, advances in speech-to-text mechanisms have rendered these defenses ineffective. Audio CAPTCHAs cannot simply be abandoned, as they are specifically named by the W3C as important enablers of accessibility. Accordingly, demonstrably more robust audio CAPTCHAs are important to the future of a secure and accessible Web. We look to recent literature on attacks on speech-to-text systems for inspiration for the construction of robust, principle-driven audio defenses. We begin by comparing 20 recent attack papers, classifying and measuring their suitability to serve as the basis of new "robust to transcription" but "easy for humans to understand" CAPTCHAs. After showing that none of these attacks alone is sufficient, we propose a new mechanism that is both comparatively intelligible (evaluated through a user study) and hard to automatically transcribe (i.e., $P({\rm transcription}) = 4 \times 10^{-5}$). Finally, we demonstrate that our audio samples have a high probability of being detected as CAPTCHAs when given to speech-to-text systems ($P({\rm evasion}) = 1.77 \times 10^{-4}$). In so doing, we not only demonstrate a CAPTCHA that is approximately four orders of magnitude more difficult to crack, but also show that such systems can be designed based on the insights gained from attack papers using the differences between the ways that humans and computers process audio.

preprint · 2020 · arXiv

Advances in Deep Learning for Hyperspectral Image Analysis--Addressing Challenges Arising in Practical Imaging Scenarios

Deep neural networks have proven to be very effective for computer vision tasks, such as image classification, object detection, and semantic segmentation -- these are primarily applied to color imagery and video. In recent years, there has been an emergence of deep learning algorithms being applied to hyperspectral and multispectral imagery for remote sensing and biomedicine tasks. These multi-channel images come with their own unique set of challenges that must be addressed for effective image analysis. Challenges include limited ground truth (annotation is expensive and extensive labeling is often not feasible) and the high-dimensional nature of the data (each pixel is represented by hundreds of spectral bands), even though a large amount of unlabeled data is typically available, along with the potential to leverage multiple sensors/sources that observe the same scene. In this chapter, we will review recent advances in the community that leverage deep learning for robust hyperspectral image analysis despite these unique challenges -- specifically, we will review unsupervised, semi-supervised and active learning approaches to image analysis, as well as transfer learning approaches for multi-source (e.g. multi-sensor, or multi-temporal) image analysis.

preprint · 2016 · arXiv

An approximate message passing approach for compressive hyperspectral imaging using a simultaneous low-rank and joint-sparsity prior

This paper considers a compressive sensing (CS) approach for hyperspectral data acquisition, which results in a practical compression ratio substantially higher than the state-of-the-art. Applying a simultaneous low-rank and joint-sparse (L&S) model to the hyperspectral data, we propose a novel algorithm for joint reconstruction of hyperspectral data based on loopy belief propagation that enables the exploitation of both structured sparsity and amplitude correlations in the data. Experimental results with real hyperspectral datasets demonstrate that the proposed algorithm outperforms the state-of-the-art CS-based solutions with substantial reductions in reconstruction error.

preprint · 2016 · arXiv

Composite Kernel Local Angular Discriminant Analysis for Multi-Sensor Geospatial Image Analysis

With the emergence of passive and active optical sensors available for geospatial imaging, information fusion across sensors is becoming ever more important. An important aspect of single (or multiple) sensor geospatial image analysis is feature extraction - the process of finding "optimal" lower dimensional subspaces that adequately characterize class-specific information for subsequent analysis tasks, such as classification, change detection and anomaly detection. In recent work, we proposed and developed an angle-based discriminant analysis approach that projected data onto subspaces with maximal "angular" separability in the input (raw) feature space and Reproducing Kernel Hilbert Space (RKHS). We also developed an angular locality preserving variant of this algorithm. In this letter, we advance this work and make it suitable for information fusion - we propose and validate a composite kernel local angular discriminant analysis projection that can operate on an ensemble of feature sources (e.g. from different sources), and project the data onto a unified space through composite kernels where the data are maximally separated in an angular sense. We validate this method with the multi-sensor University of Houston hyperspectral and LiDAR dataset, and demonstrate that the proposed method significantly outperforms other composite kernel approaches to sensor (information) fusion.
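The abstract does not give the form of its composite kernel. A common construction, shown here purely as an illustrative assumption, is a convex combination of one base kernel per feature source. The RBF base kernel, the `weight` and `gamma_*` parameters, and the toy hyperspectral/LiDAR feature values below are all hypothetical, and the sketch covers only the kernel fusion step, not the angular discriminant projection itself.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def composite_kernel(sample_a, sample_b, weight=0.5,
                     gamma_spectral=1.0, gamma_lidar=1.0):
    """Weighted sum of per-source kernels.

    Each sample is a (spectral_features, lidar_features) pair. A convex
    combination of valid kernels is itself a valid kernel.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    k_spectral = rbf_kernel(sample_a[0], sample_b[0], gamma_spectral)
    k_lidar = rbf_kernel(sample_a[1], sample_b[1], gamma_lidar)
    return weight * k_spectral + (1.0 - weight) * k_lidar


# Two pixels with identical spectra but different LiDAR heights (toy values).
pixel_1 = ([0.2, 0.4, 0.1], [3.0])
pixel_2 = ([0.2, 0.4, 0.1], [5.0])

print(composite_kernel(pixel_1, pixel_2, weight=1.0))  # spectral source only
print(composite_kernel(pixel_1, pixel_2, weight=0.5))  # both sources
```

With `weight=1.0` the two pixels look identical because only the spectra are compared; lowering the weight lets the LiDAR source separate them, which is the practical motivation for fusing sources at the kernel level.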

preprint · 2016 · arXiv

Person Re-identification with Hyperspectral Multi-Camera Systems --- A Pilot Study

Person re-identification in a multi-camera environment is an important part of modern surveillance systems. Person re-identification from color images has been the focus of much active research, due to the numerous challenges posed by such analysis tasks, such as variations in illumination, pose and viewpoints. In this paper, we suggest that hyperspectral imagery has the potential to provide unique information that is expected to be beneficial for the re-identification task. Specifically, we assert that by accurately characterizing the unique spectral signature for each person's skin, hyperspectral imagery can provide very useful descriptors (e.g. spectral signatures from skin pixels) for re-identification. Towards this end, we acquired proof-of-concept hyperspectral re-identification data under challenging (practical) conditions from 15 people. Our results indicate that hyperspectral data result in a substantially enhanced re-identification performance compared to color (RGB) images, when using spectral signatures over skin as the feature descriptor.

preprint · 2016 · arXiv

Sparse Representation-Based Classification: Orthogonal Least Squares or Orthogonal Matching Pursuit?

Sparse representation of signals has received significant attention in recent years. Based on these developments, a sparse representation-based classification (SRC) has been proposed for a variety of classification and related tasks, including face recognition. Recently, a class dependent variant of SRC was proposed to overcome the limitations of SRC for remote sensing image classification. Traditionally, greedy pursuit based methods such as orthogonal matching pursuit (OMP) are used for sparse coefficient recovery due to their simplicity as well as low time-complexity. However, orthogonal least squares (OLS) has not yet been widely used in classifiers that exploit the sparse representation properties of data. Since OLS produces lower signal reconstruction error than OMP under similar conditions, we hypothesize that more accurate signal estimation will further improve the classification performance of classifiers that exploit the sparsity of data. In this paper, we present a classification method based on OLS, which implements OLS in a classwise manner to perform the classification. We also develop and present its kernelized variant to handle nonlinearly separable data. Based on two real-world benchmarking hyperspectral datasets, we demonstrate that class dependent OLS based methods outperform several baseline methods including traditional SRC and the support vector machine classifier.
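To make the OMP versus OLS distinction concrete, here is a minimal sketch of classwise residual-based classification with both selection rules. It assumes the usual textbook definitions: OMP picks the atom most correlated with the current residual, while OLS picks the atom that most reduces the residual once it has been orthogonalized against the atoms already chosen. The function names, the tiny three-dimensional dictionaries, and the test signal are hypothetical, and the kernelized variant mentioned in the abstract is not covered.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def greedy_residual_norm(atoms, signal, sparsity, rule="ols"):
    """Residual norm after up to `sparsity` greedy atom selections.

    rule="omp": score = |<residual, atom>| / ||atom||
    rule="ols": score = |<residual, atom>| / ||atom orthogonalized
                against the already selected atoms||
    """
    if rule not in ("omp", "ols"):
        raise ValueError("rule must be 'omp' or 'ols'")
    residual = list(signal)
    basis = []      # orthonormal basis of the selected atoms' span
    chosen = set()
    for _ in range(min(sparsity, len(atoms))):
        best = None  # (score, atom index, orthogonalized unit vector)
        for k, atom in enumerate(atoms):
            if k in chosen:
                continue
            perp = list(atom)
            for q in basis:  # Gram-Schmidt against selected atoms
                c = dot(q, perp)
                perp = [p - c * qi for p, qi in zip(perp, q)]
            perp_norm = math.sqrt(dot(perp, perp))
            if perp_norm < 1e-10:
                continue  # atom already lies in the selected span
            corr = abs(dot(residual, atom))
            scale = perp_norm if rule == "ols" else math.sqrt(dot(atom, atom))
            score = corr / scale
            if best is None or score > best[0]:
                best = (score, k, [p / perp_norm for p in perp])
        if best is None:
            break
        _, k, q = best
        chosen.add(k)
        basis.append(q)
        c = dot(residual, q)  # project the residual off the new direction
        residual = [r - c * qi for r, qi in zip(residual, q)]
    return math.sqrt(dot(residual, residual))


def classify(class_dictionaries, signal, sparsity, rule="ols"):
    """Assign the label whose class dictionary leaves the smallest residual."""
    residuals = {
        label: greedy_residual_norm(atoms, signal, sparsity, rule)
        for label, atoms in class_dictionaries.items()
    }
    return min(residuals, key=residuals.get), residuals


s = 1.0 / math.sqrt(2.0)
dictionaries = {
    "class_0": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "class_1": [[0.0, 0.0, 1.0], [0.0, s, s]],
}
test_signal = [2.0, 1.0, 0.0]
label_ols, residuals_ols = classify(dictionaries, test_signal, sparsity=2, rule="ols")
label_omp, residuals_omp = classify(dictionaries, test_signal, sparsity=2, rule="omp")
print(label_ols, label_omp)
```

In this toy case both rules agree because the atoms within each class are nearly orthogonal. The two rules can select different atoms when a dictionary contains highly correlated atoms, which is the regime where the abstract expects OLS to give lower reconstruction error.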

preprint · 2016 · arXiv

Spatial Context based Angular Information Preserving Projection for Hyperspectral Image Classification

Dimensionality reduction is a crucial preprocessing step for hyperspectral data analysis - finding an appropriate subspace is often required for subsequent image classification. In recent work, we proposed supervised angular information based dimensionality reduction methods to find effective subspaces. Since unlabeled data are often more readily available compared to labeled data, we propose an unsupervised projection that finds a lower dimensional subspace where local angular information is preserved. To exploit spatial information from the hyperspectral images, we further extend our unsupervised projection to incorporate spatial contextual information around each pixel in the image. We also propose a sparse representation based classifier which is optimized to exploit spatial information during classification - we hence assert that our proposed projection is particularly suitable for classifiers where local similarity and spatial context are both important. Experimental results with two real-world hyperspectral datasets demonstrate that our proposed methods provide a robust classification performance.
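As a small illustration of why angular information is attractive for spectra, the sketch below computes the angle between two pixel spectra. It assumes that "angular information" refers to the angle between feature vectors, which the abstract implies but does not define; the function name and the toy four-band spectra are hypothetical, and the projection itself is not implemented here.

```python
import math


def spectral_angle(x, y):
    """Angle in radians between two spectra treated as vectors."""
    inner = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("spectral angle is undefined for a zero vector")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, inner / (norm_x * norm_y)))
    return math.acos(cosine)


pixel = [0.10, 0.30, 0.50, 0.20]
brighter_pixel = [2.0 * v for v in pixel]       # same material, stronger illumination
different_pixel = [0.50, 0.30, 0.10, 0.20]      # different spectral shape

print(spectral_angle(pixel, brighter_pixel))    # close to zero
print(spectral_angle(pixel, different_pixel))   # clearly larger
```

Scaling a spectrum leaves the angle essentially unchanged while its Euclidean distance to the original grows, so an angle-based neighborhood is less sensitive to brightness changes than a distance-based one.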
