Source author record

Sumohana S. Channappayya

Sumohana S. Channappayya appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, eess.IV, Machine Learning, astro-ph.IM, Graphics, Multimedia, physics.data-an

Catalog footprint

What is connected

5 works

7 topics

4 close collaborators

preprint · 2025 · arXiv

Exploring Compositionality in Vision Transformers using Wavelet Representations

While insights into the workings of the transformer model have largely emerged from analysing its behaviour on language tasks, this work investigates the representations learnt by the Vision Transformer (ViT) encoder through the lens of compositionality. We introduce a framework, analogous to prior work on measuring compositionality in representation learning, to test for compositionality in the ViT encoder. Crucial to drawing this analogy is the Discrete Wavelet Transform (DWT), which is a simple yet effective tool for obtaining input-dependent primitives in the vision setting. By examining the ability of composed representations to reproduce original image representations, we empirically test the extent to which compositionality is respected in the representation space. Our findings show that primitives from a one-level DWT decomposition produce encoder representations that approximately compose in latent space, offering a new perspective on how ViTs structure information.
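
The input side of this idea can be illustrated with a minimal sketch. It assumes a Haar wavelet (the abstract does not name the filter) and uses hypothetical function names; it is not the authors' code. It splits an image into four one-level DWT sub-bands, maps each sub-band back to image space on its own to obtain a "primitive", and checks that the primitives sum to the original image. Whether the corresponding ViT encoder representations also compose is the empirical question the paper studies and is not shown here.

```python
# Illustrative sketch only: the Haar filter and all names are assumptions.
BANDS = ("LL", "LH", "HL", "HH")  # sub-band naming conventions vary


def haar_dwt2(img):
    """One-level 2D Haar DWT of an image with even height and width."""
    h, w = len(img), len(img[0])
    out = {k: [[0.0] * (w // 2) for _ in range(h // 2)] for k in BANDS}
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r, s = i // 2, j // 2
            out["LL"][r][s] = (a + b + c + d) / 2  # local average
            out["LH"][r][s] = (a - b + c - d) / 2  # difference across columns
            out["HL"][r][s] = (a + b - c - d) / 2  # difference across rows
            out["HH"][r][s] = (a - b - c + d) / 2  # diagonal difference
    return out


def haar_idwt2(bands):
    """Inverse of haar_dwt2."""
    h2, w2 = len(bands["LL"]), len(bands["LL"][0])
    img = [[0.0] * (2 * w2) for _ in range(2 * h2)]
    for r in range(h2):
        for s in range(w2):
            ll, lh = bands["LL"][r][s], bands["LH"][r][s]
            hl, hh = bands["HL"][r][s], bands["HH"][r][s]
            img[2 * r][2 * s] = (ll + lh + hl + hh) / 2
            img[2 * r][2 * s + 1] = (ll - lh + hl - hh) / 2
            img[2 * r + 1][2 * s] = (ll + lh - hl - hh) / 2
            img[2 * r + 1][2 * s + 1] = (ll - lh - hl + hh) / 2
    return img


def primitives(img):
    """Image-space primitives: invert each sub-band with the others zeroed."""
    bands = haar_dwt2(img)
    zeros = [[0.0] * len(bands["LL"][0]) for _ in bands["LL"]]
    return {
        name: haar_idwt2({k: bands[k] if k == name else zeros for k in BANDS})
        for name in BANDS
    }


def compose(parts):
    """Sum the primitives pixel-wise."""
    names = list(parts)
    h, w = len(parts[names[0]]), len(parts[names[0]][0])
    return [[sum(parts[n][i][j] for n in names) for j in range(w)] for i in range(h)]
```

Because the transform is linear, `compose(primitives(img))` reproduces `img` up to floating-point error, which is what makes the sub-band images a natural set of input-dependent primitives to feed through an encoder.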

preprint · 2022 · arXiv

Cosmic Ray Rejection with Attention Augmented Deep Learning

Cosmic Ray (CR) hits are the major contaminants in astronomical imaging and spectroscopic observations involving solid-state detectors. Correctly identifying and masking them is a crucial part of the image processing pipeline, since unmasked hits may otherwise lead to spurious detections. For this purpose, we have developed and tested a novel deep-learning-based framework for the automatic detection of CR hits in astronomical imaging data from two different imagers: Dark Energy Camera (DECam) and Las Cumbres Observatory Global Telescope (LCOGT). We considered two baseline models, namely deepCR and Cosmic-CoNN, which are the current state-of-the-art learning-based algorithms and were trained using Hubble Space Telescope (HST) ACS/WFC and LCOGT Network images, respectively. We have experimented with the idea of augmenting the baseline models using Attention Gates (AGs) to improve the CR detection performance. We have trained our models on DECam data and demonstrate that adding AGs yields a consistent marginal improvement in True Positive Rate (TPR) at 0.01% False Positive Rate (FPR) and in Precision at 95% TPR over the aforementioned baseline models for the DECam dataset. We demonstrate that the proposed AG-augmented models provide a significant gain in TPR at 0.01% FPR when tested on previously unseen LCO test data containing images from three distinct telescope classes. Furthermore, we demonstrate that the proposed baseline models, with and without attention augmentation, outperform state-of-the-art models such as Astro-SCRAPPY, Maximask (which is trained natively on DECam data) and pre-trained ground-based Cosmic-CoNN. This study demonstrates that the AG module augmentation yields better deepCR and Cosmic-CoNN models and improves their generalization capability on unseen data.
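
The headline metric, TPR at a fixed FPR, can be sketched as follows. This is a minimal illustration with a hypothetical function name and toy data, not the evaluation code used in the paper; it assumes per-pixel scores where higher means "more likely a CR hit" and binary ground-truth labels.

```python
def tpr_at_fpr(scores, labels, target_fpr):
    """Return (TPR, achieved FPR, threshold) at the loosest threshold
    whose false positive rate does not exceed target_fpr.

    A sample is predicted positive when its score is strictly above
    the threshold. All names here are illustrative assumptions.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    # Number of false positives we are allowed to accept.
    k = int(target_fpr * len(neg) + 1e-9)

    # Thresholding just above the (k+1)-th highest negative score admits
    # at most k negatives, even when scores are tied.
    thr = neg[k] if k < len(neg) else float("-inf")

    tp = sum(1 for s in pos if s > thr)
    fp = sum(1 for s in neg if s > thr)
    return tp / len(pos), fp / len(neg), thr
```

With a target of 0.01% FPR the call would be `tpr_at_fpr(scores, labels, 1e-4)`; note that this needs at least 10,000 negative pixels before a single false positive is permitted.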

preprint · 2022 · arXiv

Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark

Lung cancer is one of the deadliest cancers, and its effective diagnosis and treatment depend in part on the accurate delineation of the tumor. Human-centered segmentation, which is currently the most common approach, is subject to inter-observer variability and is also time-consuming, since only experts are capable of providing annotations. Automatic and semi-automatic tumor segmentation methods have recently shown promising results. However, as different researchers have validated their algorithms using various datasets and performance metrics, reliably evaluating these methods is still an open challenge. The goal of the Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark, created through the 2018 IEEE Video and Image Processing (VIP) Cup competition, is to provide a unique dataset and pre-defined metrics, so that different researchers can develop and evaluate their methods in a unified fashion. The 2018 VIP Cup started with a global engagement from 42 countries to access the competition data. At the registration stage, there were 129 members clustered into 28 teams from 10 countries, out of which 9 teams made it to the final stage and 6 teams successfully completed all the required tasks. In a nutshell, all the algorithms proposed during the competition are based on deep learning models combined with a false positive reduction technique. Methods developed by the three finalists show promising results in tumor segmentation; however, more effort should be put into reducing the false positive rate. This competition manuscript presents an overview of the VIP-Cup challenge, along with the proposed algorithms and results.

preprint · 2020 · arXiv

Deep No-reference Tone Mapped Image Quality Assessment

The process of rendering high dynamic range (HDR) images to be viewed on conventional displays is called tone mapping. However, tone mapping introduces distortions in the final image which may lead to visual displeasure. To quantify these distortions, we introduce a novel no-reference quality assessment technique for these tone mapped images. This technique is composed of two stages. In the first stage, we employ a convolutional neural network (CNN) to generate quality aware maps (also known as distortion maps) from tone mapped images by training it with the ground truth distortion maps. In the second stage, we model the normalized image and distortion maps using an Asymmetric Generalized Gaussian Distribution (AGGD). The parameters of the AGGD model are then used to estimate the quality score using support vector regression (SVR). We show that the proposed technique delivers competitive performance relative to the state-of-the-art techniques. The novelty of this work is its ability to visualize various distortions as quality maps (distortion maps), especially in the no-reference setting, and to use these maps as features to estimate the quality score of tone mapped images.
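
The second stage can be illustrated with a minimal sketch of AGGD parameter fitting. The abstract does not say how the parameters are estimated; the code below assumes the moment-matching estimator commonly used in natural-scene-statistics quality models, and `fit_aggd` is a hypothetical name. It returns a shape parameter and left/right scale parameters for a list of zero-mode samples, the kind of values that could then be passed as features to a regressor such as SVR.

```python
import math


def fit_aggd(samples):
    """Moment-matching fit of an Asymmetric Generalized Gaussian Distribution.

    Returns (alpha, sigma_l, sigma_r): the shape parameter and the left and
    right scale parameters. The estimator is an assumption, not necessarily
    the one used in the paper.
    """
    left = [x * x for x in samples if x < 0]
    right = [x * x for x in samples if x > 0]
    if not left or not right:
        raise ValueError("need samples on both sides of zero")

    # Root-mean-square spread on each side of zero.
    sigma_l = math.sqrt(sum(left) / len(left))
    sigma_r = math.sqrt(sum(right) / len(right))
    g = sigma_l / sigma_r

    # Ratio of squared first absolute moment to second moment,
    # corrected for the left/right asymmetry.
    m1 = sum(abs(x) for x in samples) / len(samples)
    m2 = sum(x * x for x in samples) / len(samples)
    r_norm = (m1 * m1 / m2) * (g**3 + 1) * (g + 1) / (g**2 + 1) ** 2

    def rho(a):
        # Theoretical value of the same ratio for shape parameter a.
        return math.gamma(2 / a) ** 2 / (math.gamma(1 / a) * math.gamma(3 / a))

    # Grid search for the shape whose theoretical ratio best matches the data.
    grid = [0.2 + 0.001 * i for i in range(9801)]
    alpha = min(grid, key=lambda a: (rho(a) - r_norm) ** 2)
    return alpha, sigma_l, sigma_r
```

For symmetric Gaussian data the fitted shape parameter should come out close to 2 with roughly equal left and right scales; heavier-tailed data gives a smaller shape value.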

preprint · 2016 · arXiv

Subjective Assessment of H.264 Compressed Stereoscopic Video

The tremendous growth in 3D (stereo) imaging and display technologies has led to stereoscopic content (video and image) becoming increasingly popular. However, both the subjective and the objective evaluation of stereoscopic video content have not kept pace with the rapid growth of the content. Further, the availability of standard stereoscopic video databases is also quite limited. In this work, we attempt to alleviate these shortcomings. We present a stereoscopic video database and its subjective evaluation. We have created a database containing a set of 144 distorted videos. We limit our attention to H.264 compression artifacts. The distorted videos were generated using 6 uncompressed pristine videos of left and right views originally created by Goldmann et al. at EPFL [1]. Further, 19 subjects participated in the subjective assessment task. Based on the subjective study, we have formulated a relation between the 2D and stereoscopic subjective scores as a function of compression rate and depth range. We have also evaluated the performance of popular 2D and 3D image/video quality assessment (I/VQA) algorithms on our database.