Source author record

P. K. Srijith

P. K. Srijith appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


astro-ph.IM, Machine Learning, physics.data-an, Social and Information Networks, Artificial Intelligence, astro-ph.GA, astro-ph.HE, Computation and Language, hep-ex

Catalog footprint

What is connected

9 works

9 topics

4 close collaborators


preprint, 2026, arXiv

Post-Optimization Adaptive Rank Allocation for LoRA

Exponential growth in the scale of modern foundation models has led to the widespread adoption of Low-Rank Adaptation (LoRA) as a parameter-efficient fine-tuning technique. However, standard LoRA implementations disregard the varying intrinsic dimensionality of model layers and enforce a uniform rank, leading to parameter redundancy. We propose Post-Optimization Adaptive Rank Allocation (PARA), a data-free compression method for LoRA that integrates seamlessly into existing fine-tuning pipelines. PARA leverages Singular Value Decomposition to prune LoRA ranks using a global threshold over singular values across all layers. This results in non-uniform rank allocation based on layer-wise spectral importance. As a post-hoc method, PARA circumvents the training modifications and resulting instabilities that dynamic architectures typically incur. We empirically demonstrate that PARA reduces parameter count by 75-90% while preserving the predictive performance of the original, uncompressed LoRA across multiple vision and language benchmarks. Code will be published upon acceptance.
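The pruning step described in this abstract can be illustrated with a minimal sketch. This is not the authors' code (the abstract says that code is not yet published). It assumes each LoRA update is stored as a pair of factors (B, A), that the threshold is supplied by the caller (the abstract does not say how it is chosen), and that kept singular values are split evenly between the two new factors. The names `prune_lora_ranks` and `make_adapter` are hypothetical.

```python
import numpy as np

def prune_lora_ranks(adapters, threshold):
    """Truncate each LoRA update B @ A to the singular values above one global threshold.

    adapters: dict mapping a layer name to (B, A), with B of shape (d_out, r)
    and A of shape (r, d_in). Returns a dict of refactored (B_new, A_new) pairs.
    """
    pruned = {}
    for name, (B, A) in adapters.items():
        # SVD of the low-rank weight update for this layer.
        U, s, Vt = np.linalg.svd(B @ A, full_matrices=False)
        # The same threshold is used for every layer, so the kept rank varies by layer.
        keep = int(np.sum(s > threshold))
        # Assumption: split each kept singular value evenly between the two factors.
        root = np.sqrt(s[:keep])
        pruned[name] = (U[:, :keep] * root, root[:, None] * Vt[:keep])
    return pruned

def make_adapter(rng, d_out, d_in, singular_values):
    """Hypothetical helper: build (B, A) whose product has the given singular values."""
    r = len(singular_values)
    U, _ = np.linalg.qr(rng.standard_normal((d_out, r)))
    V, _ = np.linalg.qr(rng.standard_normal((d_in, r)))
    return U * np.asarray(singular_values), V.T

# Toy adapters with made-up spectra, both trained at uniform rank 4.
rng = np.random.default_rng(0)
adapters = {
    "layer_a": make_adapter(rng, 16, 12, [5.0, 3.0, 0.1, 0.05]),
    "layer_b": make_adapter(rng, 16, 12, [0.5, 0.2, 0.01, 0.001]),
}
pruned = prune_lora_ranks(adapters, threshold=0.3)
ranks = {name: B.shape[1] for name, (B, A) in pruned.items()}
print(ranks)
```

In this toy setup the layer with the larger singular values keeps more ranks than the other, which is the non-uniform allocation the abstract refers to.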

preprint, 2022, arXiv

Bayesian Neural Hawkes Process for Event Uncertainty Prediction

Event data consisting of the times of occurrence of events arises in several real-world applications. Recent works have introduced neural network based point processes for modeling event times, which were shown to provide state-of-the-art performance in predicting event times. However, neural point process models lack good uncertainty quantification for their predictions. Proper uncertainty quantification in event modeling helps decision making in many practical applications. Therefore, we propose a novel point process model, the Bayesian Neural Hawkes process (BNHP), which leverages the uncertainty modelling capability of Bayesian models and the generalization capability of neural networks to model event occurrence times. We augment the model with spatio-temporal modeling capability so that it can consider uncertainty over the predicted time and location of events. Experiments on simulated and real-world datasets show that BNHP significantly improves prediction performance and uncertainty quantification for modelling events.

preprint, 2022, arXiv

Cosmic Ray Rejection with Attention Augmented Deep Learning

Cosmic Ray (CR) hits are the major contaminants in astronomical imaging and spectroscopic observations involving solid-state detectors. Correctly identifying and masking them is a crucial part of the image processing pipeline, since unmasked hits may otherwise lead to spurious detections. For this purpose, we have developed and tested a novel Deep Learning based framework for the automatic detection of CR hits from astronomical imaging data from two different imagers: Dark Energy Camera (DECam) and Las Cumbres Observatory Global Telescope (LCOGT). We considered two baseline models, namely deepCR and Cosmic-CoNN, which are the current state-of-the-art learning based algorithms and were trained using Hubble Space Telescope (HST) ACS/WFC and LCOGT Network images respectively. We have experimented with the idea of augmenting the baseline models using Attention Gates (AGs) to improve the CR detection performance. We have trained our models on DECam data and demonstrate a consistent marginal improvement by adding AGs in True Positive Rate (TPR) at 0.01% False Positive Rate (FPR) and Precision at 95% TPR over the aforementioned baseline models for the DECam dataset. We demonstrate that the proposed AG augmented models provide significant gain in TPR at 0.01% FPR when tested on previously unseen LCO test data having images from three distinct telescope classes. Furthermore, we demonstrate that the proposed baseline models with and without attention augmentation outperform state-of-the-art models such as Astro-SCRAPPY, Maximask (that is trained natively on DECam data) and pre-trained ground-based Cosmic-CoNN. This study demonstrates that the AG module augmentation enables us to get better deepCR and Cosmic-CoNN models and to improve their generalization capability on unseen data.
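The headline metric in this abstract, TPR at a fixed FPR, can be computed as in the following minimal sketch. It assumes per-pixel scores where higher means more CR-like and a strict score threshold; the function name `tpr_at_fpr` and the toy scores are hypothetical and not taken from the paper.

```python
import math

def tpr_at_fpr(scores, labels, target_fpr):
    """Highest true positive rate achievable with false positive rate <= target_fpr.

    scores: per-pixel CR scores (higher means more CR-like).
    labels: 1 for a CR pixel, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # Number of false positives that the FPR budget allows.
    allowed = math.floor(target_fpr * len(negatives) + 1e-12)
    # A pixel is flagged when its score is strictly above the threshold, so at most
    # `allowed` negatives (the ones scoring higher than this value) are flagged.
    threshold = negatives[allowed] if allowed < len(negatives) else -math.inf
    return sum(s > threshold for s in positives) / len(positives)

# Made-up toy scores: 4 CR pixels followed by 10 clean pixels.
toy_scores = [0.9, 0.8, 0.4, 0.2, 0.85, 0.5, 0.3, 0.1, 0.1, 0.1, 0.05, 0.05, 0.02, 0.01]
toy_labels = [1, 1, 1, 1] + [0] * 10
print(tpr_at_fpr(toy_scores, toy_labels, target_fpr=0.1))
```

At a budget as tight as 0.01% FPR, the threshold is set by only the highest-scoring clean pixels, so the metric needs a large number of negative pixels to be meaningful.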

preprint, 2022, arXiv

Two Dimensional Clustering of Gamma-Ray Bursts using durations and hardness

Gamma-Ray Bursts (GRBs) have been traditionally divided into two categories: "short" and "long" with durations less than and greater than two seconds, respectively. However, there is a lot of literature (with conflicting results) regarding the existence of a third intermediate class. To investigate this issue, we carry out a two-dimensional classification using the GRB hardness and duration, and also incorporating the uncertainties in both the variables, by using an extension of Gaussian Mixture Model called Extreme Deconvolution (XDGMM). We carry out this analysis on datasets from two detectors, viz. BATSE and Fermi-GBM. We consider the duration and hardness features in log-scale for each of these datasets and determine the best-fit parameters using XDGMM. This is followed by information theoretic criterion-based tests (AIC and BIC) to determine the optimum number of classes. For BATSE, we find that both AIC and BIC show preference for two components with close to decisive and decisive significance, respectively. For Fermi-GBM, AIC shows preference for three components with decisive significance, whereas BIC does not find any significant difference between two and three components. Our analysis codes have been made publicly available.
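The model-selection step in this abstract rests on the standard AIC and BIC formulas, sketched below for a Gaussian mixture with full covariance matrices. The parameter count is an assumption about how the mixture is parameterized, and the log-likelihood values are made-up placeholders, not results from the paper. The helper names are hypothetical.

```python
import math

def gmm_num_params(n_components, n_dims):
    """Free parameters of a full-covariance Gaussian mixture."""
    weights = n_components - 1                      # mixing weights sum to one
    means = n_components * n_dims
    covariances = n_components * n_dims * (n_dims + 1) // 2
    return weights + means + covariances

def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_samples):
    return n_params * math.log(n_samples) - 2 * log_likelihood

def preferred_components(log_likelihoods, n_dims, n_samples):
    """Return the component counts that minimize AIC and BIC, in that order."""
    aics, bics = {}, {}
    for k, log_l in log_likelihoods.items():
        p = gmm_num_params(k, n_dims)
        aics[k] = aic(log_l, p)
        bics[k] = bic(log_l, p, n_samples)
    return min(aics, key=aics.get), min(bics, key=bics.get)

# Placeholder maximum log-likelihoods for 2 and 3 components (illustrative only).
best_aic, best_bic = preferred_components({2: -1000.0, 3: -990.0}, n_dims=2, n_samples=2000)
print(best_aic, best_bic)
```

With these placeholder numbers AIC picks three components and BIC picks two, because the BIC penalty per parameter grows with the logarithm of the sample size. This shows how the two criteria can disagree on the same fit.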

preprint, 2021, arXiv

Bi-Directional Recurrent Neural Ordinary Differential Equations for Social Media Text Classification

Classification of posts in social media such as Twitter is difficult due to the noisy and short nature of texts. Sequence classification models based on recurrent neural networks (RNN) are popular for classifying posts that are sequential in nature. RNNs assume the hidden representation dynamics to evolve in a discrete manner and do not consider the exact time of the posting. In this work, we propose to use recurrent neural ordinary differential equations (RNODE) for social media post classification which consider the time of posting and allow the computation of hidden representation to evolve in a time-sensitive continuous manner. In addition, we propose a novel model, Bi-directional RNODE (Bi-RNODE), which can consider the information flow in both the forward and backward directions of posting times to predict the post label. Our experiments demonstrate that RNODE and Bi-RNODE are effective for the problem of stance classification of rumours in social media.
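The idea of letting the hidden state evolve continuously between posts can be sketched as follows. This is one simple ODE-RNN style formulation under stated assumptions, not the paper's architecture: the drift function `tanh(W_h @ h)`, the fixed-step Euler solver, and the names `encode` and `encode_bidirectional` are all illustrative choices.

```python
import numpy as np

def encode(inputs, times, W_h, W_x, n_steps=20):
    """Return one hidden state per post, evolving the state over the gaps between posts."""
    h = np.zeros(W_h.shape[0])
    prev_t = times[0]
    states = []
    for x, t in zip(inputs, times):
        # Continuous evolution over the elapsed time, integrated with Euler steps.
        dt = (t - prev_t) / n_steps
        for _ in range(n_steps):
            h = h + dt * np.tanh(W_h @ h)
        # Discrete update when the post arrives.
        h = np.tanh(W_h @ h + W_x @ x)
        states.append(h)
        prev_t = t
    return np.stack(states)

def encode_bidirectional(inputs, times, fwd_weights, bwd_weights, n_steps=20):
    """Concatenate forward-in-time and backward-in-time states for every post."""
    forward = encode(inputs, times, *fwd_weights, n_steps=n_steps)
    # Read the posts in reverse order, measuring time back from the last post.
    rev_times = [times[-1] - t for t in reversed(times)]
    backward = encode(inputs[::-1], rev_times, *bwd_weights, n_steps=n_steps)
    # Flip the backward states so that row i again corresponds to post i.
    return np.concatenate([forward, backward[::-1]], axis=1)
```

When all posts share the same timestamp, the Euler steps have zero length and `encode` reduces to an ordinary discrete RNN, which is the time-insensitive behaviour the abstract contrasts against.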

preprint, 2021, arXiv

Galaxy Morphology Classification using Neural Ordinary Differential Equations

We introduce a continuous depth version of the Residual Network (ResNet) called Neural ordinary differential equations (NODE) for the purpose of galaxy morphology classification. We carry out a classification of galaxy images from the Galaxy Zoo 2 dataset, consisting of five distinct classes, and obtain an accuracy between 91% and 95%, depending on the image class. We train NODE with different numerical techniques such as adjoint and Adaptive Checkpoint Adjoint (ACA) and compare them against ResNet. While ResNet has certain drawbacks, such as time-consuming architecture selection (e.g. the number of layers) and the requirement of a large training dataset, NODE can overcome these limitations. Our results show that the accuracy of NODE is comparable to that of ResNet while the number of parameters used is about one-third, leading to a smaller memory footprint, which would benefit next-generation surveys.

preprint, 2021, arXiv

Variational Inference as an alternative to MCMC for parameter estimation and model selection

Most applications of Bayesian Inference for parameter estimation and model selection in astrophysics involve the use of Monte Carlo techniques such as Markov Chain Monte Carlo (MCMC) and nested sampling. However, these techniques are time-consuming and their convergence to the posterior can be difficult to determine. In this work, we advocate Variational inference as an alternative to solve the above problems, and demonstrate its usefulness for parameter estimation and model selection in Astrophysics. Variational inference converts the inference problem into an optimization problem by approximating the posterior from a known family of distributions and using Kullback-Leibler divergence to characterize the difference. It takes advantage of fast optimization techniques, which makes it well suited to large datasets and trivial to parallelize on a multicore platform. We also derive a new approximate evidence estimation technique, based on the variational posterior and importance sampling, called posterior weighted importance sampling for the calculation of evidence (PWISE), which is useful to perform Bayesian model selection. As a proof of principle, we apply variational inference to five different problems in astrophysics, where Monte Carlo techniques were previously used. These include assessment of significance of annual modulation in the COSINE-100 dark matter experiment, measuring exoplanet orbital parameters from radial velocity data, tests of periodicities in measurements of Newton's constant G, assessing the significance of a turnover in the spectral lag data of GRB 160625B and estimating the mass of a galaxy cluster using weak gravitational lensing. We find that variational inference is much faster than MCMC and nested sampling techniques for most of these problems while providing competitive results. All our analysis codes have been made publicly available.
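A minimal sketch can show how an approximate posterior helps with evidence estimation. It uses plain importance sampling with the approximate posterior as the proposal, on a toy Gaussian model whose evidence is known in closed form. This is not the paper's PWISE estimator, whose details the abstract does not give. The model (unit-variance Gaussian likelihood, standard normal prior on the mean) and all function names are assumptions made for illustration.

```python
import numpy as np

def log_normal_pdf(x, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def log_evidence_is(y, q_mean, q_var, n_samples=5000, seed=0):
    """Importance-sampling estimate of log p(y), sampling theta from q = N(q_mean, q_var).

    Toy model (an assumption): theta ~ N(0, 1) and y_i | theta ~ N(theta, 1).
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(q_mean, np.sqrt(q_var), size=n_samples)
    log_prior = log_normal_pdf(theta, 0.0, 1.0)
    log_lik = log_normal_pdf(y[None, :], theta[:, None], 1.0).sum(axis=1)
    # Log importance weights: log p(y, theta) - log q(theta).
    log_w = log_prior + log_lik - log_normal_pdf(theta, q_mean, q_var)
    # Log of the mean weight, computed stably.
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))

def log_evidence_exact(y):
    """Closed-form log evidence of the toy model, where y ~ N(0, I + 1 1^T)."""
    n = y.size
    quad = np.sum(y ** 2) - np.sum(y) ** 2 / (n + 1)
    return -0.5 * (n * np.log(2 * np.pi) + np.log(n + 1) + quad)

y = np.array([0.3, -1.2, 0.8, 1.5, 0.1])     # made-up data
post_mean, post_var = y.sum() / (y.size + 1), 1.0 / (y.size + 1)
print(log_evidence_is(y, post_mean, post_var), log_evidence_exact(y))
```

When the proposal equals the exact posterior, every importance weight equals the evidence, so the estimate has no sampling error. The closer the variational posterior is to the true one, the lower the variance of the estimate.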

preprint, 2020, arXiv

HAP-SAP: Semantic Annotation in LBSNs using Latent Spatio-Temporal Hawkes Process

The prevalence of location-based social networks (LBSNs) has eased the understanding of human mobility patterns. Knowledge of human dynamics can aid tasks such as urban planning, traffic congestion management and personalized recommendation. These dynamics are influenced by factors such as social impact, periodicity in mobility, spatial proximity, influence among users and semantic categories, which makes location modelling a critical task. However, categories, which act as a semantic characterization of the location, might be missing for some check-ins, and this can adversely affect modelling the mobility dynamics of users. At the same time, mobility patterns provide a cue on the missing semantic category. In this paper, we simultaneously address the problem of semantic annotation of locations and location adoption dynamics of users. We propose our model HAP-SAP, a latent spatio-temporal multivariate Hawkes process, which considers latent semantic category influences, and temporal and spatial mobility patterns of users. The model parameters and latent semantic categories are inferred using an expectation-maximization algorithm, which uses Gibbs sampling to obtain the posterior distribution over latent semantic categories. The inferred semantic categories can supplement our model in predicting the next check-in events by users. Our experiments on real datasets demonstrate the effectiveness of the proposed model for the semantic annotation and location adoption modelling tasks.
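For readers unfamiliar with Hawkes processes, the self-exciting intensity they are built on can be sketched in a few lines. This is the textbook univariate, temporal-only form with an exponential kernel. The paper's model is multivariate and spatio-temporal with latent categories, and the abstract does not state its kernel, so the kernel choice and the parameter names `mu`, `alpha` and `beta` are assumptions.

```python
import math

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i)).

    mu: base rate; alpha: jump added by each past event; beta: decay rate.
    Only events strictly before t contribute.
    """
    excitation = sum(alpha * math.exp(-beta * (t - t_i)) for t_i in event_times if t_i < t)
    return mu + excitation

# Made-up check-in times: each past check-in temporarily raises the rate of new ones.
check_ins = [0.0, 0.5, 0.6]
print(hawkes_intensity(1.0, check_ins, mu=0.5, alpha=0.8, beta=1.0))
```

Each past event adds a decaying bump to the rate of future events, which is how a Hawkes process captures bursts of check-ins.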

preprint, 2016, arXiv

Gaussian Process Pseudo-Likelihood Models for Sequence Labeling

Several machine learning problems arising in natural language processing can be modeled as a sequence labeling problem. We provide Gaussian process models based on pseudo-likelihood approximation to perform sequence labeling. Gaussian processes (GPs) provide a Bayesian approach to learning in a kernel based framework. The pseudo-likelihood model enables one to capture long range dependencies among the output components of the sequence without becoming computationally intractable. We use an efficient variational Gaussian approximation method to perform inference in the proposed model. We also provide an iterative algorithm which can effectively make use of the information from the neighboring labels to perform prediction. The ability to capture long range dependencies makes the proposed approach useful for a wide range of sequence labeling problems. Numerical experiments on some sequence labeling data sets demonstrate the usefulness of the proposed approach.
