Source author record

Buddhananda Banerjee

Buddhananda Banerjee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Topics: Computation; Methodology; physics.soc-ph; Populations and Evolution; Social and Information Networks

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators

Preprint, 2026, arXiv

MDAS: A Diagnostic Approach to Assess the Quality of Data Splitting in Machine Learning

In the field of machine learning, model performance is usually assessed by randomly splitting data into training and test sets. Different random splits, however, can yield markedly different performance estimates, so a genuinely good model may be discarded or a poor one selected purely due to an unlucky partition. This motivates a principled way to diagnose the quality of a given data split. We propose a diagnostic framework based on a new discrepancy measure, the Mahalanobis Distribution Alignment Score (MDAS). MDAS is a symmetric dissimilarity measure between two multivariate samples, rather than a strict metric. MDAS captures both mean and covariance differences and is affine invariant. Building on this, we construct a Monte Carlo test that evaluates whether an observed split is statistically compatible with typical random splits, yielding an interpretable p-value for split quality. Using several real data sets, we study the relationship between MDAS and model robustness, including its association with the normalized Akaike information criterion. Finally, we apply MDAS to compare existing state-of-the-art deterministic data-splitting strategies with standard random splitting. The experimental results show that MDAS provides a simple, model-agnostic tool for auditing data splits and improving the reliability of empirical model evaluation.
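The Monte Carlo test described in this abstract can be illustrated with a minimal sketch. The abstract does not state the MDAS formula, so the function `stand_in_discrepancy` below is a hypothetical stand-in, not the authors' measure: it adds a pooled-covariance Mahalanobis term for the mean difference to a trace term for the covariance difference, which makes it symmetric and affine invariant like the measure described. The names `split_p_value`, `n_random` and `seed` are likewise assumptions made for illustration, and the data are assumed to have at least two numeric features.

```python
import numpy as np

def stand_in_discrepancy(a, b):
    """Hypothetical stand-in for MDAS; the abstract does not give the formula."""
    mean_diff = a.mean(axis=0) - b.mean(axis=0)
    cov_a = np.cov(a, rowvar=False)
    cov_b = np.cov(b, rowvar=False)
    pooled = 0.5 * (cov_a + cov_b)
    # Mahalanobis-type term for the difference in means.
    mean_term = mean_diff @ np.linalg.solve(pooled, mean_diff)
    # Covariance difference relative to the pooled covariance.
    rel = np.linalg.solve(pooled, cov_a - cov_b)
    cov_term = np.trace(rel @ rel)
    return float(mean_term + cov_term)

def split_p_value(X, test_idx, n_random=500, seed=0):
    """Monte Carlo p-value: how extreme is the observed split among random splits?"""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    mask = np.zeros(n, dtype=bool)
    mask[test_idx] = True
    observed = stand_in_discrepancy(X[~mask], X[mask])
    n_test = int(mask.sum())
    exceed = 0
    for _ in range(n_random):
        perm = rng.permutation(n)
        # Random split with the same test-set size as the observed split.
        d = stand_in_discrepancy(X[perm[n_test:]], X[perm[:n_test]])
        exceed += int(d >= observed)
    # Add-one correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_random)
```

Under these assumptions, a small p-value indicates that the observed train/test partition is more discrepant than most random partitions of the same size. The discrepancy uses only the features, so the check does not depend on any fitted model.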

Preprint, 2020, arXiv

A model for the spread of an epidemic from local to global: A case study of COVID-19 in India

In this paper we propose an epidemiological model for the spread of COVID-19. The dynamics of the spread are based on four fundamental categories of people in a population: tested and infected, non-tested but infected, tested but not infected, and non-tested and not infected. The model is based on two levels of spread dynamics in the population: the local level and the global level. Local-level growth is described with data and parameters that include testing statistics for COVID-19, preventive measures such as a nationwide lockdown, and the migration of people across neighboring locations. In the context of India, the local locations are districts, and migration or traffic flow across districts is defined by the normalized edge weights of the metapopulation network of districts infected with COVID-19. Based on this local growth, state-level predictions are made for the number of people who test positive for COVID-19. Further, treating the states as the local locations, predictions are made at the country level. The values of the model parameters are determined by grid search, minimizing an error function while training the model on real data. The predictions are based on the present testing statistics and on certain linear and log-linear growth of testing at the state and country levels. Finally, it is shown that the spread can be contained if the number of tests is increased linearly or log-linearly by certain factors, along with preventive measures, in the near future. This is also necessary to prevent sharp growth in the count of infected people and to avoid a second wave of the pandemic.
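The parameter-fitting step mentioned in this abstract, a grid search that minimizes an error function against real data, can be sketched as follows. The growth curve in `simulate` is a deliberately simple toy model with two hypothetical parameters (`growth_rate` and `damping`); it is not the four-category, two-level model of the paper, and the squared-error loss is an assumed choice because the abstract does not specify the error function.

```python
from itertools import product

def simulate(initial_cases, growth_rate, damping, days):
    """Toy case curve whose growth weakens over time (hypothetical model)."""
    cases = [float(initial_cases)]
    for day in range(1, days):
        rate = growth_rate / (1.0 + damping * day)
        cases.append(cases[-1] * (1.0 + rate))
    return cases

def squared_error(predicted, observed):
    """Sum of squared differences between predicted and observed counts."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def grid_search(observed, growth_grid, damping_grid):
    """Return (error, growth_rate, damping) for the best grid point."""
    best = None
    for growth_rate, damping in product(growth_grid, damping_grid):
        predicted = simulate(observed[0], growth_rate, damping, len(observed))
        err = squared_error(predicted, observed)
        if best is None or err < best[0]:
            best = (err, growth_rate, damping)
    return best

# Usage: recover known parameters from a synthetic series.
observed = simulate(100, 0.2, 0.05, 30)
best_error, best_growth, best_damping = grid_search(
    observed, [0.1, 0.15, 0.2, 0.25], [0.0, 0.05, 0.1]
)
```

Grid search evaluates every combination, so its cost grows multiplicatively with the number of parameters and grid points; it is practical here only because the toy model has two parameters.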

Preprint, 2015, arXiv

On existence of a change in mean of functional data

Functional data often arise as sequential temporal observations over a continuous state space. A possible change in the structure of a set of functional data may lead to wrong conclusions if it is not taken into account. It is therefore sometimes crucial to establish whether a change point exists in a given sequence of functional data before carrying out any further statistical inference. We develop a new methodology that provides a test for detecting a change in the mean function of the data. To obtain the test statistic, we provide an alternative estimator of the covariance kernel. The proposed estimator is asymptotically unbiased under the null hypothesis and, at the same time, has less bias than the existing estimator. We show that under the null hypothesis the proposed test statistic is asymptotically pivotal. Moreover, we show that under the alternative hypothesis the test is consistent for a large enough sample size. The proposed test is also found to be more powerful than the test procedure available in the literature. Extensive simulation studies show that the proposed test outperforms the existing one by a wide margin in power for moderate sample sizes. The developed methodology performs satisfactorily on the average daily temperature of central England and on monthly global average temperature anomalies.
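As a rough illustration of scanning a sequence of curves for a change in mean, the sketch below uses a generic CUSUM-style statistic. This is a common device for mean change detection and is not necessarily the statistic developed in the paper; in particular, it omits the covariance kernel estimator and the asymptotic calibration that the abstract describes, so it locates a candidate change point without producing a p-value. Curves are assumed to be observed on a common grid, and the function name `cusum_scan` is hypothetical.

```python
def cusum_scan(curves):
    """Return (k, statistic) for the split point k with the largest CUSUM deviation.

    curves: list of n curves, each a list of m values on a common grid.
    A result of k means the first k curves are separated from the rest.
    """
    n = len(curves)
    m = len(curves[0])
    # Pointwise sum of all curves.
    total = [sum(curve[j] for curve in curves) for j in range(m)]
    partial = [0.0] * m
    stats = []
    for k in range(1, n):
        # Pointwise sum of the first k curves.
        partial = [partial[j] + curves[k - 1][j] for j in range(m)]
        # Deviation of the partial sum from its expected share of the total.
        deviation = [partial[j] - (k / n) * total[j] for j in range(m)]
        # Average squared deviation over the grid, scaled by the sample size.
        stats.append(sum(d * d for d in deviation) / (m * n))
    best_k = max(range(1, n), key=lambda k: stats[k - 1])
    return best_k, stats[best_k - 1]

# Usage: 20 flat curves at level 0 followed by 20 flat curves at level 1.
curves = [[0.0] * 5 for _ in range(20)] + [[1.0] * 5 for _ in range(20)]
location, statistic = cusum_scan(curves)
```

Averaging the squared deviation over the grid approximates an integrated squared norm of the CUSUM process, which is why the curves need a shared grid in this sketch.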