Source author record

Ashit Baran Aich

Ashit Baran Aich appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning

Catalog footprint

What is connected

3works

1topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Bag of Coins: A Statistical Probe into Neural Confidence Structures

Modern neural networks often produce miscalibrated confidence scores and struggle to detect out-of-distribution (OOD) inputs, while most existing methods post-process outputs without testing internal consistency. We introduce the Bag-of-Coins (BoC) probe, a non-parametric diagnostic of logit coherence that compares softmax confidence $\hat p$ to an aggregate of pairwise Luce-style dominance probabilities $\bar q$, yielding a deterministic coherence score and a p-value-based structural score. Across ViT, ResNet, and RoBERTa with ID/OOD test sets, the coherence gap $Δ=\bar q-\hat p$ reveals clear ID/OOD separation for ViT (ID ${\sim}0.1$-$0.2$, OOD ${\sim}0.5$-$0.6$) but substantial overlap for ResNet and RoBERTa (both ${\sim}0$), indicating architecture-dependent uncertainty geometry. As a practical method, BoC improves calibration only when the base model is poorly calibrated (ViT: ECE $0.024$ vs.\ $0.180$) and underperforms standard calibrators (ECE ${\sim}0.005$), while for OOD detection it fails across architectures (AUROC $0.020$-$0.253$) compared to standard scores ($0.75$-$0.99$). We position BoC as a research diagnostic for interrogating how architectures encode uncertainty in logit geometry rather than a production calibration or OOD detection method.

preprint2026arXiv

Copula-Stein Discrepancy: A Generator-Based Stein Operator for Archimedean Dependence

Kernel Stein discrepancies (KSDs) are widely used for goodness-of-fit testing, but standard KSDs can be insensitive to higher-order dependence features such as tail dependence. We introduce the Copula-Stein Discrepancy (CSD), which defines a Stein operator directly on the copula density to target dependence geometry rather than the joint score. For Archimedean copulas, CSD admits a closed-form Stein kernel derived from the scalar generator. We prove that CSD metrizes weak convergence of copula distributions, admits an empirical estimator with minimax-optimal rate $O_P(n^{-1/2})$, and is sensitive to differences in tail dependence coefficients. We further extend the framework beyond Archimedean families to general copulas, including elliptical and vine constructions. Computationally, exact CSD kernel evaluation is linear in dimension, and a random-feature approximation reduces the quadratic $O(n^2)$ sample scaling to near-linear $\tilde{O}(n)$; experiments show near-nominal Type~I error, increasing power, and rapid concentration of the approximation toward the exact $\widehat{\mathrm{CSD}}_n^2$ as the number of features grows.

preprint2026arXiv

From Sublinear to Linear: Fast Convergence in Deep Networks via Locally Polyak-Lojasiewicz Regions

Gradient descent (GD) on deep neural network loss landscapes is non-convex, yet often converges far faster in practice than classical guarantees suggest. Prior work shows that within locally quasi-convex regions (LQCRs), GD converges to stationary points at sublinear rates, leaving the commonly observed near-exponential training dynamics unexplained. We show that, under a mild local Neural Tangent Kernel (NTK) stability assumption, the loss satisfies a PL-type error bound within these regions, yielding a Locally Polyak-Lojasiewicz Region (LPLR) in which the squared gradient norm controls the suboptimality gap. For properly initialized finite-width networks, we show that under local NTK stability this PL-type mechanism holds around initialization and establish linear convergence of GD as long as the iterates remain within the resulting LPLR. Empirically, we observe PL-like scaling and linear-rate loss decay in controlled full-batch training and in a ResNet-style CNN trained with mini-batch SGD on a CIFAR-10 subset, indicating that LPLR signatures can persist under modern architectures and stochastic optimization. Overall, the results connect local geometric structure, local NTK stability, and fast optimization rates in a finite-width setting.

Ashit Baran Aich

What is connected

Connect this record

See the researcher in context

Building this map preview

3 published item(s)

Bag of Coins: A Statistical Probe into Neural Confidence Structures

Copula-Stein Discrepancy: A Generator-Based Stein Operator for Archimedean Dependence

From Sublinear to Linear: Fast Convergence in Deep Networks via Locally Polyak-Lojasiewicz Regions