Researcher profile

Arindam Banerjee

Arindam Banerjee contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
18works
0followers
11topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

18 published item(s)

preprint2026arXiv

Complexity and speed of semi-algebraic multi-persistence

Let $\mathrm{R}$ be a real closed field, $S \subset \mathrm{R}^n$ a closed and bounded semi-algebraic set, and $\mathbf{f}=(f_1,\ldots,f_p):S \rightarrow \mathrm{R}^p$ a continuous semi-algebraic map inducing a $p$-parameter semi-algebraic filtration by sublevel sets. We introduce a barcode invariant for such filtrations that directly extends the classical ($p=1$) barcode. After scaling of the parameter space, in each homological degree $\ell$ the invariant is encoded by a $\mathbb{Z}_{\ge 0}$-valued function \[ μ_\ell(S,\mathbf{f}):\ \Big(({-}1,1)^p\times(({-}1,1)^p \cup\{(1,\ldots,1)\}) \Big)\ \cap\ \{(\mathbf a,\mathbf b)\mid \mathbf a\preceq \mathbf b\} \ \longrightarrow\ \mathbb{Z}_{\ge 0}, \] where $\preceq$ denotes the product order on $\mathrm{R}^p$. We prove that $μ_\ell(S,\mathbf{f})$ is semi-algebraically constructible and establish a singly exponential upper bound on its description complexity. Moreover, we give a singly exponential-time algorithm to compute $μ_\ell(S,\mathbf{f})$, extending to arbitrary $p$ the corresponding result for $p=1$ by Basu and Karisani. Finally, for semi-algebraic filtrations of bounded description complexity we bound the number of equivalence classes of finite poset modules realizable in this way, yielding a tight analogue of "speed" bounds for algebraically defined graph classes.

preprint2026arXiv

Plan Before You Trade: Inference-Time Optimization for RL Trading Agents

Reinforcement learning agents for portfolio management are typically trained and deployed as static policies, with no mechanism for using price forecasts at inference time. We propose $\text{FPILOT}$ (**Fin**ancial **P**lugin **I**nference-time **L**earning for **O**ptimal **T**rading), a plugin inference-time optimization framework inspired by Model Predictive Control (MPC). Our key structural insight is that future prices mostly do not depend on one agent's portfolio allocation, so a suitable predictive model can produce a multi-step price trajectory without iterative action-conditioned rollouts as in typical reinforcement learning. At each decision step, we use the forecaster's predicted price trajectory to construct an allocation-based imagined return objective, and optimize the policy at inference-time before executing one step of the trade. Our framework is compatible with any pre-trained agent and adapts the policy to the forecaster's predictions without any retraining. Evaluated across five policy learning algorithms on the TradeMaster DJ30 benchmark, $\text{FPILOT}$ produces consistent improvements in total return and return-based risk-adjusted metrics (Sharpe, Sortino, Calmar), with stochastic policies benefiting more than deterministic ones. Further, using synthetic forecasts at calibrated quality levels, we show that gains consistently improve with forecaster quality, suggesting that our performance will improve based on advances in financial forecasting.

preprint2023arXiv

Improved Algorithms for Neural Active Learning

We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting. In particular, we introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work. Then, the proposed algorithm leverages the powerful representation of NNs for both exploitation and exploration, has the query decision-maker tailored for $k$-class classification problems with the performance guarantee, utilizes the full feedback, and updates parameters in a more practical and efficient manner. These careful designs lead to an instance-dependent regret upper bound, roughly improving by a multiplicative factor $O(\log T)$ and removing the curse of input dimensionality. Furthermore, we show that the algorithm can achieve the same performance as the Bayes-optimal classifier in the long run under the hard-margin setting in classification problems. In the end, we use extensive experiments to evaluate the proposed algorithm and SOTA baselines, to show the improved empirical performance.

preprint2022arXiv

Bounds for the regularity of product of edge ideals

Let $I$ and $J$ be edge ideals in a polynomial ring $R = \mathbb{K}[x_1,\ldots,x_n]$ with $I \subseteq J$. In this paper, we obtain a general upper and lower bound for the Castelnuovo-Mumford regularity of $IJ$ in terms of certain invariants associated with $I$ and $J$. Using these results, we explicitly compute the regularity of $IJ$ for several classes of edge ideals. Let $J_1,\ldots,J_d$ be edge ideals in a polynomial ring $R$ with $J_1 \subseteq \cdots \subseteq J_d$. Finally, we compute the precise expression for the regularity of $J_1 J_2\cdots J_d$ when $d \in \{3,4\}$ and $J_d$ is the edge ideal of complete graph.

preprint2022arXiv

EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits

In this paper, we propose a novel neural exploration strategy in contextual bandits, EE-Net, distinct from the standard UCB-based and TS-based approaches. Contextual multi-armed bandits have been studied for decades with various applications. To solve the exploitation-exploration tradeoff in bandits, there are three main techniques: epsilon-greedy, Thompson Sampling (TS), and Upper Confidence Bound (UCB). In recent literature, linear contextual bandits have adopted ridge regression to estimate the reward function and combine it with TS or UCB strategies for exploration. However, this line of works explicitly assumes the reward is based on a linear function of arm vectors, which may not be true in real-world datasets. To overcome this challenge, a series of neural bandit algorithms have been proposed, where a neural network is used to learn the underlying reward function and TS or UCB are adapted for exploration. Instead of calculating a large-deviation based statistical bound for exploration like previous methods, we propose "EE-Net", a novel neural-based exploration strategy. In addition to using a neural network (Exploitation network) to learn the reward function, EE-Net uses another neural network (Exploration network) to adaptively learn potential gains compared to the currently estimated reward for exploration. Then, a decision-maker is constructed to combine the outputs from the Exploitation and Exploration networks. We prove that EE-Net can achieve $\mathcal{O}(\sqrt{T\log T})$ regret and show that EE-Net outperforms existing linear and neural contextual bandit baselines on real-world datasets.

preprint2022arXiv

On the Hilbert-Samuel coefficients of Frobenius powers of an ideal

We provide suitable conditions under which the asymptotic limit of the Hilbert-Samuel coefficients of the Frobenius powers of an $\mathfrak{m}$-primary ideal exists in a Noetherian local ring $(R,\mathfrak{m})$ with prime characteristic $p>0.$ This, in turn, gives an expression of the Hilbert-Kunz multiplicity of powers of the ideal. We also prove that for a face ring $R$ of a simplicial complex and an ideal $J$ generated by pure powers of the variables, the generalized Hilbert-Kunz function $\ell(R/(J^{[q]})^k)$ is a polynomial for all $q,k$ and also give an expression of the generalized Hilbert-Kunz multiplicity of powers of $J$ in terms of Hilbert-Samuel multiplicity of $J.$ We conclude by giving a counter-example to a conjecture proposed by I. Smirnov which connects the stability of an ideal with the asymptotic limit of the first Hilbert coefficient of the Frobenius power of the ideal.

preprint2021arXiv

Experiments with Rich Regime Training for Deep Learning

In spite of advances in understanding lazy training, recent work attributes the practical success of deep learning to the rich regime with complex inductive bias. In this paper, we study rich regime training empirically with benchmark datasets, and find that while most parameters are lazy, there is always a small number of active parameters which change quite a bit during training. We show that re-initializing (resetting to their initial random values) the active parameters leads to worse generalization. Further, we show that most of the active parameters are in the bottom layers, close to the input, especially as the networks become wider. Based on such observations, we study static Layer-Wise Sparse (LWS) SGD, which only updates some subsets of layers. We find that only updating the top and bottom layers have good generalization and, as expected, only updating the top layers yields a fast algorithm. Inspired by this, we investigate probabilistic LWS-SGD, which mostly updates the top layers and occasionally updates the full network. We show that probabilistic LWS-SGD matches the generalization performance of vanilla SGD and the back-propagation time can be 2-5 times more efficient.

preprint2021arXiv

Generalized Hilbert-Kunz function of the Rees algebra of the face ring of a simplicial complex

Let $R$ be the face ring of a simplicial complex of dimension $d-1$ and ${\mathcal R}(\mathfrak{n})$ be the Rees algebra of the maximal homogeneous ideal $\mathfrak{n}$ of $R.$ We show that the generalized Hilbert-Kunz function $HK(s)=\ell({\mathcal R}(\mathfrak n)/(\mathfrak n, \mathfrak n t)^{[s]})$ is given by a polynomial for all large $s.$ We calculate it in many examples and also provide a Macaulay2 code for computing $HK(s).$

preprint2020arXiv

Atwood and Reynolds numbers effects on the evolution of buoyancy-driven homogeneous variable-density turbulence

The evolution of buoyancy-driven homogeneous variable-density turbulence (HVDT) at Atwood numbers up to 0.75 and large Reynolds numbers is studied by using high-resolution Direct Numerical Simulations. To help understand the highly non-equilibrium nature of buoyancy-driven HVDT, the flow evolution is divided into four different regimes based on the behavior of turbulent kinetic energy derivatives. The results show that each regime has a unique type of dependency on both Atwood and Reynolds numbers. It is found that the local statistics of the flow based on the flow composition are more sensitive to Atwood and Reynolds numbers compared to those based on the entire flow. It is also observed that at higher Atwood numbers, different flow features reach their asymptotic Reynolds number behavior at different times. The energy spectrum defined based on the Favre fluctuations momentum has less large scale contamination from viscous effects for variable density flows with constant properties, compared to other forms used previously. The evolution of the energy spectrum highlights distinct dynamical features of the four flow regimes. Thus, the slope of the energy spectrum at intermediate to large scales evolves from -7/3 to -1, as a function of the production to dissipation ratio. The classical Kolmogorov spectrum emerges at intermediate to high scales at the highest Reynolds numbers examined, after the turbulence starts to decay. Finally, the similarities and differences between buoyancy-driven HVDT and the more conventional stationary turbulence are discussed and new strategies and tools for analysis are proposed.

preprint2020arXiv

Private Stochastic Non-Convex Optimization: Adaptive Algorithms and Tighter Generalization Bounds

We study differentially private (DP) algorithms for stochastic non-convex optimization. In this problem, the goal is to minimize the population loss over a $p$-dimensional space given $n$ i.i.d. samples drawn from a distribution. We improve upon the population gradient bound of ${\sqrt{p}}/{\sqrt{n}}$ from prior work and obtain a sharper rate of $\sqrt[4]{p}/\sqrt{n}$. We obtain this rate by providing the first analyses on a collection of private gradient-based methods, including adaptive algorithms DP RMSProp and DP Adam. Our proof technique leverages the connection between differential privacy and adaptive data analysis to bound gradient estimation error at every iterate, which circumvents the worse generalization bound from the standard uniform convergence argument. Finally, we evaluate the proposed algorithms on two popular deep learning tasks and demonstrate the empirical advantages of DP adaptive gradient methods over standard DP SGD.

preprint2020arXiv

Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis

Bandit learning algorithms typically involve the balance of exploration and exploitation. However, in many practical applications, worst-case scenarios needing systematic exploration are seldom encountered. In this work, we consider a smoothed setting for structured linear contextual bandits where the adversarial contexts are perturbed by Gaussian noise and the unknown parameter $θ^*$ has structure, e.g., sparsity, group sparsity, low rank, etc. We propose simple greedy algorithms for both the single- and multi-parameter (i.e., different parameter for each context) settings and provide a unified regret analysis for $θ^*$ with any assumed structure. The regret bounds are expressed in terms of geometric quantities such as Gaussian widths associated with the structure of $θ^*$. We also obtain sharper regret bounds compared to earlier work for the unstructured $θ^*$ setting as a consequence of our improved analysis. We show there is implicit exploration in the smoothed setting where a simple greedy algorithm works.

preprint2020arXiv

Sub-Seasonal Climate Forecasting via Machine Learning: Challenges, Analysis, and Advances

Sub-seasonal climate forecasting (SSF) focuses on predicting key climate variables such as temperature and precipitation in the 2-week to 2-month time scales. Skillful SSF would have immense societal value, in areas such as agricultural productivity, water resource management, transportation and aviation systems, and emergency planning for extreme weather events. However, SSF is considered more challenging than either weather prediction or even seasonal prediction. In this paper, we carefully study a variety of machine learning (ML) approaches for SSF over the US mainland. While atmosphere-land-ocean couplings and the limited amount of good quality data makes it hard to apply black-box ML naively, we show that with carefully constructed feature representations, even linear regression models, e.g., Lasso, can be made to perform well. Among a broad suite of 10 ML approaches considered, gradient boosting performs the best, and deep learning (DL) methods show some promise with careful architecture choices. Overall, suitable ML methods are able to outperform the climatological baseline, i.e., predictions based on the 30-year average at a given location and time. Further, based on studying feature importance, ocean (especially indices based on climatic oscillations such as El Nino) and land (soil moisture) covariates are found to be predictive, whereas atmospheric covariates are not considered helpful.

preprint2020arXiv

Variable-density buoyancy-driven turbulence with asymmetric initial density distribution

The effects of different initial density distributions on the evolution of buoyancy-driven homogeneous variable-density turbulence (HVDT) at low (0.05) and high (0.75) Atwood numbers are studied by using high-resolution direct numerical simulations. HVDT aims to mimic the acceleration-driven Rayleigh-Taylor and shock-driven Richtmyer-Meshkov instabilities and reveals new physics that arise from variable-density effects on the turbulent mixing. Here, the initial amounts of pure light and pure heavy flows are altered primarily to mimic the variable-density turbulence at the different locations of the Rayleigh-Taylor and Richtmyer-Meshkov instabilities' mixing layers where the amounts of the mixing fluids are not equal. It is found that for the low Atwood number cases, the asymmetric initial density distribution has limited effects on both global and local flow evolution for HVDT. However, at high Atwood number, both global flow evolution and the local flow structures are strongly affected by the initial composition ratio. The flow composed of more light fluid reaches higher turbulent levels and the local statistics reach their fully-developed behavior earlier in the time evolution. During the late time decay, where most of the flow is well-mixed, all parameters become independent of the initial composition ratio for both low and high Atwood number cases.

preprint2013arXiv

Bethe-ADMM for Tree Decomposition based Parallel MAP Inference

We consider the problem of maximum a posteriori (MAP) inference in discrete graphical models. We present a parallel MAP inference algorithm called Bethe-ADMM based on two ideas: tree-decomposition of the graph and the alternating direction method of multipliers (ADMM). However, unlike the standard ADMM, we use an inexact ADMM augmented with a Bethe-divergence based proximal function, which makes each subproblem in ADMM easy to solve in parallel using the sum-product algorithm. We rigorously prove global convergence of Bethe-ADMM. The proposed algorithm is extensively evaluated on both synthetic and real datasets to illustrate its effectiveness. Further, the parallel Bethe-ADMM is shown to scale almost linearly with increasing number of cores.

preprint2012arXiv

Gap Filling in the Plant Kingdom---Trait Prediction Using Hierarchical Probabilistic Matrix Factorization

Plant traits are a key to understanding and predicting the adaptation of ecosystems to environmental changes, which motivates the TRY project aiming at constructing a global database for plant traits and becoming a standard resource for the ecological community. Despite its unprecedented coverage, a large percentage of missing data substantially constrains joint trait analysis. Meanwhile, the trait data is characterized by the hierarchical phylogenetic structure of the plant kingdom. While factorization based matrix completion techniques have been widely used to address the missing data problem, traditional matrix factorization methods are unable to leverage the phylogenetic structure. We propose hierarchical probabilistic matrix factorization (HPMF), which effectively uses hierarchical phylogenetic information for trait prediction. We demonstrate HPMF's high accuracy, effectiveness of incorporating hierarchical structure and ability to capture trait correlation through experiments.

preprint2012arXiv

Gaussian Process Topic Models

We introduce Gaussian Process Topic Models (GPTMs), a new family of topic models which can leverage a kernel among documents while extracting correlated topics. GPTMs can be considered a systematic generalization of the Correlated Topic Models (CTMs) using ideas from Gaussian Process (GP) based embedding. Since GPTMs work with both a topic covariance matrix and a document kernel matrix, learning GPTMs involves a novel component-solving a suitable Sylvester equation capturing both topic and document dependencies. The efficacy of GPTMs is demonstrated with experiments evaluating the quality of both topic modeling and embedding.

preprint2012arXiv

Online Alternating Direction Method

Online optimization has emerged as powerful tool in large scale optimization. In this paper, we introduce efficient online algorithms based on the alternating directions method (ADM). We introduce a new proof technique for ADM in the batch setting, which yields the O(1/T) convergence rate of ADM and forms the basis of regret analysis in the online setting. We consider two scenarios in the online setting, based on whether the solution needs to lie in the feasible set or not. In both settings, we establish regret bounds for both the objective function as well as constraint violation for general and strongly convex functions. Preliminary results are presented to illustrate the performance of the proposed algorithms.