Researcher profile

Sayan Mukherjee

Sayan Mukherjee contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
14works
0followers
18topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

14 published item(s)

preprint2026arXiv

At the Intersection of Deep Sequential Model Framework and State-space Model Framework: Study on Option Pricing

Inference and forecast problems of the nonlinear dynamical system have arisen in a variety of contexts. Reservoir computing and deep sequential models, on the one hand, have demonstrated efficient, robust, and superior performance in modeling simple and chaotic dynamical systems. However, their innate deterministic feature has partially detracted their robustness to noisy system, and their inability to offer uncertainty measurement has also been an insufficiency of the framework. On the other hand, the traditional state-space model framework is robust to noise. It also carries measured uncertainty, forming a just-right complement to the reservoir computing and deep sequential model framework. We propose the unscented reservoir smoother, a model that unifies both deep sequential and state-space models to achieve both frameworks' superiorities. Evaluated in the option pricing setting on top of noisy datasets, URS strikes highly competitive forecasting accuracy, especially those of longer-term, and uncertainty measurement. Further extensions and implications on URS are also discussed to generalize a full integration of both frameworks.

preprint2026arXiv

Synthesizing POMDP Policies: Sampling Meets Model-checking via Learning

Partially Observable Markov Decision Processes (POMDPs) are the standard framework for decision-making under uncertainty. While sampling-based methods scale well, they lack formal correctness guarantees, making them unsuitable for safety-critical applications. Conversely, formal synthesis techniques provide correctness-by-construction but often struggle with scalability, as general POMDP synthesis is undecidable. To bridge this gap, we propose a synthesis framework that integrates sampling, automata learning, and model-checking. Inspired by Angluin's $L^*$ algorithm, our approach utilizes sampling as a membership oracle and model-checking as an equivalence oracle. This enables the synthesis of finite-state controllers with formal guarantees, provided the sampling-induced policy is regular. We establish a relative completeness result for this framework. Experimental results from our prototypical implementation demonstrate that this method successfully solves threshold-safety problems that remain challenging for existing formal synthesis tools. We believe our algorithm serves as a valuable component in a portfolio approach to tackling the inherent difficulty of POMDP synthesis problems.

preprint2023arXiv

Extended probabilities and their application to statistical inference

We propose a new, more general definition of extended probability measures. We study their properties and provide a behavioral interpretation. We put them to use in an inference procedure, whose environment is canonically represented by the probability space $(Ω,\mathcal{F},P)$, when both $P$ and the composition of $Ω$ are unknown. We develop an ex ante analysis -- taking place before the statistical analysis requiring knowledge of $Ω$ -- in which the true composition of $Ω$ is progressively learned. We describe how to update extended probabilities in this setting, and introduce the concept of lower extended probabilities. We apply our findings to a species sampling problem and to the study of the boomerang effect (the empirical observation that sometimes persuasion yields the opposite effect: the persuaded agent moves their opinion away from the opinion of the persuading agent).

preprint2022arXiv

A Grover search-based algorithm for the list coloring problem

Graph coloring is a computationally difficult problem, and currently the best known classical algorithm for $k$-coloring of graphs on $n$ vertices has runtimes $Ω(2^n)$ for $k\ge 5$. The list coloring problem asks the following more general question: given a list of available colors for each vertex in a graph, does it admit a proper coloring? We propose a quantum algorithm based on Grover search to quadratically speed up exhaustive search. Our algorithm loses in complexity to classical ones in specific restricted cases, but improves exhaustive search for cases where the lists and graphs considered are arbitrary in nature.

preprint2022arXiv

Multiple testing with persistent homology

In this paper we propose a computationally efficient multiple hypothesis testing procedure for persistent homology. The computational efficiency of our procedure is based on the observation that one can empirically simulate a null distribution that is universal across many hypothesis testing applications involving persistence homology. Our observation suggests that one can simulate the null distribution efficiently based on a small number of summaries of the collected data and use this null in the same way that p-value tables were used in classical statistics. To illustrate the efficiency and utility of the null distribution we provide procedures for rejecting acyclicity with both control of the Family-Wise Error Rate (FWER) and the False Discovery Rate (FDR). We will argue that the empirical null we propose is very general conditional on a few summaries of the data based on simulations and limit theorems for persistent homology for point processes.

preprint2022arXiv

Tight query complexity bounds for learning graph partitions

Given a partition of a graph into connected components, the membership oracle asserts whether any two vertices of the graph lie in the same component or not. We prove that for $n\ge k\ge 2$, learning the components of an $n$-vertex hidden graph with $k$ components requires at least $(k-1)n-\binom k2$ membership queries. Our result improves on the best known information-theoretic bound of $Ω(n\log k)$ queries, and exactly matches the query complexity of the algorithm introduced by [Reyzin and Srivastava, 2007] for this problem. Additionally, we introduce an oracle, with access to which one can learn the number of components of $G$ in asymptotically fewer queries than learning the full partition, thus answering another question posed by the same authors. Lastly, we introduce a more applicable version of this oracle, and prove asymptotically tight bounds of $\widetildeΘ(m)$ queries for both learning and verifying an $m$-edge hidden graph $G$ using it.

preprint2020arXiv

A Case for Quantifying Statistical Robustness of Specialized Probabilistic AI Accelerators

Statistical machine learning often uses probabilistic algorithms, such as Markov Chain Monte Carlo (MCMC), to solve a wide range of problems. Many accelerators are proposed using specialized hardware to address sampling inefficiency, the critical performance bottleneck of probabilistic algorithms. These accelerators usually improve the hardware efficiency by using some approximation techniques, such as reducing bit representation, truncating small values to zero, or simplifying the Random Number Generator (RNG). Understanding the influence of these approximations on result quality is crucial to meeting the quality requirements of real applications. Although a common approach is to compare the end-point result quality using community-standard benchmarks and metrics, we claim a probabilistic architecture should provide some measure (or guarantee) of statistical robustness. This work takes a first step towards quantifying the statistical robustness of specialized hardware MCMC accelerators by proposing three pillars of statistical robustness: sampling quality, convergence diagnostic, and goodness of fit. Each pillar has at least one quantitative metric without the need to know the ground truth data. We apply this method to analyze the statistical robustness of an MCMC accelerator proposed by previous work, with some modifications, as a case study. The method also applies to other probabilistic accelerators and can be used in design space exploration.

preprint2020arXiv

Beyond Application End-Point Results: Quantifying Statistical Robustness of MCMC Accelerators

Statistical machine learning often uses probabilistic algorithms, such as Markov Chain Monte Carlo (MCMC), to solve a wide range of problems. Probabilistic computations, often considered too slow on conventional processors, can be accelerated with specialized hardware by exploiting parallelism and optimizing the design using various approximation techniques. Current methodologies for evaluating correctness of probabilistic accelerators are often incomplete, mostly focusing only on end-point result quality ("accuracy"). It is important for hardware designers and domain experts to look beyond end-point "accuracy" and be aware of the hardware optimizations impact on other statistical properties. This work takes a first step towards defining metrics and a methodology for quantitatively evaluating correctness of probabilistic accelerators beyond end-point result quality. We propose three pillars of statistical robustness: 1) sampling quality, 2) convergence diagnostic, and 3) goodness of fit. We apply our framework to a representative MCMC accelerator and surface design issues that cannot be exposed using only application end-point result quality. Applying the framework to guide design space exploration shows that statistical robustness comparable to floating-point software can be achieved by slightly increasing the bit representation, without floating-point hardware requirements.

preprint2020arXiv

Random Lie Brackets that Induce Torsion: A Model for Noisy Vector Fields

We define and study a random Lie bracket that induces torsion in expectation. Almost all stochastic analysis on manifolds have assumed parallel transport. Mathematically this assumption is very reasonable. However, in many applied geometry and graphics problems parallel transport is not achieved, the "change in coordinates" are not exact due to noise. We formulate a stochastic model on a manifold for which parallel transport does not hold and analyze the consequences of this model with respect to classic quantities studied in Riemannian geometry. We first define a stochastic lie bracket that induces a stochastic covariant derivative. We then study the connection implied by the stochastic covariant derivative and note that the stochastic lie bracket induces torsion. We then state the induced stochastic geodesic equations and a stochastic differential equation for parallel transport. We also derive the curvature tensors for our construction and a stochastic Laplace-Beltrami operator. We close with a discussion of the motivation and relevance of our construction.

preprint2020arXiv

Stanza: A Nonlinear State Space Model for Probabilistic Inference in Non-Stationary Time Series

Time series with long-term structure arise in a variety of contexts and capturing this temporal structure is a critical challenge in time series analysis for both inference and forecasting settings. Traditionally, state space models have been successful in providing uncertainty estimates of trajectories in the latent space. More recently, deep learning, attention-based approaches have achieved state of the art performance for sequence modeling, though often require large amounts of data and parameters to do so. We propose Stanza, a nonlinear, non-stationary state space model as an intermediate approach to fill the gap between traditional models and modern deep learning approaches for complex time series. Stanza strikes a balance between competitive forecasting accuracy and probabilistic, interpretable inference for highly structured time series. In particular, Stanza achieves forecasting accuracy competitive with deep LSTMs on real-world datasets, especially for multi-step ahead forecasting.

preprint2020arXiv

Subspace Clustering through Sub-Clusters

The problem of dimension reduction is of increasing importance in modern data analysis. In this paper, we consider modeling the collection of points in a high dimensional space as a union of low dimensional subspaces. In particular we propose a highly scalable sampling based algorithm that clusters the entire data via first spectral clustering of a small random sample followed by classifying or labeling the remaining out of sample points. The key idea is that this random subset borrows information across the entire data set and that the problem of clustering points can be replaced with the more efficient and robust problem of "clustering sub-clusters". We provide theoretical guarantees for our procedure. The numerical results indicate we outperform other state-of-the-art subspace clustering algorithms with respect to accuracy and speed.

preprint2019arXiv

A study on dynamical complexity of noise induced blood flow

In this article, the dynamics and complexity of a noise induced blood flow system have been investigated. Changes in the dynamics have been recognized by measuring the periodicity over significant parameters. Chaotic as well as non-chaotic regimes have also been classified. Further, dynamical complexity has been studied by phase space based weighted entropy. Numerical results show a strong correlation between the dynamics and complexity of the noise induced system. The correlation has been confirmed by a cross-correlation analysis.

preprint2010arXiv

Learning gradients on manifolds

A common belief in high-dimensional data analysis is that data are concentrated on a low-dimensional manifold. This motivates simultaneous dimension reduction and regression on manifolds. We provide an algorithm for learning gradients on manifolds for dimension reduction for high-dimensional data with few observations. We obtain generalization error bounds for the gradient estimates and show that the convergence rate depends on the intrinsic dimension of the manifold and not on the dimension of the ambient space. We illustrate the efficacy of this approach empirically on simulated and real data and compare the method to other dimension reduction procedures.

preprint2010arXiv

Towards Stratification Learning through Homology Inference

A topological approach to stratification learning is developed for point cloud data drawn from a stratified space. Given such data, our objective is to infer which points belong to the same strata. First we define a multi-scale notion of a stratified space, giving a stratification for each radius level. We then use methods derived from kernel and cokernel persistent homology to cluster the data points into different strata, and we prove a result which guarantees the correctness of our clustering, given certain topological conditions; some geometric intuition for these topological conditions is also provided. Our correctness result is then given a probabilistic flavor: we give bounds on the minimum number of sample points required to infer, with probability, which points belong to the same strata. Finally, we give an explicit algorithm for the clustering, prove its correctness, and apply it to some simulated data.