Researcher profile

Soumik Pal

Soumik Pal contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
19works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

19 published item(s)

preprint2022arXiv

Asymptotics of Discrete Schrödinger Bridges via Chaos Decomposition

Consider the problem of matching two independent i.i.d. samples of size $N$ from two distributions $P$ and $Q$ in $\mathbb{R}^d$. For an arbitrary continuous cost function, the optimal assignment problem looks for the matching that minimizes the total cost. We consider instead in this paper the problem where each matching is endowed with a Gibbs probability weight proportional to the exponential of the negative total cost of that matching. Viewing each matching as a joint distribution with $N$ atoms, we then take a convex combination with respect to the above Gibbs probability measure. We show that this resulting random joint distribution converges, as $N\rightarrow \infty$, to the solution of a variational problem, introduced by Föllmer, called the Schrödinger problem. We also derive the first two error terms of orders $N^{-1/2}$ and $N^{-1}$, respectively. This gives us central limit theorems for integrated test functions, including for the cost of transport, and second order Gaussian chaos limits when the limiting Gaussian variance is zero. The proofs are based on a novel chaos decomposition of the discrete Schrödinger bridge by polynomial functions of the pair of empirical distributions as the first and second order Taylor approximations in the space of measures. This is achieved by extending the Hoeffding decomposition from the classical theory of U-statistics.

preprint2022arXiv

Entropy Regularized Optimal Transport Independence Criterion

We introduce an independence criterion based on entropy regularized optimal transport. Our criterion can be used to test for independence between two samples. We establish non-asymptotic bounds for our test statistic and study its statistical behavior under both the null hypothesis and the alternative hypothesis. The theoretical results involve tools from U-process theory and optimal transport theory. We also offer a random feature type approximation for large-scale problems, as well as a differentiable program implementation for deep learning applications. We present experimental results on existing benchmarks for independence testing, illustrating the interest of the proposed criterion to capture both linear and nonlinear dependencies in synthetic data and real data.

preprint2021arXiv

Ranked masses in two-parameter Fleming-Viot diffusions

In previous work, we constructed Fleming--Viot-type measure-valued diffusions (and diffusions on a space of interval partitions of the unit interval $[0,1]$) that are stationary with the Poisson--Dirichlet laws with parameters $α\in(0,1)$ and $θ\geq 0$. In this paper, we complete the proof that these processes resolve a conjecture by Feng and Sun (2010) by showing that the processes of ranked atom sizes (or of ranked interval lengths) of these diffusions are members of a two-parameter family of diffusions introduced by Petrov (2009), extending a model by Ethier and Kurtz (1981) in the case $α=0$. The latter diffusions are continuum limits of up-down Chinese restaurant processes.

preprint2021arXiv

Triangular Flows for Generative Modeling: Statistical Consistency, Smoothness Classes, and Fast Rates

Triangular flows, also known as Knöthe-Rosenblatt measure couplings, comprise an important building block of normalizing flow models for generative modeling and density estimation, including popular autoregressive flow models such as real-valued non-volume preserving transformation models (Real NVP). We present statistical guarantees and sample complexity bounds for triangular flow statistical models. In particular, we establish the statistical consistency and the finite sample convergence rates of the Kullback-Leibler estimator of the Knöthe-Rosenblatt measure coupling using tools from empirical process theory. Our results highlight the anisotropic geometry of function classes at play in triangular flows, shed light on optimal coordinate ordering, and lead to statistical guarantees for Jacobian flows. We conduct numerical experiments on synthetic data to illustrate the practical implications of our theoretical findings.

preprint2020arXiv

Multiplicative Schrödinger problem and the Dirichlet transport

We consider an optimal transport problem on the unit simplex whose solutions are given by gradients of exponentially concave functions and prove two main results. First, we show that the optimal transport is the large deviation limit of a particle system of Dirichlet processes transporting one probability measure on the unit simplex to another by coordinatewise multiplication and normalizing. The structure of our Lagrangian and the appearance of the Dirichlet process relate our problem closely to the entropic measure on the Wasserstein space as defined by von-Renesse and Sturm in the context of Wasserstein diffusion. The limiting procedure is a triangular limit where we allow simultaneously the number of particles to grow to infinity while the `noise' tends to zero. The method, which generalizes easily to many other cost functions, including the squared Euclidean distance, provides a novel combination of the Schrödinger problem approach due to C. Léonard and the related Brownian particle systems by Adams et al.which does not require gamma convergence. Second, we analyze the behavior of entropy along the paths of transport. The reference measure on the simplex is taken to be the Dirichlet measure with all zero parameters which relates to the finite-dimensional distributions of the entropic measure. The interpolating curves are not the usual McCann lines. Nevertheless we show that entropy plus a multiple of the transport cost remains convex, which is reminiscent of the semiconvexity of entropy along lines of McCann interpolations in negative curvature spaces. We also obtain, under suitable conditions, dimension-free bounds of the optimal transport cost in terms of entropy.

preprint2019arXiv

Metrics on sets of interval partitions with diversity

We first consider interval partitions whose complements are Lebesgue-null and introduce a complete metric that induces the same topology as the Hausdorff distance (between complements). This is done using correspondences between intervals. Further restricting to interval partitions with alpha-diversity, we then adjust the metric to incorporate diversities. We show that this second metric space is Lusin. An important feature of this topology is that path-continuity in this topology implies the continuous evolution of diversities. This is important in related work on tree-valued stochastic processes where diversities are branch lengths.

preprint2013arXiv

Wright-Fisher diffusion with negative mutation rates

We study a family of n-dimensional diffusions, taking values in the unit simplex of vectors with nonnegative coordinates that add up to one. These processes satisfy stochastic differential equations which are similar to the ones for the classical Wright-Fisher diffusions, except that the "mutation rates" are now nonpositive. This model, suggested by Aldous, appears in the study of a conjectured diffusion limit for a Markov chain on Cladograms. The striking feature of these models is that the boundary is not reflecting, and we kill the process once it hits the boundary. We derive the explicit exit distribution from the simplex and probabilistic bounds on the exit time. We also prove that these processes can be viewed as a "stochastic time-reversal" of a Wright-Fisher process of increasing dimensions and conditioned at a random time. A key idea in our proofs is a skew-product construction using certain one-dimensional diffusions called Bessel-square processes of negative dimensions, which have been recently introduced by Going-Jaeschke and Yor.

preprint2012arXiv

Archimedes' principle for Brownian liquid

We consider a family of hard core objects moving as independent Brownian motions confined to a vessel by reflection. These are subject to gravitational forces modeled by drifts. The stationary distribution for the process has many interesting implications, including an illustration of the Archimedes' principle. The analysis rests on constructing reflecting Brownian motion with drift in a general open connected domain and studying its stationary distribution. In dimension two we utilize known results about sphere packing.

preprint2012arXiv

Brownian approximation to counting graphs

Let C(n,k) denote the number of connected graphs with n labeled vertices and n+k-1 edges. For any sequence (k_n), the limit of C(n,k_n) as n tends to infinity is known. It has been observed that, if k_n=o(\sqrt{n}), this limit is asymptotically equal to the $k_n$th moment of the area under the standard Brownian excursion. These moments have been computed in the literature via independent methods. In this article we show why this is true for k_n=o(\sqrt[3]{n}) starting from an observation made by Joel Spencer. The elementary argument uses a result about strong embedding of the Uniform empirical process in the Brownian bridge proved by Komlos, Major, and Tusnady.

preprint2012arXiv

Markov processes on time-like graphs

We study Markov processes where the "time" parameter is replaced by paths in a directed graph from an initial vertex to a terminal one. Along each directed path the process is Markov and has the same distribution as the one along any other directed path. If two directed paths do not interact, in a suitable sense, then the distributions of the processes on the two paths are conditionally independent, given their values at the common endpoint of the two paths. Conditions on graphs that support such processes (e.g., hexagonal lattice) are established. Next we analyze a particularly suitable family of Markov processes, called harnesses, which includes Brownian motion and other Lévy processes, on such time-like graphs. Finally we investigate continuum limits of harnesses on a sequence of time-like graphs that admits a limit in a suitable sense.

preprint2012arXiv

Sparse regular random graphs: Spectral density and eigenvectors

We examine the empirical distribution of the eigenvalues and the eigenvectors of adjacency matrices of sparse regular random graphs. We find that when the degree sequence of the graph slowly increases to infinity with the number of vertices, the empirical spectral distribution converges to the semicircle law. Moreover, we prove concentration estimates on the number of eigenvalues over progressively smaller intervals. We also show that, with high probability, all the eigenvectors are delocalized.

preprint2012arXiv

Systems of Brownian particles with asymmetric collisions

We study systems of Brownian particles on the real line, which interact by splitting the local times of collisions among themselves in an asymmetric manner. We prove the strong existence and uniqueness of such processes and identify them with the collections of ordered processes in a Brownian particle system, in which the drift coefficients, the diffusion coefficients, and the collision local times for the individual particles are assigned according to their ranks. These Brownian systems can be viewed as generalizations of those arising in first-order models for equity markets in the context of stochastic portfolio theory, and are able to correct for several shortcomings of such models while being equally amenable to computations. We also show that, in addition to being of interest in their own right, such systems of Brownian particles arise as universal scaling limits of systems of jump processes on the integer lattice with local interactions. A key step in the proof is the analysis of a generalization of Skorokhod maps which include `local times' at the intersection of faces of the nonnegative orthant. The result extends the convergence of TASEP to its continuous analogue. Finally, we identify those among the Brownian particle systems which have a probabilistic structure of determinantal type.

preprint2011arXiv

Concentration for multidimensional diffusions and their boundary local times

We prove that probability laws of certain multidimensional semimartingales which includes time-inhomogenous diffusions, under suitable assumptions, satisfy Quadratic Transportation Cost Inequality under the uniform metric. From this we derive concentration properties of Lipschitz functions of process paths that depend on the entire history. In particular, we estimate concentration of boundary local time of reflected Brownian motions on a polyhedral domain. We work out explicit applications of consequences of measure concentration for the case of Brownian motion with rank-based drifts.

preprint2011arXiv

Convergence rates for rank-based models with applications to portfolio theory

We determine rates of convergence of rank-based interacting diffusions and semimartingale reflecting Brownian motions to equilibrium. Convergence rate for the total variation metric is derived using Lyapunov functions. Sharp fluctuations of additive functionals are obtained using Transportation Cost-Information inequalities for Markov processes. We work out various applications to the rank-based abstract equity markets used in Stochastic Portfolio Theory. For example, we produce quantitative bounds, including constants, for fluctuations of market weights and occupation times of various ranks for individual coordinates. Another important application is the comparison of performance between symmetric functionally generated portfolios and the market portfolio. This produces estimates of probabilities of "beating the market".

preprint2011arXiv

Extinction of Fleming-Viot-type particle systems with strong drift

We consider a Fleming-Viot-type particle system consisting of independently moving particles that are killed on the boundary of a domain. At the time of death of a particle, another particle branches. If there are only two particles and the underlying motion is a Bessel process on $(0,\infty)$, both particles converge to 0 at a finite time if and only if the dimension of the Bessel process is less than 0. If the underlying diffusion is Brownian motion with a drift stronger than (but arbitrarily close to, in a suitable sense) the drift of a Bessel process, all particles converge to 0 at a finite time, for any number of particles.

preprint2011arXiv

On the Aldous diffusion on Continuum Trees. I

Consider a Markov chain on the space of rooted real binary trees that randomly removes leaves and reinserts them on a random edge and suitably rescales the lengths of edges. This chain was introduced by David Aldous who conjectured a diffusion limit of this chain, as the size of the tree grows, on the space of continuum trees. We prove the existence of a process on continuum trees, which via a random time change, displays properties one would expect from the conjectured Aldous diffusion. The existence of our process is proved by considering an explicit scaled limit of a Poissonized version of the Aldous Markov chain running on finite trees. The analysis involves taking limit of a sequence of splitting trees whose age processes converge but the contour process does not. Several formulas about the limiting process are derived.

preprint2010arXiv

An Excursion-Theoretic Approach to Stability of Discrete-Time Stochastic Hybrid Systems

We address stability of a class of Markovian discrete-time stochastic hybrid systems. This class of systems is characterized by the state-space of the system being partitioned into a safe or target set and its exterior, and the dynamics of the system being different in each domain. We give conditions for $L_1$-boundedness of Lyapunov functions based on certain negative drift conditions outside the target set, together with some more minor assumptions. We then apply our results to a wide class of randomly switched systems (or iterated function systems), for which we give conditions for global asymptotic stability almost surely and in $L_1$. The systems need not be time-homogeneous, and our results apply to certain systems for which functional-analytic or martingale-based estimates are difficult or impossible to get.

preprint2010arXiv

Concentration of measure for systems of Brownian particles interacting through their ranks

We consider a finite or countable collection of one-dimensional Brownian particles whose dynamics at any point in time is determined by their rank in the entire particle system. Using Transportation Cost Inequalities for stochastic processes we provide uniform fluctuation bounds for the ordered particles, their local time of collisions, and various associated statistics over intervals of time. For example, such processes, when exponentiated and rescaled, exhibit power law decay under stationarity; we derive concentration bounds for the empirical estimates of the index of the power law over large intervals of time. A key ingredient in our proofs is a novel upper bound on the Lipschitz constant of the Skorokhod map that transforms a multidimensional Brownian path to a path which is constrained not to leave the positive orthant.