Source author record

Zhenjie Ren

Zhenjie Ren appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.PR · math.AP · math.OC · Machine Learning · Multiagent Systems · Computer Science and Game Theory · econ.GN · Neural and Evolutionary Computing · q-fin.EC

Catalog footprint

What is connected

19 works

9 topics

4 close collaborators

preprint · 2026 · arXiv

Continuous-time q-learning for mean-field control with common noise, part-I: Theoretical foundations

This paper investigates the continuous-time counterpart of the Q-function for entropy-regularized mean-field control (MFC) with controlled common noise, coined the q-function by Jia and Zhou (2023) in the single-agent model. We first show that, under discretely sampled actions, the value function in the exploratory formulation converges to the one in the relaxed control formulation as the time grid refines. Leveraging the relaxed control formulation, we derive the exploratory Hamilton-Jacobi-Bellman (HJB) equation, in which the controlled common noise gives rise to an additional nonlinear functional of the policy, rendering the policy iteration intricate. Under a certain concavity condition, we establish the existence and uniqueness of the optimal one-step policy iteration via a first-order condition using the partial linear functional derivative with respect to the policy. The policy improvement at each iteration is verified by relating it to an entropy-regularized optimization problem over the space of policies. In the mean-field setting, we introduce the integrated q-function (Iq-function) defined on the state distribution and the policy, and it is shown that an optimal policy is identified as a two-layer fixed point of the argmax operator of the Iq-function. Finally, we provide the explicit characterization of an optimal policy as a Gaussian distribution in the general linear-quadratic (LQ) setting.

preprint · 2026 · arXiv

Continuous-time q-learning for mean-field control with common noise, part-II: q-learning algorithms

This paper continues the work of Ren et al. (2026), aiming to further devise q-learning algorithms for mean-field control (MFC) with controlled common noise. Based on the relaxed control formulation, we first establish the martingale condition of the value function and the Iq-function by evaluating along the conditional state distributions generated by all test policies. As the data in the relaxed control formulation are not observable in practice, we quantify the error incurred when they are replaced by the observable ones in the exploratory formulation under discretely sampled actions. This, together with the two-layer fixed point characterization of an optimal policy in Ren et al. (2026), allows us to propose several algorithms including the Actor-Critic q-learning algorithm, in which the policy is updated in the Actor-step based on the iteration rule induced by the improved Iq-function, and the value function and Iq-function are updated in the Critic-step based on the martingale orthogonality condition using the data from the exploratory formulation. We also establish the convergence of the inner iterations in the Actor-step in an infinite-horizon linear-quadratic (LQ) framework. In two examples, within and beyond the LQ framework, our q-learning algorithms are implemented with satisfactory performance.

preprint · 2026 · arXiv

Discrete Flow Matching: Convergence Guarantees Under Minimal Assumptions

Flow Matching has recently emerged as a popular class of generative models for simulating a target distribution $μ_1$ from samples drawn from a source distribution $μ_0$. This framework relies on a fixed coupling between $μ_0$ and $μ_1$, and on a deterministic or stochastic bridge to define an interpolating process between the two distributions. The time marginals of this process can then be approximately sampled by estimating the transition rates, or more generally the generator, of its Markovian projection. This framework has recently been extended to the case of discrete source and target distributions, under the name Discrete Flow Matching (DFM). However, theoretical guarantees for such models remain scarce. In this paper, we study two DFM models on $\mathbb{Z}_m^d = \{0,\ldots,m-1\}^d$, sampled through time discretization, and derive non-asymptotic associated bounds for both of them. In contrast to previous work, we establish non-asymptotic bounds in Kullback--Leibler divergence for the early-stopped version of the target distribution. We also derive explicit convergence guarantees in total variation distance with respect to the true target distribution. Importantly, these bounds rely only on an approximation error assumption, relaxing standard score assumptions used in earlier works, while also yielding improved dependence on the vocabulary size $m$ and the dimension $d$.

preprint · 2026 · arXiv

Quantitative weak propagation of chaos for McKean--Vlasov branching diffusion processes

We study in this paper the weak propagation of chaos for McKean--Vlasov diffusions with branching, whose induced marginal measures are nonnegative finite measures but not necessarily probability measures. The flow of marginal measures satisfies a non-linear Fokker--Planck equation, along which we provide a functional Itô formula. We then consider a functional of the terminal marginal measure of the branching process, whose conditional value is a solution to a Kolmogorov backward master equation. By using Itô's formula and based on the estimates of second-order linear and intrinsic functional derivatives of the value function, we finally derive a quantitative weak convergence rate for the empirical measures of the branching diffusion processes with finite population.

preprint · 2022 · arXiv

Entropic turnpike estimates for the kinetic Schrödinger problem

We investigate the kinetic Schrödinger problem, obtained considering Langevin dynamics instead of Brownian motion in Schrödinger's thought experiment. Under a quasilinearity assumption we establish exponential entropic turnpike estimates for the corresponding Schrödinger bridges and exponentially fast convergence of the entropic cost to the sum of the marginal entropies in the long-time regime, which provides as a corollary an entropic Talagrand inequality. In order to do so, we profit from recent advances in the understanding of classical Schrödinger bridges and adaptations of Bakry-Émery formalism to the kinetic setting. Our quantitative results are complemented by basic structural results such as dual representation of the entropic cost and the existence of Schrödinger potentials.

preprint · 2022 · arXiv

Nonlinear predictable representation and $L^1$-solutions of backward SDEs and second-order backward SDEs

The theory of backward SDEs extends the predictable representation property of Brownian motion to the nonlinear framework, thus providing a path-dependent analog of fully nonlinear parabolic PDEs. In this paper, we consider backward SDEs, their reflected version, and their second-order extension, in the context where the final data and the generator satisfy an $L^1$-type integrability condition. Our main objective is to provide the corresponding existence and uniqueness results for general Lipschitz generators. The uniqueness holds in the so-called Doob class of processes, simultaneously under an appropriate class of measures. We emphasize that the previous literature only deals with backward SDEs, and requires either that the generator is separable in $(y,z)$, see Peng [Pen97], or strictly sublinear in the gradient variable $z$, see [BDHPS03], or that the final data satisfies an $L\ln L$-integrability condition, see [HT18]. We bypass these conditions by defining $L^1$-integrability under the nonlinear expectation operator induced by the previously mentioned class of measures.

preprint · 2022 · arXiv

On path-dependent multidimensional forward-backward SDEs

This paper extends the results of Ma, Wu, Zhang, Zhang [11] to the context of path-dependent multidimensional forward-backward stochastic differential equations (FBSDEs). By path-dependent we mean that the coefficients of the forward-backward SDE at time t can depend on the whole path of the forward process up to time t. Such a situation appears when solving path-dependent stochastic control problems by means of variational calculus. At the heart of our analysis is the construction of a decoupling random field on the path space. We first prove the existence and uniqueness of a decoupling field on a small time interval. Then, by introducing the characteristic BSDE, we show that a global decoupling field can be constructed by patching local solutions together as long as the solution of the characteristic BSDE remains bounded. Finally, we provide a stability result for path-dependent forward-backward SDEs.

preprint · 2022 · arXiv

Principal-agent problem with multiple principals

We consider a moral hazard problem with multiple principals in a continuous-time model. The agent can work for only one principal at a given time, and so faces an optimal switching problem. Using a randomized formulation, we manage to represent the agent's value function and his optimal effort by an Itô process. This representation further helps to solve the principals' problem in the case of infinitely many principals, in the sense of a mean field game. Finally, the mean field formulation is justified by a propagation of chaos argument.

preprint · 2022 · arXiv

Random horizon principal-agent problems

We consider a general formulation of the random horizon Principal-Agent problem with a continuous payment and a lump-sum payment at termination. In the European version of the problem, the random horizon is chosen solely by the principal with no other possible action from the agent than exerting effort on the dynamics of the output process. We also consider the American version of the contract, which covers Sannikov's seminal model, where the agent can also quit by optimally choosing the termination time of the contract. Our main result reduces such non-zero-sum stochastic differential games to appropriate stochastic control problems which may be solved by standard methods of stochastic control theory. This reduction is obtained by following Sannikov's approach, further developed by Cvitanic, Possamai, and Touzi. We first introduce an appropriate class of contracts for which the agent's optimal effort is immediately characterized by the standard verification argument in stochastic control theory. We then show that this class of contracts is dense in an appropriate sense so that the optimization over this restricted family of contracts represents no loss of generality. The result is obtained by using the recent well-posedness result for random horizon second-order backward SDEs.

preprint · 2022 · arXiv

Second order backward SDE with random terminal time

Backward stochastic differential equations extend the martingale representation theorem to the nonlinear setting. This can be seen as the path-dependent counterpart of the extension from the heat equation to fully nonlinear parabolic equations in the Markov setting. This paper extends such a nonlinear representation to the context where the random variable of interest is measurable with respect to the information at a finite stopping time. We provide a complete wellposedness theory which covers the semilinear case (backward SDE), the semilinear case with obstacle (reflected backward SDE), and the fully nonlinear case (second order backward SDE).

preprint · 2020 · arXiv

Game on Random Environment, Mean-field Langevin System and Neural Networks

In this paper we study a type of games regularized by the relative entropy, where the players' strategies are coupled through a random environment variable. Besides the existence and the uniqueness of equilibria of such games, we prove that the marginal laws of the corresponding mean-field Langevin systems can converge towards the games' equilibria in different settings. As applications, the dynamic games can be treated as games on a random environment when one treats the time horizon as the environment. In practice, our results can be applied to analysing the stochastic gradient descent algorithm for deep neural networks in the context of supervised learning as well as for the generative adversarial networks.

preprint · 2016 · arXiv

A dual algorithm for stochastic control problems: Applications to Uncertain Volatility Models and CVA

We derive an algorithm in the spirit of Rogers and Davis & Burstein that leads to upper bounds for stochastic control problems. Our bounds complement lower biased estimates recently obtained in the work of Guyon & Henry-Labordère. We evaluate our estimates in numerical examples motivated from mathematical finance.

preprint · 2016 · arXiv

On the convergence of monotone schemes for path-dependent PDE

We propose a reformulation of the convergence theorem of monotone numerical schemes introduced by Zhang and Zhuo for viscosity solutions of path-dependent PDEs, which extends the seminal work of Barles and Souganidis on the viscosity solution of PDE. We prove the convergence theorem under conditions similar to those of the classical theorem in the work of Barles and Souganidis. These conditions are satisfied, to the best of our knowledge, by all classical monotone numerical schemes in the context of stochastic control theory. In particular, the paper provides a unified approach to prove the convergence of numerical schemes for non-Markovian stochastic control problems, second order BSDEs, stochastic differential games etc.

preprint · 2016 · arXiv

Viscosity Solutions of Fully Nonlinear Elliptic Path Dependent PDEs

This paper introduces a convenient solution space for the uniformly elliptic fully nonlinear path dependent PDEs. It provides a wellposedness result under standard Lipschitz-type assumptions on the nonlinearity and an additional assumption formulated on some partial differential equation defined locally by freezing the path.

preprint · 2015 · arXiv

Comparison of viscosity solutions of fully nonlinear degenerate parabolic Path-dependent PDEs

We prove a comparison result for viscosity solutions of (possibly degenerate) parabolic fully nonlinear path-dependent PDEs. In contrast with the previous result in Ekren, Touzi & Zhang, our conditions are easier to check and allow for the degenerate case, thus including first order path-dependent PDEs. Our argument follows the regularization method as introduced by Jensen, Lions & Souganidis in the corresponding finite-dimensional PDE setting. The present argument significantly simplifies the comparison proof in Ekren, Touzi & Zhang, but requires an $L^p$-type continuity (with respect to the path) for the viscosity semi-solutions and for the nonlinearity defining the equation.

preprint · 2015 · arXiv

Perron's method for viscosity solutions of semilinear path dependent PDEs

This paper proves the existence of viscosity solutions of path dependent semilinear PDEs via Perron's method, i.e. via showing that the supremum of viscosity subsolutions is a viscosity solution. We use the notion of viscosity solutions introduced by Ekren, Keller, Touzi and Zhang, in whose work all smooth processes which are tangent in mean are considered as test functions. We also provide a comparison result for semicontinuous viscosity solutions, by using a regularization technique. As an interesting byproduct, we give a new short proof for the optimal stopping problem with semicontinuous obstacles.

preprint · 2014 · arXiv

An overview of Viscosity Solutions of Path-Dependent PDEs

This paper provides an overview of the recently developed notion of viscosity solutions of path-dependent partial differential equations. We start by a quick review of the Crandall-Ishii notion of viscosity solutions, so as to motivate the relevance of our definition in the path-dependent case. We focus on the wellposedness theory of such equations. In particular, we provide a simple presentation of the current existence and uniqueness arguments in the semilinear case. We also review the stability property of this notion of solutions, including the adaptation of the Barles-Souganidis monotonic scheme approximation method. Our results rely crucially on the theory of optimal stopping under nonlinear expectation. In the dominated case, we provide a self-contained presentation of all required results. The fully nonlinear case is more involved and is addressed in [12].

preprint · 2014 · arXiv

Comparison of Viscosity Solutions of Semi-linear Path-Dependent PDEs

This paper provides a probabilistic proof of the comparison result for viscosity solutions of path-dependent semilinear PDEs. We consider the notion of viscosity solutions introduced in [EKTZ] which considers as test functions all those smooth processes which are tangent in mean. When restricted to the Markovian case, this definition induces a larger set of test functions, and reduces to the notion of stochastic viscosity solutions analyzed in [BayraktarSirbu1, BayraktarSirbu2]. Our main result takes advantage of this enlargement of the test functions, and provides an easier proof of comparison. This is most remarkable in the context of the linear path-dependent heat equation. As a key ingredient for our methodology, we introduce a notion of punctual differentiation, similar to the corresponding concept in the standard viscosity solutions [CaffarelliCabre], and we prove that semimartingales are almost everywhere punctually differentiable. This smoothness result can be viewed as the counterpart of the Aleksandroff smoothness result for convex functions. A similar comparison result was established earlier in [EKTZ]. The result of this paper is more general and, more importantly, the arguments that we develop do not rely on any representation of the solution.

preprint · 2014 · arXiv

Large Deviations for Non-Markovian Diffusions and a Path-Dependent Eikonal Equation

This paper provides a large deviation principle for non-Markovian, Brownian motion driven stochastic differential equations with random coefficients. Similar to Gao and Liu [GL], this extends the corresponding results collected in Freidlin and Wentzell [FreidlinWentzell]. However, we use a different line of argument, adapting the PDE method of Fleming [Fleming] and Evans and Ishii [EvansIshii] to the path-dependent case, by using backward stochastic differential techniques. Similar to the Markovian case, we obtain a characterization of the action function as the unique bounded solution of a path-dependent version of the Eikonal equation. Finally, we provide an application to the short maturity asymptotics of the implied volatility surface in financial mathematics.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint