Source author record

Yuan-Hua Ni

Yuan-Hua Ni appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.OC q-fin.PM

Catalog footprint

What is connected

6 works

2 topics

4 close collaborators

preprint · 2022 · arXiv

Deep BSDE-ML Learning and Its Application to Model-Free Optimal Control

A modified Deep BSDE (backward stochastic differential equation) learning method with a measurability loss, called the Deep BSDE-ML method, is introduced in this paper to solve a class of linear decoupled forward-backward stochastic differential equations (FBSDEs), which arise in the policy evaluation step when learning optimal feedback policies for a class of stochastic control problems. The measurability loss is characterized via the measurability of the BSDE's state at the forward initial time, whereas the loss of the known Deep BSDE method is tied to the terminal state. Though the minima of the two loss functions are shown to be equal, this measurability loss is proved to be equal to the expected mean squared error between the true diffusion term of the BSDE and its approximation. This crucial observation extends the application of the Deep BSDE method: it approximates the gradient of the solution of a partial differential equation (PDE) instead of the solution itself. In addition, a learning-based framework is introduced to search for an optimal feedback control of a deterministic nonlinear system. Specifically, by introducing Gaussian exploration noise, we aim to learn a robust optimal controller for the resulting stochastic problem. This reformulation sacrifices optimality to some extent but, as suggested in reinforcement learning (RL), exploration noise is essential to enable model-free learning.

preprint · 2022 · arXiv

Deterministic Dynamic Stackelberg Games: Time-Consistent Open-Loop Solution

In this paper, the known deterministic linear-quadratic Stackelberg game is revisited, whose open-loop Stackelberg solution is in fact time-inconsistent. To handle this time inconsistency, a two-tier game framework is introduced, where the upper-tier game works according to Stackelberg's scenario with a leader and a follower, and two lower-tier intertemporal games give the follower's and leader's equilibrium response mappings that mimic the notion of time-consistent open-loop equilibrium control in the existing literature. The resulting open-loop equilibrium solution of the two-tier game is shown to be weakly time-consistent in the sense that the adopted policies will not be rejected in the future only if past policies are consistent with the equilibrium policies. Necessary and sufficient conditions for the existence and uniqueness of such a solution are obtained, and they are characterized via the solutions of several Riccati-like equations.

preprint · 2016 · arXiv

Discrete Time Mean-Field Stochastic Linear-Quadratic Optimal Control Problems

This paper first presents necessary and sufficient conditions for the solvability of discrete time, mean-field, stochastic linear-quadratic optimal control problems. Then, by introducing several sequences of bounded linear operators, the problem becomes an operator stochastic LQ problem, in which the optimal control is a linear state feedback. Furthermore, from the form of the optimal control, the problem changes to a matrix dynamic optimization problem. Solving this optimization problem, we obtain the optimal feedback gain and thus the optimal control. Finally, by completing the square, the optimality of the above control is validated.

preprint · 2016 · arXiv

Time-Inconsistent Mean-Field Stochastic LQ Problem: Open-Loop Time-Consistent Control

This paper is concerned with the open-loop time-consistent solution of time-inconsistent mean-field stochastic linear-quadratic optimal control. Different from standard stochastic linear-quadratic problems, both the system matrices and the weighting matrices are dependent on the initial times, and the conditional expectations of the control and state enter quadratically into the cost functional. Such features will ruin Bellman's principle of optimality and result in the time-inconsistency of the optimal control. Based on the dynamical nature of the systems involved, a kind of open-loop time-consistent equilibrium control is investigated in this paper. It is shown that the existence of open-loop time-consistent equilibrium control for a fixed initial pair is equivalent to the solvability of a set of forward-backward stochastic difference equations with stationary conditions and convexity conditions. By decoupling the forward-backward stochastic difference equations, necessary and sufficient conditions in terms of linear difference equations and generalized difference Riccati equations are given for the existence of open-loop time-consistent equilibrium control with a fixed initial pair. Moreover, the existence of open-loop time-consistent equilibrium control for all the initial pairs is shown to be equivalent to the solvability of a set of coupled constrained generalized difference Riccati equations and two sets of constrained linear difference equations.

preprint · 2013 · arXiv

Consensus Seeking in Multi-Agent Systems with Multiplicative Measurement Noises

In this paper, the consensus problems of continuous-time integrator systems under noisy measurements are considered. The measurement noises, which appear when agents measure their neighbors' states, are modeled as multiplicative. Here, multiplicative means that the noise intensities are proportional to the absolute value of the relative states of an agent and its neighbor. By using known distributed protocols for integrator agent systems, the closed-loop system is described in vector form by a singular stochastic differential equation. For both fixed and switching network topologies, constant consensus gains are properly selected such that mean square consensus and strong consensus can be achieved. In particular, exponential mean square convergence of the agents' states to the common value is derived for the fixed topology case. In addition, asymptotic unbiased mean square average consensus and asymptotic unbiased strong average consensus are also studied. Simulations illustrate the effectiveness of the proposed theoretical results.
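
To make the setting in this abstract concrete, the following is a minimal simulation sketch, not the paper's own protocol or proof. It assumes a common averaging protocol for integrator agents and an Euler-Maruyama discretization; the function names, the ring topology used in the usage note, and the values of `gain`, `sigma`, `dt` and `steps` are all illustrative assumptions. The only feature taken from the abstract is that each measurement noise has intensity proportional to the absolute value of the relative state.

```python
import math
import random


def simulate_consensus(x0, adjacency, gain=1.0, sigma=0.2,
                       dt=0.01, steps=2000, seed=0):
    """Euler-Maruyama simulation of integrator agents (assumed protocol).

    Agent i is assumed to follow
        dx_i = gain * sum_j a_ij * [(x_j - x_i) dt
                                    + sigma * |x_j - x_i| dW_ij],
    so the noise on each measured link scales with |x_j - x_i|.
    Independent noise per directed link is an assumption of this sketch.
    """
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    sqrt_dt = math.sqrt(dt)
    for _ in range(steps):
        dx = [0.0] * n
        for i in range(n):
            for j in range(n):
                if adjacency[i][j] == 0:
                    continue
                rel = x[j] - x[i]  # relative state seen by agent i
                # Multiplicative measurement noise on link (i, j).
                noise = sigma * abs(rel) * rng.gauss(0.0, 1.0) * sqrt_dt
                dx[i] += gain * adjacency[i][j] * (rel * dt + noise)
        # Synchronous update: all increments use the previous state.
        x = [xi + dxi for xi, dxi in zip(x, dx)]
    return x


def disagreement(x):
    """Largest deviation of any agent from the current average."""
    mean = sum(x) / len(x)
    return max(abs(xi - mean) for xi in x)
```

For example, calling `simulate_consensus([1.0, -2.0, 3.0, 0.5], ring)` with a four-agent ring adjacency matrix `ring` and comparing `disagreement` before and after gives a quick numerical check of whether the chosen gain drives the agents together. Because the noise vanishes as the relative states vanish, the agents can settle on a common value exactly, which is the qualitative behavior the abstract describes.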

preprint · 2013 · arXiv

Continuous-time Mean-Variance Portfolio Selection with Stochastic Parameters

This paper studies a continuous-time market in a stochastic environment where an agent, having specified an investment horizon and a target terminal mean return, seeks to minimize the variance of the return with multiple stocks and a bond. In the considered model, first proposed by [3], the mean returns of individual assets are explicitly affected by underlying Gaussian economic factors. Using past and present information on the asset prices, a partial-information stochastic optimal control problem with random coefficients is formulated. Here, the information is partial because the economic factors cannot be directly observed. Via dynamic programming theory, the optimal portfolio strategy can be constructed by solving a deterministic forward Riccati-type ordinary differential equation and two linear deterministic backward ordinary differential equations.
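
The mean-variance objective described in this abstract can be written compactly as follows. This is a generic statement of the problem, not the paper's exact formulation; the symbols $T$ (investment horizon), $z$ (target terminal mean return), $\pi$ (portfolio strategy), $X_T^{\pi}$ (terminal return under $\pi$) and $\mathcal{F}_t^{S}$ (information generated by asset prices up to time $t$) are notation assumed here for illustration.

```latex
\min_{\pi}\ \operatorname{Var}\!\left(X_T^{\pi}\right)
\quad \text{subject to} \quad
\mathbb{E}\!\left[X_T^{\pi}\right] = z,
\qquad
\pi_t \ \text{is } \mathcal{F}_t^{S}\text{-measurable for all } t \in [0, T].
```

The measurability requirement on $\pi_t$ encodes the partial information: the strategy may depend on observed asset prices but not on the unobserved economic factors.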
