Source author record

Yuhua Zhu

Yuhua Zhu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: math.OC, Machine Learning, math.NA, Numerical Analysis

Catalog footprint

What is connected

5 works

4 topics

4 close collaborators


Preprint, 2023, arXiv

Variational Actor-Critic Algorithms

We introduce a class of variational actor-critic algorithms based on a variational formulation over both the value function and the policy. The objective function of the variational formulation consists of two parts: one for maximizing the value function and the other for minimizing the Bellman residual. In addition to vanilla gradient descent with updates to both the value function and the policy, we propose two variants, the clipping method and the flipping method, to speed up convergence. We also prove that, when the prefactor of the Bellman residual is sufficiently large, the fixed point of the algorithm is close to the optimal policy.

Preprint, 2020, arXiv

A consensus-based global optimization method for high dimensional machine learning problems

We improve the recently introduced consensus-based optimization method, proposed in [R. Pinnau, C. Totzeck, O. Tse and S. Martin, Math. Models Methods Appl. Sci., 27(01):183-204, 2017], which is a gradient-free optimization method for general non-convex functions. First, we replace the isotropic geometric Brownian motion with a component-wise one, which removes the dimensionality dependence of the drift rate and makes the method more competitive for high dimensional optimization problems. Second, we use random mini-batch ideas to reduce the computational cost of calculating the weighted average toward which the individual particles relax. For its mean-field limit, a nonlinear Fokker-Planck equation, we prove, in both time-continuous and semi-discrete settings, that the convergence of the method, which is exponential in time, is guaranteed under parameter constraints independent of the dimensionality. We also conduct numerical tests on high dimensional problems to check the success rate of the method.
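The two ingredients named in this abstract, component-wise noise and a mini-batch weighted average, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the parameter values, the explicit Euler-type step, and the choice to average over a random subset of particles while updating all of them are assumptions made for illustration only.

```python
import math
import random


def weighted_average(particles, f, beta):
    """Consensus point: average of particles weighted by exp(-beta * f)."""
    values = [f(x) for x in particles]
    f_min = min(values)  # shift exponents so the largest weight is 1 (avoids underflow)
    weights = [math.exp(-beta * (v - f_min)) for v in values]
    total = sum(weights)
    dim = len(particles[0])
    return [sum(w * x[k] for w, x in zip(weights, particles)) / total
            for k in range(dim)]


def cbo_minimize(f, dim, n_particles=100, batch_size=20, steps=500,
                 lam=1.0, sigma=1.0, dt=0.05, beta=30.0, seed=0):
    """Toy consensus-based optimization with component-wise noise.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    particles = [[rng.uniform(-3.0, 3.0) for _ in range(dim)]
                 for _ in range(n_particles)]
    for _ in range(steps):
        # Mini-batch idea: estimate the consensus point from a random subset.
        batch = rng.sample(particles, batch_size)
        xbar = weighted_average(batch, f, beta)
        for x in particles:
            for k in range(dim):
                diff = x[k] - xbar[k]
                # Drift toward the consensus point plus noise scaled by the
                # k-th component of the difference (component-wise, not |x - xbar|).
                x[k] += (-lam * dt * diff
                         + sigma * math.sqrt(dt) * diff * rng.gauss(0.0, 1.0))
    return weighted_average(particles, f, beta)


def shifted_sphere(x):
    """Test objective with its minimum at (1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


estimate = cbo_minimize(shifted_sphere, dim=2)
```

Because the noise in each coordinate is proportional to that coordinate's own distance from the consensus point, the noise term does not grow with the dimension the way an isotropic term scaled by the full distance would.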

Preprint, 2020, arXiv

A Sharp Convergence Rate for the Asynchronous Stochastic Gradient Descent

We give a sharp convergence rate for the asynchronous stochastic gradient descent (ASGD) algorithms when the loss function is a perturbed quadratic function, based on the stochastic modified equations introduced in [An et al. Stochastic modified equations for the asynchronous stochastic gradient descent, arXiv:1805.08244]. We prove that when the number of local workers is larger than the expected staleness, ASGD is more efficient than stochastic gradient descent. Our theoretical result also suggests that longer delays result in a slower convergence rate. In addition, the learning rate cannot be smaller than a threshold inversely proportional to the expected staleness.
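To make the notion of staleness concrete, the following toy simulation applies gradients evaluated at delayed iterates. It is only a sketch under simplifying assumptions: a one-dimensional pure quadratic loss 0.5 * x^2 rather than the perturbed quadratic of the paper, a uniformly random delay, and hypothetical parameter names. It does not reproduce the stochastic modified equation analysis.

```python
import random


def asgd_quadratic(steps=2000, lr=0.05, max_staleness=4, noise=0.01, seed=0):
    """Simulate gradient steps on 0.5 * x**2 using stale, noisy gradients.

    Returns the full list of iterates x_0, x_1, ..., x_steps.
    """
    rng = random.Random(seed)
    history = [1.0]  # x_0
    for _ in range(steps):
        # A worker reports a gradient computed tau iterations ago
        # (tau is capped by how many iterates exist so far).
        tau = rng.randint(0, min(max_staleness, len(history) - 1))
        stale_x = history[-1 - tau]
        # Gradient of 0.5 * x**2 at the stale iterate, plus Gaussian noise.
        grad = stale_x + noise * rng.gauss(0.0, 1.0)
        history.append(history[-1] - lr * grad)
    return history


iterates = asgd_quadratic()
```

Rerunning with a larger `max_staleness` or a larger `lr` is one simple way to explore how delay and step size interact in this toy setting.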

Preprint, 2020, arXiv

Borrowing From the Future: Addressing Double Sampling in Model-free Control

In model-free reinforcement learning, the temporal difference method and its variants become unstable when combined with nonlinear function approximations. Bellman residual minimization with stochastic gradient descent (SGD) is more stable, but it suffers from the double sampling problem: given the current state, two independent samples of the next state are required, but often only one sample is available. Recently, the authors of [Zhu et al., 2020] introduced the borrowing from the future (BFF) algorithm to address this issue for the prediction problem. The main idea is to borrow extra randomness from the future to approximately re-sample the next state when the underlying dynamics of the problem are sufficiently smooth. This paper extends the BFF algorithm to action-value function based model-free control. We prove that BFF is close to unbiased SGD when the underlying dynamics vary slowly with respect to actions. We confirm our theoretical findings with numerical simulations.
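The double sampling issue and the borrowing idea can be sketched for the simpler prediction setting with a linear value function. This is a hedged illustration, not the control algorithm from the paper: the function name `bff_gradient`, the scalar state, the linear features, and the specific resampling rule (shifting the current state by the next observed increment) are assumptions chosen to show the general idea.

```python
def bff_gradient(theta, features, s0, s1, s2, reward, gamma):
    """One BFF-style stochastic gradient of 0.5 * (Bellman residual)**2.

    theta    : weights of a linear value function V(s) = theta . features(s)
    s0,s1,s2 : three consecutive scalar states from a single trajectory
    """
    def value(s):
        return sum(t * p for t, p in zip(theta, features(s)))

    # Residual uses the observed next state s1.
    delta = reward + gamma * value(s1) - value(s0)

    # Unbiased SGD would need a second, independent sample of the next state.
    # Borrow the future increment (s2 - s1) as a stand-in for fresh randomness.
    s1_alt = s0 + (s2 - s1)

    phi0 = features(s0)
    phi_alt = features(s1_alt)
    # Gradient of the residual is taken at the borrowed sample, not at s1.
    return [delta * (gamma * pa - p0) for pa, p0 in zip(phi_alt, phi0)]


def affine_features(s):
    """Hypothetical feature map: constant term plus the state itself."""
    return [1.0, s]


grad = bff_gradient([0.5, 2.0], affine_features,
                    s0=0.0, s1=0.1, s2=0.3, reward=1.0, gamma=0.9)
```

Using `s1` in both factors would give the biased gradient that the double sampling problem refers to; the sketch replaces the second occurrence with the borrowed sample.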

Preprint, 2020, arXiv

Boundary Control of Vlasov-Fokker-Planck Equations

We introduce a novel Lyapunov function for the stabilization of linear Vlasov-Fokker-Planck type equations with a stiff source term. In contrast to existing results, which rely on transport properties to obtain stabilization, we present results based on hypocoercivity analysis for the Fokker-Planck operator. The existing estimates are extended to derive a suitable feedback boundary control that guarantees exponential stabilization. Further, we study the associated macroscopic limit and derive conditions on the feedback boundary control such that no boundary layer exists in the formal limit.
