Source author record

Tao Bian

Tao Bian appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

math.OC, Artificial Intelligence, eess.SY, Machine Learning, math.NA, Numerical Analysis, Systems and Control

Catalog footprint

What is connected

2 works

7 topics

2 close collaborators

Preprint, 2020, arXiv

Robust Policy Iteration for Continuous-time Linear Quadratic Regulation

This paper studies the robustness of policy iteration for the continuous-time infinite-horizon linear quadratic regulation (LQR) problem. It is shown that Kleinman's policy iteration algorithm is inherently robust to small disturbances and enjoys local input-to-state stability in the sense of Sontag. More precisely, whenever the disturbance-induced input term in each iteration is bounded and small, the solutions of the policy iteration algorithm remain bounded and enter a small neighborhood of the optimal solution of the LQR problem. Based on this result, an off-policy data-driven policy iteration algorithm for the LQR problem is shown to be robust when the system dynamics are subject to small, unknown, bounded additive disturbances. The theoretical results are validated by a numerical example.
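As a rough illustration of the algorithm the abstract refers to, the following is a minimal sketch of Kleinman's policy iteration in its standard model-based form: evaluate the current gain by solving a Lyapunov equation, then update the gain. It is not the paper's implementation. The double-integrator system, the weights, the initial gain, and the names `solve_lyapunov` and `policy_iteration` are all assumptions chosen for illustration. The optional `eps` parameter adds a small bounded perturbation to each gain update, which is only one simple way to mimic a per-iteration disturbance and may differ from the disturbance model analyzed in the paper.

```python
import numpy as np


def solve_lyapunov(Ac, M):
    """Solve Ac^T P + P Ac + M = 0 for P via a Kronecker-product linear system."""
    n = Ac.shape[0]
    I = np.eye(n)
    # Vectorized form of the map P -> Ac^T P + P Ac.
    L = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    P = np.linalg.solve(L, -M.flatten()).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def policy_iteration(A, B, Q, R, K0, iters=20, eps=0.0, seed=0):
    """Kleinman-style policy iteration starting from a stabilizing gain K0.

    eps (hypothetical parameter): bound on a uniform perturbation added to
    each gain update, used here only to illustrate a small disturbance.
    """
    rng = np.random.default_rng(seed)
    K = np.array(K0, dtype=float)
    for _ in range(iters):
        Ac = A - B @ K                                   # closed-loop matrix
        P = solve_lyapunov(Ac, Q + K.T @ R @ K)          # policy evaluation
        K = np.linalg.solve(R, B.T @ P)                  # policy improvement
        K = K + rng.uniform(-eps, eps, size=K.shape)     # bounded disturbance
    return P, K


# Illustrative example (assumed, not from the paper): a double integrator.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 1.0]])  # stabilizing: A - B K0 has eigenvalues with negative real part

P_exact, K_exact = policy_iteration(A, B, Q, R, K0)
P_noisy, K_noisy = policy_iteration(A, B, Q, R, K0, eps=1e-3)
print("gain without disturbance:", K_exact)
print("gain with small disturbance:", K_noisy)
```

For this example the algebraic Riccati equation can be solved by hand, giving the optimal gain K = [1, sqrt(3)], so the undisturbed iterates can be checked against it. In the perturbed run the gain stays close to that value, which is in the spirit of the robustness property the abstract describes, though a single run is of course not evidence for the theorem.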

Preprint, 2020, arXiv

Temporal-Differential Learning in Continuous Environments

This paper introduces a new reinforcement learning (RL) method, the method of temporal differential. In contrast to the traditional temporal-difference learning method, it plays a crucial role in developing novel RL techniques for continuous environments. In particular, the continuous-time least squares policy evaluation (CT-LSPE) and the continuous-time temporal-differential (CT-TD) learning methods are developed. Both theoretical and empirical evidence is provided to demonstrate the effectiveness of the proposed temporal-differential learning methodology.
