Researcher profile

Shengbo Wang

Shengbo Wang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified | Verification L1 | Unclaimed author
3 works
0 followers
4 topics
4 close collaborators

Published work

3 published items

Preprint | 2026 | arXiv

Central Limit Theorem for Two-Time-Scale Approximate Distributionally Robust RL

Designing model-free algorithms for distributionally robust reinforcement learning (DRRL) poses fundamental challenges. The robust Bellman operator is nonlinear in the transition kernel, which makes one-sample Bellman updates biased, while the adversarial optimization underlying robustness makes robust evaluation computationally demanding. To address these difficulties, we consider the natural small-ambiguity regime under Kullback-Leibler ambiguity sets and propose an approximate DRRL framework based on a first-order expansion of the relevant robust functional. This yields an approximate robust Bellman equation that removes the adversarial optimization while remaining first-order accurate in the ambiguity radius. To learn the fixed point of this approximate equation, we propose Mean-Variance Stochastic Approximation (MVSA), a model-free algorithm that uses only one-sample updates. This is achieved via lifted stochastic approximation dynamics and a two-time-scale design. We then prove convergence and a central limit theorem for MVSA: its main iterate satisfies a central limit theorem at the canonical $n^{-1/2}$ scale, with explicitly characterized asymptotic covariances. Finally, we validate our theoretical findings with a numerical experiment.
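For intuition about the small-ambiguity regime mentioned in this abstract, here is a minimal sketch. It compares the worst-case mean over a Kullback-Leibler ball of radius delta, computed from the standard dual formula, with the commonly cited small-radius approximation "mean minus sqrt(2 * delta * variance)". Whether the paper's approximate robust Bellman equation takes exactly this form is an assumption, and the function names, the toy distribution, and the search bounds are all hypothetical.

```python
import math

def kl_robust_mean(values, probs, delta, lo=1e-3, hi=1e3, iters=200):
    """Worst-case mean over {Q : KL(Q || P) <= delta} via the dual
    sup_{lam > 0} -lam * log E_P[exp(-V / lam)] - lam * delta.
    The search interval [lo, hi] for lam is an illustrative assumption."""
    m = min(values)  # shift so every exponent is <= 0 (avoids overflow)

    def dual(lam):
        s = sum(p * math.exp(-(v - m) / lam) for v, p in zip(values, probs))
        return m - lam * math.log(s) - lam * delta

    # The dual objective is concave in lam, so ternary search finds its maximum.
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if dual(a) < dual(b):
            lo = a
        else:
            hi = b
    return dual(0.5 * (lo + hi))

def mean_variance_approx(values, probs, delta):
    """Small-radius approximation: mean - sqrt(2 * delta * variance)."""
    mean = sum(p * v for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    return mean - math.sqrt(2.0 * delta * var)

# Toy next-state values and nominal transition probabilities (made up).
values = [0.0, 1.0, 4.0]
probs = [0.5, 0.3, 0.2]
delta = 1e-3

exact = kl_robust_mean(values, probs, delta)
approx = mean_variance_approx(values, probs, delta)
nominal = sum(p * v for v, p in zip(values, probs))
print(nominal, exact, approx)
```

The approximation needs only the mean and variance under the nominal model, which is one plausible reading of why a mean-variance recursion can avoid the inner adversarial optimization.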

Preprint | 2022 | arXiv

Optimal Tracking Control for Unknown Linear Systems with Finite-Time Parameter Estimation

The optimal control input for linear systems can be obtained from the algebraic Riccati equation (ARE), although how to obtain its exact solution in closed form remains an open question. In engineering practice, acceptable numerical solutions of the ARE can be found by iteration or optimization. Recently, gradient-descent-based numerical solutions have been proven effective at approximating the optimal ones. This paper applies this method to the tracking problem for heterogeneous linear systems. Here, however, all parameters in the dynamics of the linear systems are assumed to be unknown, which makes the problem intractable as stated, since both the gradient and the admissible initialization require prior knowledge of the system dynamics. To solve this problem, the dynamic regressor extension and mixing (DREM) method is improved to estimate the parameter matrices in finite time. In addition, a discount factor is introduced to ensure the existence of optimal solutions for heterogeneous systems. Two simulation experiments illustrate the effectiveness of the approach.

Preprint | 2022 | arXiv

Robust Adaptive Safety-Critical Control for Unknown Systems with Finite-Time Element-Wise Parameter Estimation

Safety is one of the most critical requirements for a controlled system. This paper investigates a safety-critical control scheme for unknown structured systems using the control barrier function (CBF) method. Building on dynamic regressor extension and mixing (DREM), an extended element-wise parameter identification law is used to eliminate the uncertainty. On the one hand, it is shown that the proposed control scheme always guarantees safety during the identification process under excitation by injected noise signals, which was not considered in previous work. On the other hand, the element-wise estimation process in DREM can minimize the conservatism of the safe adaptive process compared with other existing adaptive CBF algorithms. The stability and the forward invariance of the presented safe control-estimation scheme are proved. Furthermore, the robustness of the scheme under bounded disturbances is analyzed, where a robust CBF with mild conditions is used to ensure safety. The framework is illustrated by simulations on adaptive cruise control, where the slope resistance of the following vehicle is robustly estimated in finite time against small disturbances and the potential crash risk is avoided by the proposed safe control scheme.
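To illustrate the basic CBF idea this abstract builds on, here is a minimal sketch of a safety filter for a toy one-dimensional system. It is not the paper's adaptive scheme: there is no parameter estimation, no disturbance, and no vehicle model. The single-integrator dynamics x' = u, the barrier h(x) = x_max - x, and all names and numeric values are assumptions chosen for illustration.

```python
def cbf_filter(u_nom, x, x_max, alpha):
    """Minimally modify u_nom so that h_dot + alpha * h >= 0 holds
    for h(x) = x_max - x and x' = u, i.e. u <= alpha * (x_max - x).
    For a scalar input this min-norm problem reduces to a clamp."""
    return min(u_nom, alpha * (x_max - x))

def simulate(x0=0.0, x_max=1.0, alpha=2.0, u_nom=1.0, dt=0.01, steps=1000):
    """Forward-Euler rollout of x' = u under the filtered input.
    With alpha * dt <= 1 the discrete update cannot overshoot x_max."""
    x = x0
    traj = [x]
    for _ in range(steps):
        u = cbf_filter(u_nom, x, x_max, alpha)
        x = x + dt * u
        traj.append(x)
    return traj

traj = simulate()
print(max(traj), traj[-1])
```

The nominal input pushes the state toward the boundary at constant speed; the filter leaves it untouched while the state is far from x_max and slows it down near the boundary, so the safe set {x <= x_max} stays forward invariant in this toy setting.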