Source author record

Yingzhao Lian

Yingzhao Lian appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

eess.SY Systems and Control math.OC Machine Learning

Catalog footprint

4 works, 4 topics, 4 close collaborators

Preprint, 2022, arXiv

Lessons Learned from Data-Driven Building Control Experiments: Contrasting Gaussian Process-based MPC, Bilevel DeePC, and Deep Reinforcement Learning

This manuscript offers the perspective of experimentalists on a number of modern data-driven techniques: model predictive control relying on Gaussian processes, adaptive data-driven control based on behavioral theory, and deep reinforcement learning. These techniques are compared in terms of data requirements, ease of use, computational burden, and robustness in the context of real-world applications. Our remarks and observations stem from a number of experimental investigations carried out in the field of building control in diverse environments, from lecture halls and apartment spaces to a hospital surgery center. The final goal is to support others in identifying what technique is best suited to tackle their own problems.

Preprint, 2022, arXiv

On the Optimality and Convergence Properties of the Iterative Learning Model Predictive Controller

In this technical note we analyse the performance improvement and optimality properties of the Learning Model Predictive Control (LMPC) strategy for linear deterministic systems. The LMPC framework is a policy iteration scheme in which closed-loop trajectories are used to update the control policy for the next execution of the control task. We show that, when a Linear Independence Constraint Qualification (LICQ) condition holds, the LMPC scheme guarantees strict iterative performance improvement and optimality, meaning that the closed-loop cost evaluated over the entire task converges asymptotically to the optimal cost of the infinite-horizon control problem. Compared to previous works, this sufficient LICQ condition can be easily checked, holds for a larger class of systems, and can be used to adaptively select the prediction horizon of the controller, as demonstrated by a numerical example.

Preprint, 2021, arXiv

Koopman based data-driven predictive control

Sparked by Willems' fundamental lemma, a class of data-driven control methods has been developed for LTI systems. At the same time, Koopman operator theory attempts to cast a nonlinear control problem into a standard linear one, albeit an infinite-dimensional one. Motivated by these two ideas, a data-driven control scheme for nonlinear systems is proposed in this work. The proposed scheme is compatible with most differential regressors, enabling offline learning. In particular, model uncertainty is considered, enabling a novel data-driven simulation framework based on the Wasserstein distance. Numerical experiments are performed with Bayesian neural networks to show the effectiveness of both the proposed control and simulation schemes.
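
As background for the abstract above, the following is a minimal sketch of the linear, noise-free building block that Willems' fundamental lemma provides: predicting future outputs of an LTI system directly from a Hankel matrix of recorded data. It is not the paper's Koopman-based scheme for nonlinear systems. The example system (A, B, C), the horizons T_ini and N, and the helper names hankel, simulate, and predict are all assumptions chosen for illustration.

```python
import numpy as np

def hankel(w, depth):
    """Stack all length-`depth` windows of a signal w (T samples) as columns."""
    w = np.asarray(w, dtype=float).reshape(len(w), -1)  # shape (T, channels)
    cols = w.shape[0] - depth + 1
    return np.column_stack([w[i:i + depth].reshape(-1) for i in range(cols)])

def simulate(A, B, C, x0, u):
    """Simulate x+ = A x + B u, y = C x and return the output sequence."""
    x, ys = x0.copy(), []
    for uk in u:
        ys.append(C @ x)
        x = A @ x + B @ uk
    return np.array(ys)

def predict(u_d, y_d, u_ini, y_ini, u_f):
    """Predict future outputs from recorded data (u_d, y_d) only."""
    T_ini, N = len(u_ini), len(u_f)
    m, p = u_d.shape[1], y_d.shape[1]
    H_u, H_y = hankel(u_d, T_ini + N), hankel(y_d, T_ini + N)
    # Find a combination g of data windows that matches the recent past
    # and the planned future inputs.
    lhs = np.vstack([H_u[:T_ini * m], H_y[:T_ini * p], H_u[T_ini * m:]])
    rhs = np.concatenate([u_ini.ravel(), y_ini.ravel(), u_f.ravel()])
    g = np.linalg.lstsq(lhs, rhs, rcond=None)[0]
    # The same combination of the future-output rows gives the prediction.
    return (H_y[T_ini * p:] @ g).reshape(N, p)

# Hypothetical second-order SISO system, used only to generate data.
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
n, T_ini, N = 2, 2, 5

rng = np.random.default_rng(0)
u_d = rng.standard_normal((60, 1))                    # recorded inputs
y_d = simulate(A, B, C, rng.standard_normal(n), u_d)  # recorded outputs

# Persistency of excitation of order T_ini + N + n: full row rank.
pe_rank = np.linalg.matrix_rank(hankel(u_d, T_ini + N + n))

# A fresh trajectory from a different initial state, to test the prediction.
u_new = rng.standard_normal((T_ini + N, 1))
y_new = simulate(A, B, C, rng.standard_normal(n), u_new)
y_pred = predict(u_d, y_d, u_new[:T_ini], y_new[:T_ini], u_new[T_ini:])
y_true = y_new[T_ini:]
print("max prediction error:", np.abs(y_pred - y_true).max())
```

In this sketch T_ini is set equal to the state dimension so that the recent past fixes the initial state; with noisy or nonlinear data the least-squares step would typically need regularization or a different formulation.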

Preprint, 2020, arXiv

Towards an Unified Structure for Reinforcement Learning: an Optimization Approach

Both the optimal value function and the optimal policy can be used to model an optimal controller based on the duality established by the Bellman equation. Even with this duality, no parametric model has been able to output both policy and value function with a common parameter set. In this paper, a unified structure is proposed with a parametric optimization problem. The policy and the value function modelled by this structure share all parameters, which enables seamless switching among reinforcement learning algorithms while continuing to learn. Q-learning and policy gradient methods based on the proposed structure are detailed. An actor-critic algorithm based on this structure, whose actor and critic are both modelled by the same parameters, is validated on both linear and nonlinear control.