Researcher profile

Insoon Yang

Insoon Yang contributes to research discovery and scholarly infrastructure.


Trust snapshot

Trust 21 (Emerging), Verification L1, unclaimed author
17 works, 0 followers, 8 topics, 4 close collaborators

Published work

17 published items

preprint, 2026, arXiv

Distributionally Robust Kalman Filter

We study state estimation for discrete-time linear stochastic systems under distributional ambiguity in the initial state, process noise, and measurement noise. We propose a noise-centric distributionally robust Kalman filter (DRKF) based on Wasserstein ambiguity sets imposed directly on these distributions. This formulation excludes dynamically unreachable priors and yields a Kalman-type recursion driven by least-favorable covariances computed via semidefinite programs (SDP). In the time-invariant case, the steady-state DRKF is obtained from a single stationary SDP, producing a constant gain with Kalman-level online complexity. We establish the convergence of the DR Riccati covariance iteration to the stationary SDP solution, together with an explicit sufficient condition for a prescribed convergence rate. We further show that the proposed noise-centric model induces a priori spectral bounds on all feasible covariances and a Kalman filter sandwiching property for the DRKF covariances. Finally, we prove that the steady-state error dynamics are Schur stable, and the steady-state DRKF is asymptotically minimax optimal with respect to worst-case mean-square error.
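As a point of reference for the "constant gain with Kalman-level online complexity" mentioned above, the following minimal sketch iterates the nominal (non-robust) Riccati covariance recursion for a scalar system until it reaches its steady state. It is not the DRKF: the abstract's filter would replace the nominal noise variances with least-favorable covariances obtained from an SDP, which is not reproduced here. The function name and the parameters a, c, q, r are illustrative assumptions.

```python
def steady_state_kalman_gain(a, c, q, r, p0=1.0, tol=1e-12, max_iter=10_000):
    """Iterate the scalar Riccati recursion for x+ = a*x + w, y = c*x + v.

    q and r are the nominal process and measurement noise variances
    (assumed known here; a DRKF would use least-favorable values instead).
    Returns the steady-state gain and posterior error variance.
    """
    p = p0
    for _ in range(max_iter):
        p_pred = a * a * p + q                  # time update of the error variance
        k = p_pred * c / (c * c * p_pred + r)   # Kalman gain
        p_new = (1.0 - k * c) * p_pred          # measurement update
        if abs(p_new - p) < tol:
            return k, p_new
        p = p_new
    raise RuntimeError("Riccati iteration did not converge")


# Random-walk example with unit noise variances.
gain, variance = steady_state_kalman_gain(a=1.0, c=1.0, q=1.0, r=1.0)
print(gain, variance)
```

Once the gain is fixed, each online step costs one prediction and one correction, which is the sense in which a steady-state filter is as cheap to run as a standard Kalman filter.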

preprint, 2023, arXiv

Unifying Nesterov's Accelerated Gradient Methods for Convex and Strongly Convex Objective Functions: From Continuous-Time Dynamics to Discrete-Time Algorithms

Although Nesterov's accelerated gradient (NAG) methods have been studied from various perspectives, it remains unclear why the most popular forms of NAG must handle convex and strongly convex objective functions separately. Motivated by this inconsistency, we propose an NAG method that unifies the existing ones for the convex and strongly convex cases. We first design a Lagrangian function that continuously extends the first Bregman Lagrangian to the strongly convex setting. As a specific case of the Euler--Lagrange equation for this Lagrangian, we derive an ordinary differential equation (ODE) model, which we call the unified NAG ODE, that bridges the gap between the ODEs that model NAG for convex and strongly convex objective functions. We then design the unified NAG, a novel momentum method whose continuous-time limit corresponds to the unified ODE. The coefficients and the convergence rates of the unified NAG and unified ODE are continuous in the strong convexity parameter $\mu$ on $[0, +\infty)$. Unlike the existing popular algorithm and ODE for strongly convex objective functions, the unified NAG and the unified NAG ODE always have superior convergence guarantees compared to the known algorithms and ODEs for non-strongly convex objective functions. This property is beneficial from a practical perspective when considering strongly convex objective functions with small $\mu$. Furthermore, we extend our unified dynamics and algorithms to the higher-order setting. Last but not least, we propose the unified NAG-G ODE, a novel ODE model for minimizing the gradient norm of strongly convex objective functions. Our unified Lagrangian framework is crucial in the process of constructing this ODE. Fascinatingly, using our novel tool, called the differential kernel, we observe that the unified NAG ODE and the unified NAG-G ODE have an anti-transpose relationship.
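To illustrate the separation that motivates this work, the sketch below implements the two classical NAG forms side by side: the convex variant with the iteration-dependent momentum (k - 1)/(k + 2) and the strongly convex variant with the constant momentum (sqrt(L) - sqrt(mu))/(sqrt(L) + sqrt(mu)). It does not implement the unified NAG of the paper, whose coefficients are not given in the abstract. The quadratic test function and all names are illustrative assumptions.

```python
import math


def f(x, L, mu):
    """Toy objective 0.5 * (L * x0^2 + mu * x1^2): L-smooth and mu-strongly convex."""
    return 0.5 * (L * x[0] ** 2 + mu * x[1] ** 2)


def grad(x, L, mu):
    return [L * x[0], mu * x[1]]


def nag(x0, L, mu, iters, strongly_convex):
    """Classical NAG with step size 1/L; only the momentum rule differs."""
    x_prev = list(x0)
    y = list(x0)
    step = 1.0 / L
    beta_sc = (math.sqrt(L) - math.sqrt(mu)) / (math.sqrt(L) + math.sqrt(mu))
    for k in range(1, iters + 1):
        g = grad(y, L, mu)
        x = [yi - step * gi for yi, gi in zip(y, g)]       # gradient step from y
        beta = beta_sc if strongly_convex else (k - 1) / (k + 2)
        y = [xi + beta * (xi - xp) for xi, xp in zip(x, x_prev)]  # momentum step
        x_prev = x
    return x_prev


L_const, mu_const = 100.0, 1.0
x_convex = nag([1.0, 1.0], L_const, mu_const, 200, strongly_convex=False)
x_strong = nag([1.0, 1.0], L_const, mu_const, 200, strongly_convex=True)
print(f(x_convex, L_const, mu_const), f(x_strong, L_const, mu_const))
```

The convex rule never uses mu, while the strongly convex rule degenerates as mu approaches 0, which is the gap the abstract says a unified method should close.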

preprint, 2022, arXiv

Infusing model predictive control into meta-reinforcement learning for mobile robots in dynamic environments

The successful operation of mobile robots requires them to adapt rapidly to environmental changes. To develop an adaptive decision-making tool for mobile robots, we propose a novel algorithm that combines meta-reinforcement learning (meta-RL) with model predictive control (MPC). Our method employs an off-policy meta-RL algorithm as a baseline to train a policy using transition samples generated by MPC when the robot detects certain events that can be effectively handled by MPC, with its explicit use of robot dynamics. The key idea of our method is to switch between the meta-learned policy and the MPC controller in a randomized and event-triggered fashion to make up for suboptimal MPC actions caused by the limited prediction horizon. During meta-testing, the MPC module is deactivated to significantly reduce computation time in motion control. We further propose an online adaptation scheme that enables the robot to infer and adapt to a new task within a single trajectory. The performance of our method has been demonstrated through simulations using a nonlinear car-like vehicle model with (i) synthetic movements of obstacles, and (ii) real-world pedestrian motion data. The simulation results indicate that our method outperforms other algorithms in terms of learning efficiency and navigation quality.

preprint, 2022, arXiv

On Affine Policies for Wasserstein Distributionally Robust Unit Commitment

This paper proposes a unit commitment (UC) model based on data-driven Wasserstein distributionally robust optimization (WDRO) for power systems under uncertainty of renewable generation as well as its tractable exact reformulation. The proposed model is formulated as a WDRO problem relying on an affine policy, which nests an infinite-dimensional worst-case expectation problem and satisfies the non-anticipativity constraint. To reduce conservativeness, we develop a novel technique that defines a subset of the uncertainty set with a probabilistic guarantee. Subsequently, the proposed model is recast as a semi-infinite programming problem that can be efficiently solved using existing algorithms. Notably, the scale of this reformulation is invariant with the sample size. As a result, a number of samples are easily incorporated without using sophisticated decomposition algorithms. Numerical simulations on 6- and 24-bus test systems demonstrate the economic and computational efficiency of the proposed model.

preprint, 2022, arXiv

Risk-sensitive safety analysis using Conditional Value-at-Risk

This paper develops a safety analysis method for stochastic systems that is sensitive to the possibility and severity of rare harmful outcomes. We define risk-sensitive safe sets as sub-level sets of the solution to a non-standard optimal control problem, where a random maximum cost is assessed via Conditional Value-at-Risk (CVaR). The objective function represents the maximum extent of constraint violation of the state trajectory, averaged over a given percentage of worst cases. This problem is well-motivated but difficult to solve tractably because the temporal decomposition for CVaR is history-dependent. Our primary theoretical contribution is to derive computationally tractable under-approximations to risk-sensitive safe sets. Our method provides a novel, theoretically guaranteed, parameter-dependent upper bound to the CVaR of a maximum cost without the need to augment the state space. For a fixed parameter value, the solution to only one Markov decision process problem is required to obtain the under-approximations for any family of risk-sensitivity levels. In addition, we propose a second definition for risk-sensitive safe sets and provide a tractable method for their estimation without using a parameter-dependent upper bound. The second definition is expressed in terms of a new coherent risk functional, which is inspired by CVaR. We demonstrate our primary theoretical contribution via numerical examples.
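The objective described above, the maximum constraint violation along a trajectory averaged over a given fraction of worst cases, can be made concrete with a simple sample-based estimate. The sketch below is only an illustration of that definition, not the paper's under-approximation method; the function name, the sample trajectories, and the rounding of the tail size are assumptions.

```python
import math


def empirical_cvar(costs, alpha):
    """Average of the worst alpha-fraction of sampled costs, 0 < alpha <= 1.

    A simple estimator: the tail size is rounded up to a whole number of
    samples, so it is only approximate when alpha * len(costs) is fractional.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(costs, reverse=True)            # largest (worst) costs first
    n_tail = max(1, math.ceil(alpha * len(ordered)))
    tail = ordered[:n_tail]
    return sum(tail) / len(tail)


# Hypothetical per-stage constraint violations for four sampled trajectories.
trajectories = [[0.1, 0.5, 0.2], [1.0, 0.3], [0.0, 0.0], [2.0, 1.5]]
max_costs = [max(traj) for traj in trajectories]     # random maximum cost
print(empirical_cvar(max_costs, alpha=0.5))          # mean of the worst half
print(empirical_cvar(max_costs, alpha=1.0))          # plain expectation
```

Setting alpha to 1 recovers the risk-neutral mean, and shrinking alpha moves the estimate toward the single worst sample.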

preprint, 2022, arXiv

Using affine policies to reformulate two-stage Wasserstein distributionally robust linear programs to be independent of sample size

Intensively studied in theory as a promising data-driven tool for decision-making under ambiguity, two-stage distributionally robust optimization (DRO) problems over Wasserstein balls are not necessarily easy to solve in practice. This is partly due to large sample size. In this article, we study a generic two-stage distributionally robust linear program (2-DRLP) over a 1-Wasserstein ball using an affine policy. The 2-DRLP has right-hand-side uncertainty with a rectangular support. Our main contribution is to show that the 2-DRLP problem has a tractable reformulation with a scale independent of sample size. The reformulated problem can be solved within a pre-defined optimality tolerance using robust optimization techniques. To reduce the inevitable conservativeness of the affine policy while preserving independence of sample size, we further develop a method for constructing an uncertainty set with a probabilistic guarantee over which the Wasserstein ball is re-defined. As an application, we present a novel unit commitment model for power systems under uncertainty of renewable energy generation to examine the effectiveness of the proposed 2-DRLP technique. Extensive numerical experiments demonstrate that our model leads to better out-of-sample performance on average than other state-of-the-art distributionally robust unit commitment models while staying computationally competent.

preprint, 2022, arXiv

Wasserstein Distributionally Robust Control of Partially Observable Linear Systems: Tractable Approximation and Performance Guarantee

Wasserstein distributionally robust control (WDRC) is an effective method for addressing inaccurate distribution information about disturbances in stochastic systems. It provides various salient features, such as an out-of-sample performance guarantee, while most of the existing methods use full-state observations. In this paper, we develop a computationally tractable WDRC method for discrete-time partially observable linear-quadratic (LQ) control problems. The key idea is to reformulate the WDRC problem as a novel minimax control problem with an approximate Wasserstein penalty. We derive a closed-form expression of the optimal control policy of the approximate problem using a nontrivial Riccati equation. We further show the guaranteed cost property of the resulting controller and identify a provable bound for the optimality gap. Finally, we evaluate the performance of our method through numerical experiments using both Gaussian and non-Gaussian disturbances.

preprint, 2021, arXiv

Distributional robustness in minimax linear quadratic control with Wasserstein distance

To address the issue of inaccurate distributions in practical stochastic systems, a minimax linear-quadratic control method is proposed using the Wasserstein metric. Our method aims to construct a control policy that is robust against errors in an empirical distribution of underlying uncertainty, by adopting an adversary that selects the worst-case distribution. The opponent receives a Wasserstein penalty proportional to the amount of deviation from the empirical distribution. A closed-form expression of the finite-horizon optimal policy pair is derived using a Riccati equation. The result is then extended to the infinite-horizon average cost setting by identifying conditions under which the Riccati recursion converges to the unique positive semi-definite solution to an algebraic Riccati equation. Our method is shown to possess several salient features including closed-loop stability, and an out-of-sample performance guarantee. We also discuss how to optimize the penalty parameter for enhancing the distributional robustness of our control policy. Last but not least, a theoretical connection to the classical $H_\infty$-method is identified from the perspective of distributional robustness.

preprint, 2020, arXiv

Hamilton-Jacobi-Bellman Equations for Q-Learning in Continuous Time

In this paper, we introduce Hamilton-Jacobi-Bellman (HJB) equations for Q-functions in continuous time optimal control problems with Lipschitz continuous controls. The standard Q-function used in reinforcement learning is shown to be the unique viscosity solution of the HJB equation. A necessary and sufficient condition for optimality is provided using the viscosity solution framework. By using the HJB equation, we develop a Q-learning method for continuous-time dynamical systems. A DQN-like algorithm is also proposed for high-dimensional state and control spaces. The performance of the proposed Q-learning algorithm is demonstrated using 1-, 10- and 20-dimensional dynamical systems.

preprint, 2020, arXiv

Learning-based distributionally robust motion control with Gaussian processes

Safety is a critical issue in learning-based robotic and autonomous systems as learned information about their environments is often unreliable and inaccurate. In this paper, we propose a risk-aware motion control tool that is robust against errors in learned distributional information about obstacles moving with unknown dynamics. The salient feature of our model predictive control (MPC) method is its capability of limiting the risk of unsafety even when the true distribution deviates from the distribution estimated by Gaussian process (GP) regression, within an ambiguity set. Unfortunately, the distributionally robust MPC problem with GP is intractable because the worst-case risk constraint involves an infinite-dimensional optimization problem over the ambiguity set. To remove the infinite-dimensionality issue, we develop a systematic reformulation approach exploiting modern distributionally robust optimization techniques. The performance and utility of our method are demonstrated through simulations using a nonlinear car-like vehicle model for autonomous driving.

preprint, 2020, arXiv

Minimax control of ambiguous linear stochastic systems using the Wasserstein metric

In this paper, we propose a minimax linear-quadratic control method to address the issue of inaccurate distribution information in practical stochastic systems. To construct a control policy that is robust against errors in an empirical distribution of uncertainty, our method is to adopt an adversary, which selects the worst-case distribution. To systematically adjust the conservativeness of our method, the opponent receives a penalty proportional to the amount, measured with the Wasserstein metric, of deviation from the empirical distribution. In the finite-horizon case, using a Riccati equation, we derive a closed-form expression of the unique optimal policy and the opponent's policy that generates the worst-case distribution. This result is then extended to the infinite-horizon setting by identifying conditions under which the Riccati recursion converges to the unique positive semi-definite solution to an associated algebraic Riccati equation (ARE). The resulting optimal policy is shown to stabilize the expected value of the system state under the worst-case distribution. We also discuss that our method can be interpreted as a distributional generalization of the $H_\infty$-method.

preprint, 2020, arXiv

Multi-Objective Predictive Taxi Dispatch via Network Flow Optimization

In this paper, we discuss a large-scale fleet management problem in a multi-objective setting. We aim to seek a receding horizon taxi dispatch solution that serves as many ride requests as possible while minimizing the cost of relocating vehicles. To obtain the desired solution, we first convert the multi-objective taxi dispatch problem into a network flow problem, which can be solved using the classical minimum cost maximum flow (MCMF) algorithm. We show that a solution obtained using the MCMF algorithm is integer-valued; thus, it does not require any additional rounding procedure that may introduce undesirable numerical errors. Furthermore, we prove the time-greedy property of the proposed solution, which justifies the use of receding horizon optimization. For computational efficiency, we propose a linear programming method to obtain an optimal solution in near real time. The results of our simulation studies using real-world data for the metropolitan area of Seoul, South Korea indicate that the performance of the proposed predictive method is almost as good as that of the oracle that foresees the future.
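The reduction to a network flow problem can be sketched on a toy single-period instance: a source feeds each taxi, each taxi can serve any request at a relocation cost, and each request drains to a sink. The code below solves it with a textbook successive-shortest-path min-cost max-flow routine. The graph layout, the costs, and the function name are illustrative assumptions; the paper's receding-horizon construction is not given in the abstract and is not reproduced here.

```python
def min_cost_max_flow(n, edges, source, sink):
    """Successive shortest augmenting paths with Bellman-Ford.

    edges: list of (u, v, capacity, cost) with integer capacities and costs.
    Returns (max_flow, min_cost).
    """
    graph = [[] for _ in range(n)]        # node -> indices of outgoing edges
    to, cap, cost = [], [], []
    for u, v, c, w in edges:
        graph[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)
    total_flow, total_cost = 0, 0
    inf = float("inf")
    while True:
        dist = [inf] * n
        prev_edge = [-1] * n
        dist[source] = 0
        for _ in range(n - 1):            # Bellman-Ford handles negative residual costs
            updated = False
            for u in range(n):
                if dist[u] == inf:
                    continue
                for e in graph[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[to[e]]:
                        dist[to[e]] = dist[u] + cost[e]
                        prev_edge[to[e]] = e
                        updated = True
            if not updated:
                break
        if dist[sink] == inf:             # no augmenting path left
            return total_flow, total_cost
        push, v = inf, sink
        while v != source:                # bottleneck capacity along the path
            e = prev_edge[v]
            push = min(push, cap[e])
            v = to[e ^ 1]                 # e ^ 1 is the paired reverse edge
        v = sink
        while v != source:                # augment along the path
            e = prev_edge[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = to[e ^ 1]
        total_flow += push
        total_cost += push * dist[sink]


# Nodes: 0 = source, 1-2 = taxis, 3-4 = ride requests, 5 = sink.
toy_edges = [
    (0, 1, 1, 0), (0, 2, 1, 0),           # one unit of supply per taxi
    (1, 3, 1, 1), (1, 4, 1, 4),           # relocation costs for taxi 1
    (2, 3, 1, 2), (2, 4, 1, 8),           # relocation costs for taxi 2
    (3, 5, 1, 0), (4, 5, 1, 0),           # each request is served at most once
]
print(min_cost_max_flow(6, toy_edges, source=0, sink=5))
```

Because all capacities are integers and each augmentation pushes an integer amount, the resulting flow is integer-valued, which mirrors the no-rounding property stated in the abstract.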

preprint, 2020, arXiv

Risk-sensitive safety specifications for stochastic systems using Conditional Value-at-Risk

This paper proposes a safety analysis method that facilitates a tunable balance between the worst-case and risk-neutral perspectives. First, we define a risk-sensitive safe set to specify the degree of safety attained by a stochastic system. This set is defined as a sublevel set of the solution to an optimal control problem that is expressed using the Conditional Value-at-Risk (CVaR) measure. This problem does not satisfy Bellman's Principle; thus, our next contribution is to show how risk-sensitive safe sets can be under-approximated by the solution to a CVaR-Markov Decision Process. We adopt an existing value iteration algorithm to find an approximate solution to the reduced problem for a class of linear systems. Then, we develop a realistic numerical example of a stormwater system to show that this approach can be applied to non-linear systems. Finally, we compare the CVaR criterion to the exponential disutility criterion. The latter allocates control effort evenly across the cost distribution to reduce variance, while the CVaR criterion focuses control effort on a given worst-case quantile, where it matters most for safety.

preprint, 2020, arXiv

Safe reinforcement learning for probabilistic reachability and safety specifications: A Lyapunov-based approach

Emerging applications in robotics and autonomous systems, such as autonomous driving and robotic surgery, often involve critical safety constraints that must be satisfied even when information about system models is limited. In this regard, we propose a model-free safety specification method that learns the maximal probability of safe operation by carefully combining probabilistic reachability analysis and safe reinforcement learning (RL). Our approach constructs a Lyapunov function with respect to a safe policy to restrain each policy improvement stage. As a result, it yields a sequence of safe policies that determine the range of safe operation, called the safe set, which monotonically expands and gradually converges. We also develop an efficient safe exploration scheme that accelerates the process of identifying the safety of unexamined states. Exploiting the Lyapunov shielding, our method regulates the exploratory policy to avoid dangerous states with high confidence. To handle high-dimensional systems, we further extend our approach to deep RL by introducing a Lagrangian relaxation technique to establish a tractable actor-critic algorithm. The empirical performance of our method is demonstrated through continuous control benchmark problems, such as a reaching task on a planar robot arm.

preprint, 2020, arXiv

STAR: Spatio-Temporal Prediction of Air Quality Using A Multimodal Approach

With the increase of global economic activities and high energy demand, many countries have raised concerns about air pollution. However, air quality prediction is a challenging issue due to the complex interaction of many factors. In this paper, we propose a multimodal approach for spatio-temporal air quality prediction. Our model learns the multimodal fusion of critical factors to predict future air quality levels. Based on the analyses of data, we also assessed the impacts of critical factors on air quality prediction. We conducted experiments on two real-world air pollution datasets. For the Seoul dataset, our method achieved 11% and 8.2% improvements in the mean absolute error in long-term predictions of PM2.5 and PM10, respectively, compared to baselines. Our method also reduced the mean absolute error of PM2.5 predictions by 20% compared to the previous state-of-the-art results on the China 1-year dataset.

preprint, 2020, arXiv

Wasserstein Distributionally Robust Motion Control for Collision Avoidance Using Conditional Value-at-Risk

In this paper, a risk-aware motion control scheme is considered for mobile robots to avoid randomly moving obstacles when the true probability distribution of uncertainty is unknown. We propose a novel model predictive control (MPC) method for limiting the risk of unsafety even when the true distribution of the obstacles' movements deviates, within an ambiguity set, from the empirical distribution obtained using a limited amount of sample data. By choosing the ambiguity set as a statistical ball with its radius measured by the Wasserstein metric, we achieve a probabilistic guarantee of the out-of-sample risk, evaluated using new sample data generated independently of the training data. To resolve the infinite-dimensionality issue inherent in the distributionally robust MPC problem, we reformulate it as a finite-dimensional nonlinear program using modern distributionally robust optimization techniques based on the Kantorovich duality principle. To find a globally optimal solution in the case of affine dynamics and output equations, a spatial branch-and-bound algorithm is designed using McCormick relaxation. The performance of the proposed method is demonstrated and analyzed through simulation studies using a nonlinear car-like vehicle model and a linearized quadrotor model.
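The ambiguity set described above is a ball of distributions whose radius is measured with the Wasserstein metric. As a rough one-dimensional illustration, the sketch below computes the 1-Wasserstein distance between two empirical distributions with the same number of samples, which equals the mean absolute difference of the sorted samples, and checks whether a candidate distribution lies within a given radius of the empirical one. The function names, sample values, and radius are illustrative assumptions; the paper's multi-dimensional setting and its duality-based reformulation are not reproduced here.

```python
def wasserstein1_1d(samples_p, samples_q):
    """1-Wasserstein distance between two equal-size 1-D empirical distributions.

    In one dimension the optimal transport plan matches sorted samples,
    so the distance is the mean absolute difference of the order statistics.
    """
    if len(samples_p) != len(samples_q) or not samples_p:
        raise ValueError("both sample sets must be non-empty and of equal size")
    pairs = zip(sorted(samples_p), sorted(samples_q))
    return sum(abs(a - b) for a, b in pairs) / len(samples_p)


def in_wasserstein_ball(candidate, empirical, radius):
    """True if the candidate empirical distribution lies in the ball around `empirical`."""
    return wasserstein1_1d(candidate, empirical) <= radius


# Hypothetical obstacle displacements: observed samples and a shifted scenario.
observed = [0.0, 1.0, 2.0]
shifted = [3.0, 1.0, 2.0]
print(wasserstein1_1d(observed, shifted))            # every unit of mass moves by 1
print(in_wasserstein_ball(shifted, observed, 0.5))   # outside a ball of radius 0.5
```

A larger radius admits more distributions into the ball and therefore yields a more conservative controller, which is the tuning knob the abstract refers to.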

preprint, 2017, arXiv

Optimal Control of Conditional Value-at-Risk in Continuous Time

We consider continuous-time stochastic optimal control problems featuring Conditional Value-at-Risk (CVaR) in the objective. The major difficulty in these problems arises from time-inconsistency, which prevents us from directly using dynamic programming. To resolve this challenge, we convert to an equivalent bilevel optimization problem in which the inner optimization problem is standard stochastic control. Furthermore, we provide conditions under which the outer objective function is convex and differentiable. We compute the outer objective's value via a Hamilton-Jacobi-Bellman equation and its gradient via the viscosity solution of a linear parabolic equation, which allows us to perform gradient descent. The significance of this result is that we provide an efficient dynamic programming-based algorithm for optimal control of CVaR without lifting the state-space. To broaden the applicability of the proposed algorithm, we propose convergent approximation schemes in cases where our key assumptions do not hold and characterize relevant suboptimality bounds. In addition, we extend our method to a more general class of risk metrics, which includes mean-variance and median-deviation. We also demonstrate a concrete application to portfolio optimization under CVaR constraints. Our results contribute an efficient framework for solving time-inconsistent CVaR-based sequential optimization.