Source author record

Chuangchuang Sun

Chuangchuang Sun appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

eess.SY, Systems and Control, Machine Learning, math.OC, Multiagent Systems, Robotics

Catalog footprint

What is connected

8 works

6 topics

4 close collaborators

preprint, 2024, arXiv

Towards an Adaptable and Generalizable Optimization Engine in Decision and Control: A Meta Reinforcement Learning Approach

Sampling-based model predictive control (MPC) has found significant success in optimal control problems with non-smooth system dynamics and cost functions. Many machine learning-based works have proposed to improve MPC by a) learning or fine-tuning the dynamics or cost function, or b) learning to optimize the update of the MPC controllers. For the latter, imitation learning-based optimizers are trained to update the MPC controller by mimicking expert demonstrations, which, however, are expensive or even unavailable. More significantly, many sequential decision-making problems arise in non-stationary environments, requiring an optimizer that is adaptable and generalizable so that it can update the MPC controller for different tasks. To address those issues, we propose to learn an optimizer based on meta-reinforcement learning (RL) to update the controllers. This optimizer does not need expert demonstrations and enables fast (e.g., few-shot) adaptation when it is deployed in unseen control tasks. Experimental results validate the effectiveness of the learned optimizer regarding fast adaptation.

preprint, 2022, arXiv

An efficient approach for nonconvex semidefinite optimization via customized alternating direction method of multipliers

We investigate a class of general combinatorial graph problems, including MAX-CUT and community detection, reformulated as quadratic objectives over nonconvex constraints and solved via the alternating direction method of multipliers (ADMM). We propose two reformulations: one using vector variables and a binary constraint, and the other further reformulating the Burer-Monteiro form for simpler subproblems. Despite the nonconvex constraint, we prove that the ADMM iterates converge to a stationary point in both formulations, under mild assumptions. Additionally, recent work suggests that in this latter form, when the matrix factors are wide enough, a local optimum is, with high probability, also the global optimum. To demonstrate the scalability of our algorithm, we include results for MAX-CUT, community detection, and image segmentation on benchmark and simulated examples.

preprint, 2022, arXiv

Reachability Analysis of Neural Feedback Loops

Neural Networks (NNs) can provide major empirical performance improvements for closed-loop systems, but they also introduce challenges in formally analyzing those systems' safety properties. In particular, this work focuses on estimating the forward reachable set of neural feedback loops (closed-loop systems with NN controllers). Recent work provides bounds on these reachable sets, but the computationally tractable approaches yield overly conservative bounds (and thus cannot be used to verify useful properties), and the methods that yield tighter bounds are too intensive for online computation. This work bridges the gap by formulating a convex optimization problem for the reachability analysis of closed-loop systems with NN controllers. While the solutions are less tight than previous (semidefinite program-based) methods, they are substantially faster to compute, and some of those computational time savings can be used to refine the bounds through new input set partitioning techniques, which are shown to dramatically reduce the tightness gap. The new framework is developed for systems with uncertainty (e.g., measurement and process noise) and nonlinearities (e.g., polynomial dynamics), and thus is shown to be applicable to real-world systems. To inform the design of an initial state set when only the target state set is known or specified, a novel algorithm for backward reachability analysis is also provided, which computes the set of states that are guaranteed to lead to the target set. The numerical experiments show that our approach (based on linear relaxations and partitioning) gives a 5× reduction in conservatism in 150× less computation time compared to the state-of-the-art. Furthermore, experiments on quadrotor, 270-state, and polynomial systems demonstrate the method's ability to handle uncertainty sources, high dimensionality, and nonlinear dynamics, respectively.
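
The effect of input set partitioning on conservatism can be illustrated with a much cruder relaxation than the one in the abstract. The sketch below uses plain interval bound propagation (not the paper's linear relaxation or convex program) on a single ReLU network rather than a closed loop; the function names and the toy network, which computes |x|, are hypothetical and chosen only for illustration.

```python
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate the box [lo, hi] through x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def relu_net_bounds(lo, hi, layers):
    """Interval bounds on the output of a ReLU network (last layer is linear)."""
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_affine(lo, hi, W, b)
        if i < len(layers) - 1:
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

def partitioned_bounds(lo, hi, layers, splits):
    """Split the first input dimension uniformly and take the union of the bounds."""
    edges = np.linspace(lo[0], hi[0], splits + 1)
    los, his = [], []
    for a, c in zip(edges[:-1], edges[1:]):
        cell_lo, cell_hi = lo.copy(), hi.copy()
        cell_lo[0], cell_hi[0] = a, c
        out_lo, out_hi = relu_net_bounds(cell_lo, cell_hi, layers)
        los.append(out_lo)
        his.append(out_hi)
    return np.min(los, axis=0), np.max(his, axis=0)

# Toy network: |x| = relu(x) + relu(-x); the true output range on [-1, 1] is [0, 1].
abs_net = [
    (np.array([[1.0], [-1.0]]), np.zeros(2)),
    (np.array([[1.0, 1.0]]), np.zeros(1)),
]
x_lo, x_hi = np.array([-1.0]), np.array([1.0])
coarse = relu_net_bounds(x_lo, x_hi, abs_net)
refined = partitioned_bounds(x_lo, x_hi, abs_net, splits=2)
```

On this toy case the unpartitioned upper bound is 2, while splitting the input interval at 0 recovers the exact upper bound of 1. The price is one bound computation per cell, which is why a cheap relaxation pairs naturally with partitioning.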

preprint, 2021, arXiv

Temporal-Logic-Based Intermittent, Optimal, and Safe Continuous-Time Learning for Trajectory Tracking

In this paper, we develop safe reinforcement-learning-based controllers for systems tasked with accomplishing complex missions that can be expressed as linear temporal logic specifications, similar to those required by search-and-rescue missions. We decompose the original mission into a sequence of tracking sub-problems under safety constraints. We impose the safety conditions by utilizing barrier functions to map the constrained optimal tracking problem in the physical space to an unconstrained one in the transformed space. Furthermore, we develop policies that intermittently update the control signal to solve the tracking sub-problems with a reduced burden on communication and computation resources. Subsequently, an actor-critic algorithm is utilized to solve the underlying Hamilton-Jacobi-Bellman equations. Finally, we support our proposed framework with stability proofs and showcase its efficacy via simulation results.

preprint, 2020, arXiv

Optimal Composition of Heterogeneous Multi-Agent Teams for Coverage Problems with Performance Bound Guarantees

We consider the problem of determining the optimal composition of a heterogeneous multi-agent team for coverage problems by including costs associated with different agents and subject to an upper bound on the maximal allowable number of agents. We formulate a resource allocation problem without introducing additional non-convexities to the original problem. We develop a distributed Projected Gradient Ascent (PGA) algorithm to solve the optimal team composition problem. To deal with non-convexity, we initialize the algorithm using a greedy method and exploit the submodularity and curvature properties of the coverage objective function to derive novel tighter performance bound guarantees on the optimization problem solution. Numerical examples are included to validate the effectiveness of this approach in diverse mission space configurations and different heterogeneous multi-agent collections. Comparative results obtained using a commercial mixed-integer nonlinear programming problem solver demonstrate both the accuracy and computational efficiency of the distributed PGA algorithm.
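
As a rough illustration of the greedy initialization mentioned above, the sketch below runs the classical greedy rule on a toy discrete coverage problem with a limit on the number of agents. It is a simplified stand-in, not the paper's formulation: the agent names, the cell sets, and the `greedy_cover` helper are hypothetical, and agent costs and the subsequent distributed PGA refinement are omitted.

```python
def greedy_cover(candidates, budget):
    """Pick up to `budget` agents, each time adding the largest marginal coverage.

    candidates: dict mapping an agent name to the set of cells it covers.
    Returns the chosen agents (in selection order) and the covered cells.
    """
    chosen, covered = [], set()
    for _ in range(budget):
        remaining = [name for name in candidates if name not in chosen]
        if not remaining:
            break
        # Marginal gain = number of cells this agent would newly cover.
        best = max(remaining, key=lambda name: len(candidates[name] - covered))
        if not candidates[best] - covered:
            break  # no remaining agent adds coverage
        chosen.append(best)
        covered |= candidates[best]
    return chosen, covered

# Hypothetical toy instance: three agent types covering numbered cells.
agents = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}
team, cells = greedy_cover(agents, budget=2)
```

Here the greedy rule picks "c" first (four new cells) and then "a" (three new cells). For monotone submodular objectives under a cardinality limit, this rule is known to reach at least a (1 - 1/e) fraction of the optimum; the abstract states that curvature information is used to tighten such bounds.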

preprint, 2020, arXiv

Scaling Up Multiagent Reinforcement Learning for Robotic Systems: Learn an Adaptive Sparse Communication Graph

The complexity of multiagent reinforcement learning (MARL) in multiagent systems increases exponentially with the number of agents. This scalability issue prevents MARL from being applied in large-scale multiagent systems. However, one critical feature of MARL that is often neglected is that the interactions between agents are quite sparse. Without exploiting this sparsity structure, existing works aggregate information from all of the agents and thus have a high sample complexity. To address this issue, we propose an adaptive sparse attention mechanism by generalizing a sparsity-inducing activation function. Then a sparse communication graph in MARL is learned by graph neural networks based on this new attention mechanism. Through this sparsity structure, the agents can communicate effectively and efficiently by selectively attending only to the agents that matter most, and thus the scale of the MARL problem is reduced with little loss of optimality. Comparative results show that our algorithm can learn an interpretable sparse structure and outperforms previous works by a significant margin on applications involving a large-scale multiagent system.
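
To make "sparsity-inducing activation function" concrete, here is a minimal sketch of sparsemax, one well-known activation of this kind. The abstract does not say which function is generalized, so treating sparsemax as the starting point is an assumption, and the adaptive generalization itself is not reproduced here.

```python
import numpy as np

def sparsemax(z):
    """Euclidean projection of the score vector z onto the probability simplex.

    Unlike softmax, the result can contain exact zeros, so low-scoring
    agents receive no attention weight at all.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1.0 + k * z_sorted > cumsum    # which sorted entries stay nonzero
    k_max = k[support][-1]                   # size of the support
    tau = (cumsum[k_max - 1] - 1.0) / k_max  # threshold subtracted from all scores
    return np.maximum(z - tau, 0.0)

# Attention scores of one agent toward three others (hypothetical values).
weights = sparsemax([2.0, 1.0, 0.1])
```

With these scores the output is [1, 0, 0]: the agent attends only to its highest-scoring neighbor, whereas softmax would give every neighbor a strictly positive weight.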

preprint, 2019, arXiv

Joint Estimation of OD Demands and Cost Functions in Transportation Networks from Data

Existing work has tackled the problem of estimating Origin-Destination (OD) demands and recovering travel latency functions in transportation networks under the Wardropian assumption. The ultimate objective is to derive an accurate predictive model of the network to enable optimization and control. However, these two problems are typically treated separately and estimation is based on parametric models. In this paper, we propose a method to jointly recover nonparametric travel latency cost functions and estimate OD demands using traffic flow data. We formulate the problem as a bilevel optimization problem and develop an iterative first-order optimization algorithm to solve it. A numerical example using the Braess Network is presented to demonstrate the effectiveness of our method.

preprint, 2016, arXiv

An Iterative Method for Nonconvex Quadratically Constrained Quadratic Programs

This paper examines nonconvex quadratically constrained quadratic programming (QCQP) problems using an iterative method. One of the existing approaches for solving nonconvex QCQP problems relaxes the rank-one constraint on the unknown matrix into a semidefinite constraint to obtain a bound on the optimal value without finding the exact solution. By reconsidering the rank-one matrix, an iterative rank minimization (IRM) method is proposed to gradually approach the rank-one constraint. Each iteration of IRM is formulated as a convex problem with semidefinite constraints. An augmented Lagrangian method, named the extended Uzawa algorithm, is developed to solve the subproblem at each iteration of IRM for improved scalability and computational efficiency. Simulation examples are presented using the proposed method, and comparative results obtained from other methods are provided and discussed.
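
The rank-one constraint mentioned above comes from the standard lifting of a QCQP. The sketch below uses a homogeneous form for brevity; the symbols Q_0, Q_i, c_i, and X are notation assumed here, not taken from the paper.

```latex
% Homogeneous nonconvex QCQP in x \in \mathbb{R}^n (Q_0, Q_i symmetric, possibly indefinite)
\min_{x} \; x^\top Q_0 x
\quad \text{s.t.} \quad x^\top Q_i x \le c_i, \quad i = 1, \dots, m.

% Lifting: since x^\top Q x = \operatorname{tr}(Q\, x x^\top), set X = x x^\top
\min_{X} \; \operatorname{tr}(Q_0 X)
\quad \text{s.t.} \quad \operatorname{tr}(Q_i X) \le c_i, \quad
X \succeq 0, \quad \operatorname{rank}(X) = 1.

% Dropping \operatorname{rank}(X) = 1 leaves a convex semidefinite program
% whose optimal value is a lower bound on the QCQP optimum.
```

The lifted problem is equivalent to the original one because any nonzero X that is positive semidefinite and rank one factors as x x^T (X = 0 corresponds to x = 0). The semidefinite relaxation yields only a bound because its solution need not have rank one, which is the gap an iterative rank minimization scheme aims to close.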