Researcher profile

Keyou You

Keyou You contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
23works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

23 published item(s)

preprint2026arXiv

LLM4Branch: Large Language Model for Discovering Efficient Branching Policies of Integer Programs

Efficient branching policies are essential for accelerating Mixed Integer Linear Programming (MILP) solvers. Their design has long relied on hand-crafted heuristics, and now machine learning has emerged as a promising paradigm to automate this process. However, existing learning-based methods are often hindered by their dependence on expensive expert demonstrations and the gap between training objectives and the solver's end-to-end performance. In this work, we propose LLM4Branch, a novel framework that leverages Large Language Models (LLMs) to automate the discovery of efficient branching policies. Specifically, the discovered policy is an executable program with a program skeleton generated by the LLM and a parameter vector, which is optimized via a zeroth-order method over a few instances with their end-to-end performance feedback. Extensive experiments on standard MILP benchmarks demonstrate that LLM4Branch establishes a new state-of-the-art among CPU-based methods and achieves performance competitive with advanced GPU-based models. Codes are available at https://github.com/hzn18/LLM4Branch.

preprint2023arXiv

Minimax Q-learning Control for Linear Systems Using the Wasserstein Metric

Stochastic optimal control usually requires an explicit dynamical model with probability distributions, which are difficult to obtain in practice. In this work, we consider the linear quadratic regulator (LQR) problem of unknown linear systems and adopt a Wasserstein penalty to address the distribution uncertainty of additive stochastic disturbances. By constructing an equivalent deterministic game of the penalized LQR problem, we propose a Q-learning method with convergence guarantees to learn an optimal minimax controller.

preprint2022arXiv

A Distributed Implementation of Steady-State Kalman Filter

This paper studies the distributed state estimation in sensor network, where $m$ sensors are deployed to infer the $n$-dimensional state of a linear time-invariant (LTI) Gaussian system. By a lossless decomposition of optimal steady-state Kalman filter, we show that the problem of distributed estimation can be reformulated as synchronization of homogeneous linear systems. Based on such decomposition, a distributed estimator is proposed, where each sensor node runs a local filter using only its own measurement and fuses the local estimate of each node with a consensus algorithm. We show that the average of the estimate from all sensors coincides with the optimal Kalman estimate. Numerical examples are provided in the end to illustrate the performance of the proposed scheme.

preprint2022arXiv

An Exact Method for the Daily Package Shipment Problem with Outsourcing

The package shipment problem requires to optimally co-design paths for both packages and a heterogeneous fleet in a transit center network (TCN). Instances arising from the package delivery industry in China usually involve more than ten thousand origin-destination (OD) pairs and have to be solved daily within an hour. Motivated by the fact that there is no interaction among different origin centers due to their competitive relationship, we propose a novel two-layer localized package shipment on a TCN (LPS-TCN) model that exploits outsourcing for cost saving. Consequently, the original problem breaks into a set of much smaller shipment problems, each of which has hundreds of OD pairs and is subsequently modelled as a mixed integer program (MIP). Since the LPS-TCN model is proved to be Strongly NP-hard and contains tens of thousands of feasible paths, an off-the-shelf MIP solver cannot produce a reliable solution in a practically acceptable amount of time. We develop a column generation based algorithm that iteratively adds "profitable" paths and further enhance it by problem-specific cutting planes and variable bound tightening techniques. Computational experiments on realistic instances from a major Chinese package express company demonstrate that the LPS-TCN model can yield solutions that bring daily economic cost reduction up to 1 million CNY for the whole TCN. In addition, our proposed algorithm solves the LPS-TCN model substantially faster than CPLEX, one of the state-of-the-art commercial MIP solvers.

preprint2022arXiv

Data-driven Control of Unknown Linear Systems via Quantized Feedback

Control using quantized feedback is a fundamental approach to system synthesis with limited communication capacity. In this paper, we address the stabilization problem for unknown linear systems with logarithmically quantized feedback, via a direct data-driven control method. By leveraging a recently developed matrix S-lemma, we prove a sufficient and necessary condition for the existence of a common stabilizing controller for all possible dynamics consistent with data, in the form of a linear matrix inequality. Moreover, we formulate semi-definite programming to solve the coarsest quantization density. By establishing its connections to unstable eigenvalues of the state matrix, we further prove a necessary rank condition on the data for quantized feedback stabilization. Finally, we validate our theoretical results by numerical examples.

preprint2022arXiv

Distributed Adaptive Newton Methods with Global Superlinear Convergence

This paper considers the distributed optimization problem where each node of a peer-to-peer network minimizes a finite sum of objective functions by communicating with its neighboring nodes. In sharp contrast to the existing literature where the fastest distributed algorithms converge either with a global linear or a local superlinear rate, we propose a distributed adaptive Newton (DAN) algorithm with a global quadratic convergence rate. Our key idea lies in the design of a finite-time set-consensus method with Polyak's adaptive stepsize. Moreover, we introduce a low-rank matrix approximation (LA) technique to compress the innovation of Hessian matrix so that each node only needs to transmit message of dimension $\mathcal{O}(p)$ (where $p$ is the dimension of decision vectors) per iteration, which is essentially the same as that of first-order methods. Nevertheless, the resulting DAN-LA converges to an optimal solution with a global superlinear rate. Numerical experiments on logistic regression problems are conducted to validate their advantages over existing methods.

preprint2022arXiv

Distributed Online Optimization in Time-Varying Unbalanced Networks without Explicit Subgradients

This paper studies a distributed online constrained optimization problem over time-varying unbalanced digraphs without explicit subgradients. In sharp contrast to the existing algorithms, we design a novel consensus-based distributed online algorithm with a local randomized zeroth-order oracle and then rescale the oracle by constructing row-stochastic matrices, which aims to address the unbalancedness of time-varying digraphs. Under mild conditions, the average dynamic regret over a time horizon is shown to asymptotically converge at a sublinear rate provided that the accumulated variation grows sublinearly with a specific order. Moreover, the counterpart of the proposed algorithm when subgradients are available is also provided, along with its dynamic regret bound, which reflects that the convergence of our algorithm is essentially not affected by the zeroth-order oracle. Simulations on distributed targets tracking problem and dynamic sparse signal recovery problem in sensor networks are employed to demonstrate the effectiveness of the proposed algorithm.

preprint2022arXiv

Minimum Input Design for Direct Data-driven Property Identification of Unknown Linear Systems

In a direct data-driven approach, this paper studies the {\em property identification(ID)} problem to analyze whether an unknown linear system has a property of interest, e.g., stabilizability and structural properties. In sharp contrast to the model-based analysis, we approach it by directly using the input and state feedback data of the unknown system. Via a new concept of sufficient richness of input sectional data, we first establish the necessary and sufficient condition for the minimum input design to excite the system for property ID. Specifically, the input sectional data is sufficiently rich for property ID {\em if and only if} it spans a linear subspace that contains a property dependent minimum linear subspace, any basis of which can also be easily used to form the minimum excitation input. Interestingly, we show that many structural properties can be identified with the minimum input that is however unable to identify the explicit system model. Overall, our results rigorously quantify the advantages of the direct data-driven analysis over the model-based analysis for linear systems in terms of data efficiency.

preprint2022arXiv

Optimal $(0,1)$-Matrix Completion with Majorization Ordered Objectives (To the memory of Pravin Varaiya)

We propose and examine two optimal $(0,1)$-matrix completion problems with majorization ordered objectives. They elevate the seminal study by Gale and Ryser from feasibility to optimality in partial order programming (POP), referring to optimization with partially ordered objectives. We showcase their applications in electric vehicle charging, portfolio optimization, and secure data storage. Solving such integer POP (iPOP) problems is challenging because of the possible non-comparability among objective values and the integer requirements. Nevertheless, we prove the essential uniqueness of all optimal objective values and identify two particular ones for each of the two inherently symmetric iPOP problems. Furthermore, for every optimal objective value, we decompose the construction of an associated optimal~$(0,1)$-matrix into a series of sorting processes, respectively agreeing with the rule of thumb "peak shaving" or "valley filling." We show that the resulting algorithms have linear time complexities and verify their empirical efficiency via numerical simulations compared to the standard order-preserving method for POP.

preprint2021arXiv

Asynchronous Networked Aggregative Games

We propose a fully asynchronous networked aggregative game (Asy-NAG) where each player minimizes a cost function that depends on its local action and the aggregate of all players' actions. In sharp contrast to the existing NAGs, each player in our Asy-NAG can compute an estimate of the aggregate action at any wall-clock time by only using (possibly stale) information from nearby players of a directed network. Such an asynchronous update does not require any coordination among players. Moreover, we design a novel distributed algorithm with an aggressive mechanism for each player to adaptively adjust the optimization stepsize per update. Particularly, the slow players in terms of updating their estimates smartly increase their stepsizes to catch up with the fast ones. Then, we develop an augmented system approach to address the asynchronicity and the information delays between players, and rigorously show the convergence to a Nash equilibrium of the Asy-NAG via a perturbed coordinate algorithm which is also of independent interest. Finally, we evaluate the performance of the distributed algorithm through numerical simulations.

preprint2021arXiv

Fully Asynchronous Distributed Optimization with Linear Convergence in Directed Networks

We consider the distributed optimization problem, the goal of which is to minimize the sum of local objective functions over a directed network. Though it has been widely studied recently, most of the existing algorithms are designed for synchronized or randomly activated implementation, which may create deadlocks in practice. In sharp contrast, we propose a \emph{fully} asynchronous push-pull gradient algorithm (APPG) where each node updates without waiting for any other node by using (possibly stale) information from neighbors. Thus, it is both deadlock-free and robust to any bounded communication delay. Moreover, we construct two novel augmented networks to theoretically evaluate its performance from the worst-case point of view and show that if local functions have Lipschitz-continuous gradients and their sum satisfies the Polyak-Łojasiewicz condition (convexity is not required), each node of APPG converges to the same optimal solution at a linear rate of $\mathcal{O}(λ^k)$, where $λ\in(0,1)$ and the virtual counter $k$ increases by one no matter which node updates. This largely elucidates its linear speedup efficiency and shows its advantage over the synchronous version. Finally, the performance of APPG is numerically validated via a logistic regression problem on the \emph{Covertype} dataset.

preprint2021arXiv

Fully Asynchronous Policy Evaluation in Distributed Reinforcement Learning over Networks

This paper proposes a \emph{fully asynchronous} scheme for the policy evaluation problem of distributed reinforcement learning (DisRL) over directed peer-to-peer networks. Without waiting for any other node of the network, each node can locally update its value function at any time by using (possibly delayed) information from its neighbors. This is in sharp contrast to the gossip-based scheme where a pair of nodes concurrently update. Though the fully asynchronous setting involves a difficult multi-timescale decision problem, we design a novel stochastic average gradient (SAG) based distributed algorithm and develop a push-pull augmented graph approach to prove its exact convergence at a linear rate of $\mathcal{O}(c^k)$ where $c\in(0,1)$ and $k$ increases by one no matter on which node updates. Finally, numerical experiments validate that our method speeds up linearly with respect to the number of nodes, and is robust to straggler nodes.

preprint2021arXiv

Smart Train Operation Algorithms based on Expert Knowledge and Reinforcement Learning

During recent decades, the automatic train operation (ATO) system has been gradually adopted in many subway systems for its low-cost and intelligence. This paper proposes two smart train operation algorithms by integrating the expert knowledge with reinforcement learning algorithms. Compared with previous works, the proposed algorithms can realize the control of continuous action for the subway system and optimize multiple critical objectives without using an offline speed profile. Firstly, through learning historical data of experienced subway drivers, we extract the expert knowledge rules and build inference methods to guarantee the riding comfort, the punctuality, and the safety of the subway system. Then we develop two algorithms for optimizing the energy efficiency of train operation. One is the smart train operation (STO) algorithm based on deep deterministic policy gradient named (STOD) and the other is the smart train operation algorithm based on normalized advantage function (STON). Finally, we verify the performance of proposed algorithms via some numerical simulations with the real field data from the Yizhuang Line of the Beijing Subway and illustrate that the developed smart train operation algorithm are better than expert manual driving and existing ATO algorithms in terms of energy efficiency. Moreover, STOD and STON can adapt to different trip times and different resistance conditions.

preprint2020arXiv

Bayesian Filtering with Unknown Sensor Measurement Losses

This work studies the state estimation problem of a stochastic nonlinear system with unknown sensor measurement losses. If the estimator knows the sensor measurement losses of a linear Gaussian system, the minimum variance estimate is easily computed by the celebrated intermittent Kalman filter (IKF). However, this will no longer be the case when the measurement losses are unknown and/or the system is nonlinear or non-Gaussian. By exploiting the binary property of the measurement loss process and the IKF, we design three suboptimal filters for the state estimation, i.e., BKF-I, BKF-II and RBPF. The BKF-I is based on the MAP estimator of the measurement loss process and the BKF-II is derived by estimating the conditional loss probability. The RBPF is a particle filter based algorithm which marginalizes out the loss process to increase the efficiency of particles. All the proposed filters can be easily implemented in recursive forms. Finally, a linear system, a target tracking system and a quadrotor's path control problem are included to illustrate their effectiveness, and show the tradeoff between computational complexity and estimation accuracy of the proposed filters.

preprint2020arXiv

Cooperative Source Seeking via Networked Multi-vehicle Systems

This paper studies the cooperative source seeking problem via a networked multi-vehicle system. In contrast to existing literature, the multi-vehicle system is controlled to the source position that maximizes aggregated multiple unknown scalar fields and each sensor-enabled vehicle only samples measurements of one scalar field. Thus, a single vehicle is unable to localize the source and has to cooperate with its neighboring vehicles. By jointly exploiting the ideas of the consensus algorithm and the stochastic extremum seeking (ES), this paper proposes novel distributed stochastic ES controllers, which are gradient-free and do not need any absolute information, such that the multi-vehicle system simultaneously approaches the source position. The effectiveness of the proposed controllers is proved for quadratic scalar fields. Finally, illustrative examples are included to validate the theoretical results.

preprint2020arXiv

Coordinate-free Isoline Tracking in Unknown 2-D Scalar Fields

The isoline tracking of this work is concerned with the control design for a sensing robot to track a given isoline of an unknown 2-D scalar filed. To this end, we propose a coordinate-free controller with a simple PI-like form using only the concentration feedback for a Dubins robot, which is particularly useful in GPS-denied environments. The key idea lies in the novel design of a sliding surface based error term in the standard PI controller. Interestingly, we also prove that the tracking error can be reduced by increasing the proportion gain, and is eliminated for circular fields with a non-zero integral gain. The effectiveness of our controller is validated via simulations by using a fixed-wing UAV on the real dataset of the concentration distribution of PM 2.5 in Handan, China.

preprint2020arXiv

Decentralized Stochastic Gradient Tracking for Non-convex Empirical Risk Minimization

This paper studies a decentralized stochastic gradient tracking (DSGT) algorithm for non-convex empirical risk minimization problems over a peer-to-peer network of nodes, which is in sharp contrast to the existing DSGT only for convex problems. To ensure exact convergence and handle the variance among decentralized datasets, each node performs a stochastic gradient (SG) tracking step by using a mini-batch of samples, where the batch size is designed to be proportional to the size of the local dataset. We explicitly evaluate the convergence rate of DSGT with respect to the number of iterations in terms of algebraic connectivity of the network, mini-batch size, gradient variance, etc. Under certain conditions, we further show that DSGT has a network independence property in the sense that the network topology only affects the convergence rate up to a constant factor. Hence, the convergence rate of DSGT can be comparable to the centralized SGD method. Moreover, a linear speedup of DSGT with respect to the number of nodes is achievable for some scenarios. Numerical experiments for neural networks and logistic regression problems on CIFAR-10 finally illustrate the advantages of DSGT.

preprint2020arXiv

Distributed Dual Gradient Tracking for Resource Allocation in Unbalanced Networks

This paper proposes a distributed dual gradient tracking algorithm (DDGT) to solve resource allocation problems over an unbalanced network, where each node in the network holds a private cost function and computes the optimal resource by interacting only with its neighboring nodes. Our key idea is the novel use of the distributed push-pull gradient algorithm (PPG) to solve the dual problem of the resource allocation problem. To study the convergence of the DDGT, we first establish the sublinear convergence rate of PPG for non-convex objective functions, which advances the existing results on PPG as they require the strong-convexity of objective functions. Then we show that the DDGT converges linearly for strongly convex and Lipschitz smooth cost functions, and sublinearly without the Lipschitz smoothness. Finally, experimental results suggest that DDGT outperforms existing algorithms.

preprint2020arXiv

Range-based Coordinate Alignment for Cooperative Mobile Sensor Network Localization

This paper studies a coordinate alignment problem for cooperative mobile sensor network localization with range-based measurements. The network consists of target nodes, each of which has only access position information in a local fixed coordinate frame, and anchor nodes with GPS position information. To localize target nodes, we aim to align their coordinate frames, which leads to a non-convex optimization problem over a rotation group $\text{SO}(3)$. Then, we reformulate it as an optimization problem with a convex objective function over spherical surfaces. We explicitly design both iterative and recursive algorithms for localizing a target node with an anchor node, and extend to the case with multiple target nodes. Finally, the advantages of our algorithms against the literature are validated via simulations.

preprint2020arXiv

Second-order Conic Programming Approach for Wasserstein Distributionally Robust Two-stage Linear Programs

This paper proposes a second-order conic programming (SOCP) approach to solve distributionally robust two-stage stochastic linear programs over 1-Wasserstein balls. We start from the case with distribution uncertainty only in the objective function and exactly reformulate it as an SOCP problem. Then, we study the case with distribution uncertainty only in constraints, and show that such a robust program is generally NP-hard as it involves a norm maximization problem over a polyhedron. However, it is reduced to an SOCP problem if the extreme points of the polyhedron are given as a prior. This motivates to design a constraint generation algorithm with provable convergence to approximately solve the NP-hard problem. In sharp contrast to the exiting literature, the distribution achieving the worst-case cost is given as an "empirical" distribution by simply perturbing each sample for both cases. Finally, experiments illustrate the advantages of the proposed model in terms of the out-of-sample performance and the computational complexity.

preprint2020arXiv

Suspension Regulation of Medium-low-speed Maglev Trains via Deep Reinforcement Learning

The suspension regulation is critical to the operation of medium-low-speed maglev trains (mlsMTs). Due to uncertain environment, strong disturbances and high nonlinearity of the system dynamics, this problem cannot be well solved by most of the model-based controllers. In this paper, we propose a model-free controller by reformulating it as a continuous-state, continuous-action Markov decision process (MDP) with unknown transition probabilities. With the deterministic policy gradient and neural network approximation, we design reinforcement learning (RL) algorithms to solve the MDP and obtain a state-feedback controller by using sampled data from the suspension system. To further improve its performance, we adopt a double Q-learning scheme for learning the regulation controller. We illustrate that the proposed controllers outperform the existing PID controller with a real dataset from the mlsMT in Changsha, China and is even comparable to model-based controllers, which assume that the complete information of the model is known, via simulations.

preprint2019arXiv

Flight Control for UAV Loitering Over a Ground Target with Unknown Maneuver

This paper proposes a flight controller for an unmanned aerial vehicle (UAV) to loiter over a ground moving target (GMT). We are concerned with the scenario that the stochastically time-varying maneuver of the GMT is unknown to the UAV, which renders it challenging to estimate the GMT's motion state. Assuming that the state of the GMT is available, we first design a discrete-time Lyapunov vector field for the loitering guidance and then design a discrete-time integral sliding mode control (ISMC) to track the guidance commands. By modeling the maneuver process as a finite-state Markov chain, we propose a Rao-Blackwellised particle filter (RBPF), which only requires a few number of particles, to simultaneously estimate the motion state and the maneuver of the GMT with a camera or radar sensor. Then, we apply the principle of certainty equivalence to the ISMC and obtain the flight controller for completing the loitering task. Finally, the effectiveness and advantages of our controller are validated via simulations.

preprint2019arXiv

Target Encirclement with any Smooth Pattern Using Range-only Measurements

This paper proposes a coordinate-free controller to drive a mobile robot to encircle a target at unknown position by only using range measurements. Different from the existing works, a backstepping based controller is proposed to encircle the target with zero steady-state error for any desired smooth pattern. Moreover, we show its asymptotic exponential convergence under a fixed set of control parameters, which are independent of the initial distance to the target. The effectiveness and advantages of the proposed controller are validated via simulations.