Researcher profile

Lintao Ye

Lintao Ye's listed work covers learning-based control, system identification, sensor selection for Kalman filtering, and sparse linear bandits.


Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, unclaimed author
6 works
0 followers
5 topics
4 close collaborators

Published work

6 published items

preprint, 2026, arXiv

Learning to Sparsify Stochastic Linear Bandits

This paper addresses the problem of learning to sparsify stochastic linear bandits, where a decision-maker sequentially selects actions from a high-dimensional space subject to a sparsity constraint on the number of nonzero elements in the action vector. The key challenge lies in minimizing cumulative regret while tackling the potential NP-hardness of finding optimal sparse actions due to the inherent combinatorial structure of the problem. We propose an adaptively phased exploration and exploitation algorithmic framework, utilizing ordinary least squares for parameter learning and specialized subroutines for sparse action selection. When the action set is a Euclidean ball, optimal sparse actions can be efficiently computed, enabling us to establish a $\tilde{\mathcal{O}}(d\sqrt{T})$ regret, where $d$ is the dimension of the action vector and $T$ is the time horizon length. For general convex and compact action sets where finding optimal sparse actions is intractable, we employ a greedy subroutine. For general strongly convex action sets, we derive a $\tilde{\mathcal{O}}(d \sqrt{T})$ $\alpha$-regret; for general compact sets lacking strong convexity, we establish a $\tilde{\mathcal{O}}(d T^{2/3})$ $\alpha$-regret, where $\alpha$ pertains to the approximation ratio of the greedy algorithm. Finally, we validate the performance of our algorithms using extensive experiments, including an application to a recommendation system.
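To make the Euclidean-ball case concrete, here is a minimal sketch of sparse action selection for a unit ball. It assumes a given parameter estimate and a sparsity level `s`. For a linear reward, the best action with at most `s` nonzeros keeps the `s` largest-magnitude coordinates of the estimate and rescales them to unit norm. The names `best_sparse_action` and `theta_hat` are illustrative and are not taken from the paper.

```python
import math

def best_sparse_action(theta_hat, s):
    """Maximize <theta_hat, a> over ||a||_2 <= 1 with at most s nonzero entries."""
    # Keep the s coordinates of largest magnitude in the estimate.
    top = sorted(range(len(theta_hat)),
                 key=lambda i: abs(theta_hat[i]),
                 reverse=True)[:s]
    norm = math.sqrt(sum(theta_hat[i] ** 2 for i in top))
    action = [0.0] * len(theta_hat)
    if norm == 0.0:
        return action  # every sparse action has zero estimated reward
    for i in top:
        # Align with the estimate on the kept coordinates, scaled to the unit ball.
        action[i] = theta_hat[i] / norm
    return action

theta_hat = [3.0, -4.0, 0.5, 1.0]  # hypothetical estimate, for illustration only
action = best_sparse_action(theta_hat, 2)
print(action)  # [0.6, -0.8, 0.0, 0.0]
```

The selection costs one sort, which is why this case avoids the combinatorial search needed for general action sets.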

preprint, 2022, arXiv

Identifying the Dynamics of a System by Leveraging Data from Similar Systems

We study the problem of identifying the dynamics of a linear system when one has access to samples generated by a similar (but not identical) system, in addition to data from the true system. We use a weighted least squares approach and provide finite sample performance guarantees on the quality of the identified dynamics. Our results show that one can effectively use the auxiliary data generated by the similar system to reduce the estimation error due to the process noise, at the cost of adding a portion of error that is due to intrinsic differences in the models of the true and auxiliary systems. We also provide numerical experiments to validate our theoretical results. Our analysis can be applied to a variety of important settings. For example, if the system dynamics change at some point in time (e.g., due to a fault), how should one leverage data from the prior system in order to learn the dynamics of the new system? As another example, if there is abundant data available from a simulated (but imperfect) model of the true system, how should one weight that data compared to the real data from the system? Our analysis provides insights into the answers to these questions.
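The weighting idea can be sketched for a scalar system x[t+1] = a * x[t] + w[t]. This is a simplification made here for illustration, since the abstract concerns general linear systems. The weight `q` on the auxiliary data and the function names are hypothetical. The data are noise-free, so the example isolates the bias that comes from the model difference.

```python
def weighted_ls_scalar(true_traj, aux_traj, q):
    """Estimate a in x[t+1] = a * x[t] + w[t], weighting auxiliary samples by q."""
    num = 0.0
    den = 0.0
    for traj, weight in ((true_traj, 1.0), (aux_traj, q)):
        for x_now, x_next in zip(traj, traj[1:]):
            num += weight * x_now * x_next
            den += weight * x_now * x_now
    return num / den

def rollout(a, x0, steps):
    """Noise-free trajectory of the scalar system, used only to build toy data."""
    xs = [x0]
    for _ in range(steps):
        xs.append(a * xs[-1])
    return xs

true_traj = rollout(0.9, 1.0, 5)   # few samples from the true system (a = 0.9)
aux_traj = rollout(0.8, 1.0, 50)   # many samples from a similar system (a = 0.8)

a_true_only = weighted_ls_scalar(true_traj, aux_traj, 0.0)  # ignores auxiliary data
a_blend = weighted_ls_scalar(true_traj, aux_traj, 0.5)      # pulled toward 0.8
```

With `q = 0` the estimate uses only the true system. Raising `q` moves the estimate toward the auxiliary model. With process noise, the same weight would also average out more of the noise, which is the trade-off the abstract describes.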

preprint, 2022, arXiv

Model-free Learning for Risk-constrained Linear Quadratic Regulator with Structured Feedback in Networked Systems

We develop a model-free learning algorithm for the infinite-horizon linear quadratic regulator (LQR) problem. Specifically, (risk) constraints and structured feedback are considered, in order to reduce the state deviation while allowing for a sparse communication graph in practice. By reformulating the dual problem as a nonconvex-concave minimax problem, we adopt the gradient descent max-oracle (GDmax) and, for the model-free setting, the stochastic (S)GDmax using zero-order policy gradient. By bounding the Lipschitz and smoothness constants of the LQR cost using specifically defined sublevel sets, we can design the stepsize and related parameters to establish convergence to a stationary point (with high probability). Numerical tests in a networked microgrid control problem have validated the convergence of our proposed SGDmax algorithm while demonstrating the effectiveness of risk constraints. The SGDmax algorithm has attained a satisfactory optimality gap compared to the classical LQR control, especially for the full feedback case.
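As a rough illustration of a zero-order policy gradient, the sketch below tunes a single feedback gain for a scalar LQR problem using only cost evaluations. It is not the SGDmax algorithm from the abstract: it omits the risk constraint, the dual variable, and the structured feedback. The system parameters, step size, and smoothing radius are all assumptions chosen for the example.

```python
import random

def lqr_cost(k, a=1.0, b=1.0, q=1.0, r=1.0, x0=1.0, horizon=50):
    """Rollout cost of the policy u = -k * x; the learner only sees this number."""
    x = x0
    cost = 0.0
    for _ in range(horizon):
        u = -k * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost

def zeroth_order_gradient(cost, k, delta, rng):
    """Two-point gradient estimate from cost evaluations along a random direction."""
    direction = rng.choice((-1.0, 1.0))  # in one dimension a direction is a sign
    diff = cost(k + delta * direction) - cost(k - delta * direction)
    return diff / (2.0 * delta) * direction

rng = random.Random(0)
k = 0.5  # initial stabilizing gain (assumed known)
for _ in range(200):
    k -= 0.05 * zeroth_order_gradient(lqr_cost, k, 1e-3, rng)

print(round(k, 3))  # 0.618
```

For these parameters the optimal gain from the scalar Riccati equation is (sqrt(5) - 1) / 2, about 0.618, so the gain learned from cost queries alone can be checked against a known answer.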

preprint, 2022, arXiv

On the Sample Complexity of Decentralized Linear Quadratic Regulator with Partially Nested Information Structure

We study the problem of control policy design for decentralized state-feedback linear quadratic control with a partially nested information structure, when the system model is unknown. We propose a model-based learning solution, which consists of two steps. First, we estimate the unknown system model from a single system trajectory of finite length, using least squares estimation. Next, based on the estimated system model, we design a control policy that satisfies the desired information structure. We show that the suboptimality gap between our control policy and the optimal decentralized control policy (designed using accurate knowledge of the system model) scales linearly with the estimation error of the system model. Using this result, we provide an end-to-end sample complexity result for learning decentralized controllers for a linear quadratic control problem with a partially nested information structure.
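The two-step procedure can be sketched for a scalar, centralized system. That setting is much simpler than the decentralized, partially nested one studied in the paper and is used here only to show the estimate-then-design pattern. The system parameters, noise level, and function names are assumptions.

```python
import random

def identify_scalar(xs, us):
    """Least-squares fit of (a, b) in x[t+1] = a * x[t] + b * u[t] + w[t]."""
    sxx = sxu = suu = sxy = suy = 0.0
    for x, u, x_next in zip(xs, us, xs[1:]):
        sxx += x * x
        sxu += x * u
        suu += u * u
        sxy += x * x_next
        suy += u * x_next
    det = sxx * suu - sxu * sxu  # nonzero when the inputs excite the system
    a_hat = (suu * sxy - sxu * suy) / det
    b_hat = (sxx * suy - sxu * sxy) / det
    return a_hat, b_hat

def riccati_gain(a, b, q=1.0, r=1.0, iterations=200):
    """LQR gain for u = -k * x, obtained by iterating the scalar Riccati recursion."""
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)

# Step 1: collect a single trajectory with random inputs and estimate the model.
rng = random.Random(0)
a_true, b_true = 0.9, 1.0
xs, us = [0.0], []
for _ in range(200):
    u = rng.uniform(-1.0, 1.0)
    us.append(u)
    xs.append(a_true * xs[-1] + b_true * u + rng.gauss(0.0, 0.01))
a_hat, b_hat = identify_scalar(xs, us)

# Step 2: design the controller as if the estimated model were exact.
k_hat = riccati_gain(a_hat, b_hat)
k_star = riccati_gain(a_true, b_true)  # gain under perfect model knowledge
```

Comparing `k_hat` with `k_star` shows how estimation error in the model carries over to the controller. The abstract bounds the resulting suboptimality gap for the decentralized problem.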

preprint, 2020, arXiv

On the Complexity and Approximability of Optimal Sensor Selection and Attack for Kalman Filtering

Given a linear dynamical system affected by stochastic noise, we consider the problem of selecting an optimal set of sensors (at design-time) to minimize the trace of the steady state a priori or a posteriori error covariance of the Kalman filter, subject to certain selection budget constraints. We show the fundamental result that there is no polynomial-time constant-factor approximation algorithm for this problem. This contrasts with other classes of sensor selection problems studied in the literature, which typically pursue constant-factor approximations by leveraging greedy algorithms and submodularity (or supermodularity) of the cost function. Here, we provide a specific example showing that greedy algorithms can perform arbitrarily poorly for the problem of design-time sensor selection for Kalman filtering. We then study the problem of attacking (i.e., removing) a set of installed sensors, under predefined attack budget constraints, to maximize the trace of the steady state a priori or a posteriori error covariance of the Kalman filter. Again, we show that there is no polynomial-time constant-factor approximation algorithm for this problem, and show specifically that greedy algorithms can perform arbitrarily poorly.
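A toy example shows how greedy sensor selection can miss the optimum under a budget. It is not the paper's construction, which proves a much stronger inapproximability result. It assumes a scalar state, independent sensor noise, and hypothetical sensors described by an information value c**2 / r and a cost.

```python
from itertools import combinations

def steady_state_cov(a, w, infos, iterations=500):
    """A priori steady-state error variance of a scalar Kalman filter.

    infos holds c**2 / r for each selected sensor; independent sensors add up.
    """
    total = sum(infos)
    p = w
    for _ in range(iterations):
        p = a * a * p / (1.0 + p * total) + w
    return p

def evaluate(sensors, names, a=0.9, w=1.0):
    return steady_state_cov(a, w, [sensors[n][0] for n in names])

def greedy_select(sensors, budget):
    """Repeatedly add the affordable sensor that lowers the error variance most."""
    chosen, spent = [], 0.0
    while True:
        best = None
        for name, (_, cost) in sensors.items():
            if name in chosen or spent + cost > budget:
                continue
            value = evaluate(sensors, chosen + [name])
            if best is None or value < best[1]:
                best = (name, value)
        if best is None:
            return chosen
        chosen.append(best[0])
        spent += sensors[best[0]][1]

def brute_force(sensors, budget):
    """Exhaustive search over all affordable subsets (only viable for tiny examples)."""
    best = ((), evaluate(sensors, ()))
    for size in range(1, len(sensors) + 1):
        for subset in combinations(sensors, size):
            if sum(sensors[n][1] for n in subset) <= budget:
                value = evaluate(sensors, subset)
                if value < best[1]:
                    best = (subset, value)
    return best

# name -> (information c**2 / r, cost); values are made up for illustration
sensors = {"A": (10.0, 2.0), "B": (6.0, 1.0), "C": (6.0, 1.0)}
greedy_choice = greedy_select(sensors, budget=2.0)
optimal_choice, optimal_value = brute_force(sensors, budget=2.0)
print(greedy_choice, optimal_choice)  # ['A'] ('B', 'C')
```

Greedy spends the whole budget on sensor A, while the pair B and C yields a lower error variance for the same cost.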

preprint, 2020, arXiv

Resilient Sensor Placement for Kalman Filtering in Networked Systems: Complexity and Algorithms

Given a linear dynamical system affected by noise, we study the problem of optimally placing sensors (at design-time) subject to a sensor placement budget constraint in order to minimize the trace of the steady-state error covariance of the corresponding Kalman filter. While this problem is NP-hard in general, we consider the underlying graph associated with the system dynamics matrix, and focus on the case when there is a single input at one of the nodes in the graph. We provide an optimal strategy (computed in polynomial-time) to place the sensors over the network. Next, we consider the problem of attacking (i.e., removing) the placed sensors under a sensor attack budget constraint in order to maximize the trace of the steady-state error covariance of the resulting Kalman filter. Using the insights obtained for the sensor placement problem, we provide an optimal strategy (computed in polynomial-time) to attack the placed sensors. Finally, we consider the scenario where a system designer places the sensors under a sensor placement budget constraint, and an adversary then attacks the placed sensors subject to a sensor attack budget constraint. The resilient sensor placement problem is to find a sensor placement strategy to minimize the trace of the steady-state error covariance of the Kalman filter corresponding to the sensors that survive the attack. We show that this problem is NP-hard, and provide a pseudo-polynomial-time algorithm to solve it.