Researcher profile

Sai Li

Sai Li contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
13 works
0 followers
7 topics
4 close collaborators

Published work

13 published items

Preprint, 2026, arXiv

Personalizing black-box models for nonparametric regression with minimax optimality

Recent advances in large-scale models, including deep neural networks and large language models, have substantially improved performance across a wide range of learning tasks. The widespread availability of such pre-trained models creates new opportunities for data-efficient statistical learning, provided they can be effectively integrated into downstream tasks. Motivated by this setting, we study few-shot personalization, where a pre-trained black-box model is adapted to a target domain using a limited number of samples. We develop a theoretical framework for few-shot personalization in nonparametric regression and propose algorithms that can incorporate a black-box pre-trained model into the regression procedure. We establish the minimax optimal rate for the personalization problem and show that the proposed method attains this rate. Our results clarify the statistical benefits of leveraging pre-trained models under sample scarcity and provide robustness guarantees when the pre-trained model is not informative. We illustrate the finite-sample performance of the methods through simulations and an application to the California housing dataset with several pre-trained models.

Preprint, 2026, arXiv

Uncertainty-Calibrated Recommendations for Low-Active Users

A fundamental challenge in recommender systems is balancing reliability for Low-Active Users (LAUs) with diversity for High-Active Users (HAUs). The key to this balance lies in quantifying model uncertainty, which approximates the risk of prediction errors and reveals the limits of the model's current knowledge. On large-scale short-video and livestream platforms, model uncertainty can warn of low-quality recommendations that may lead to disengagement of LAUs and at the same time identify opportunities to diversify content recommendation for HAUs. To leverage this dichotomy, we introduce a unified, production-ready framework that calibrates uncertainty to drive differentiated strategies. Specifically, we implement a model-uncertainty-based risk-averse deboosting policy for LAUs to suppress unreliable recommendations, while employing a risk-seeking Upper Confidence Bound (UCB) strategy for HAUs to encourage exploration. Validated on a major livestream platform, our framework demonstrates significant improvements in retention (active hours) and satisfaction (quality watch time ratio) for LAUs as well as remarkable increases in interest diversity and category coverage for HAUs, proving the value of uncertainty-aware recommendation in industrial settings.
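
The differentiated policy described in this abstract can be illustrated with a small sketch. It is not the authors' production system: the `Candidate` fields, the `adjusted_score` function, and the scaling factor `k` are hypothetical names chosen here, and the uncertainty is assumed to be something like a predictive standard deviation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    item_id: str
    score: float        # predicted engagement (point estimate)
    uncertainty: float  # assumed: a predictive standard deviation


def adjusted_score(c: Candidate, is_low_active: bool, k: float = 1.0) -> float:
    """Risk-averse for low-active users, risk-seeking (UCB-style) otherwise."""
    if is_low_active:
        # Deboost: penalize items the model is unsure about.
        return c.score - k * c.uncertainty
    # Upper confidence bound: reward uncertain items to encourage exploration.
    return c.score + k * c.uncertainty


def rank(candidates, is_low_active: bool, k: float = 1.0):
    """Sort candidates by adjusted score, best first."""
    return sorted(
        candidates,
        key=lambda c: adjusted_score(c, is_low_active, k),
        reverse=True,
    )


candidates = [
    Candidate("safe", 0.60, 0.05),
    Candidate("risky", 0.65, 0.30),
]
lau_order = [c.item_id for c in rank(candidates, is_low_active=True)]
hau_order = [c.item_id for c in rank(candidates, is_low_active=False)]
```

With these toy numbers the same two candidates are ordered differently for the two user groups: the low-uncertainty item ranks first for a low-active user, and the high-uncertainty item ranks first for a high-active user.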

Preprint, 2022, arXiv

A Focusing Framework for Testing Bi-Directional Causal Effects with GWAS Summary Data

Mendelian randomization (MR) is a powerful method that uses genetic variants as instrumental variables (IVs) to infer the causal effect of a modifiable exposure on an outcome. Although recent years have seen many extensions of basic MR methods to be robust to certain violations of assumptions, few methods have been proposed to infer bi-directional causal relationships, especially for phenotypes with limited biological understanding. The presence of horizontal pleiotropy adds another layer of complexity. In this article, we show that assumptions for common MR methods are often impossible or too stringent in the presence of bi-directional relationships. We then propose a new focusing framework for testing bi-directional causal effects between two traits with possibly pleiotropic genetic variants. Our proposal can be coupled with many state-of-the-art MR methods. We provide theoretical guarantees on the Type I error and power of the proposed methods. We demonstrate the robustness of the proposed methods using several simulated and real datasets.

Preprint, 2022, arXiv

Causal Inference for Nonlinear Outcome Models with Possibly Invalid Instrumental Variables

Instrumental variable methods are widely used for inferring the causal effect in the presence of unmeasured confounders. Existing instrumental variable methods for nonlinear outcome models require stringent identifiability conditions. This paper considers a flexible semi-parametric potential outcome model that allows for possibly invalid instruments. We propose new identifiability conditions to identify the causal parameters when the majority of the instrumental variables are valid. We devise a novel inference procedure for a new average structural function and the conditional average treatment effect. We establish the asymptotic normality of the proposed estimators and construct confidence intervals for the causal estimands by bootstrap. The proposed method is demonstrated in large-scale simulation studies and is applied to infer the effect of income on house ownership.

Preprint, 2022, arXiv

Estimation and Inference with Proxy Data and its Genetic Applications

Existing high-dimensional statistical methods are largely established for analyzing individual-level data. In this work, we study estimation and inference for high-dimensional linear models where we only observe "proxy data", which include the marginal statistics and sample covariance matrix that are computed based on different sets of individuals. We develop a rate optimal method for estimation and inference for the regression coefficient vector and its linear functionals based on the proxy data. Moreover, we show the intrinsic limitations in the proxy-data based inference: the minimax optimal rate for estimation is slower than that in the conventional case where individual data are observed; the power for testing and multiple testing does not go to one as the signal strength goes to infinity. These interesting findings are illustrated through simulation studies and an analysis of a dataset concerning the genetic associations of hindlimb muscle weight in a mouse population.

Preprint, 2022, arXiv

Fast quantum state transfer and entanglement for cavity-coupled many qubits via dark pathways

Quantum state transfer (QST) and entangled state generation (ESG) are important building blocks for modern quantum information processing. To achieve these tasks, conventional wisdom is to rely on quantum adiabatic evolution, which is time-consuming and thus yields low fidelity. Here, using the shortcut-to-adiabaticity technique, we propose a general method to realize high-fidelity fast QST and ESG in a cavity-coupled many-qubit system via its dark pathways, which can be further designed for high-fidelity quantum tasks with different optimization purposes. Specifically, with a proper dark pathway, QST and ESG between any two qubits can be achieved without decoupling the others, which simplifies experimental demonstrations. Meanwhile, ESG among all qubits can also be realized in a single step. In addition, our scheme can be implemented in many quantum systems, and we illustrate its implementation on superconducting quantum circuits. Therefore, we propose a powerful strategy for selective quantum manipulation, which is promising in cavity-coupled quantum systems and could find many convenient applications in quantum information processing.

Preprint, 2022, arXiv

Improving Out-of-Distribution Robustness via Selective Augmentation

Machine learning algorithms typically assume that training and test examples are drawn from the same distribution. However, distribution shift is a common problem in real-world applications and can cause models to perform dramatically worse at test time. In this paper, we specifically consider the problems of subpopulation shifts (e.g., imbalanced data) and domain shifts. While prior works often seek to explicitly regularize internal representations or predictors of the model to be domain invariant, we instead aim to learn invariant predictors without restricting the model's internal representations or predictors. This leads to a simple mixup-based technique which learns invariant predictors via selective augmentation called LISA. LISA selectively interpolates samples either with the same labels but different domains or with the same domain but different labels. Empirically, we study the effectiveness of LISA on nine benchmarks ranging from subpopulation shifts to domain shifts, and we find that LISA consistently outperforms other state-of-the-art methods and leads to more invariant predictors. We further analyze a linear setting and theoretically show how LISA leads to a smaller worst-group error.
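
The selective interpolation described in this abstract can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the `(features, label, domain)` tuple layout, the function names, the selection probability `p_sel`, and the Beta parameter `alpha` are assumptions made here, and a real pipeline would operate on mini-batches of tensors.

```python
import random


def one_hot(label, num_classes):
    return [1.0 if c == label else 0.0 for c in range(num_classes)]


def interpolate(u, v, lam):
    """Convex combination lam * u + (1 - lam) * v, element by element."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(u, v)]


def find_partner(data, i, intra_label, rng):
    """Pick a mixing partner for example i, or None if no candidate exists.

    intra_label=True  -> same label, different domain
    intra_label=False -> same domain, different label
    """
    _, y_i, d_i = data[i]
    if intra_label:
        pool = [j for j, (_, y, d) in enumerate(data) if y == y_i and d != d_i]
    else:
        pool = [j for j, (_, y, d) in enumerate(data) if y != y_i and d == d_i]
    return rng.choice(pool) if pool else None


def lisa_augment(data, i, num_classes, p_sel=0.5, alpha=2.0, rng=None):
    """Return one mixed (features, soft_label) pair built from example i."""
    if rng is None:
        rng = random.Random()
    intra_label = rng.random() < p_sel  # choose which strategy to apply
    j = find_partner(data, i, intra_label, rng)
    x_i, y_i, _ = data[i]
    if j is None:
        # No valid partner: fall back to the unmixed example.
        return x_i, one_hot(y_i, num_classes)
    x_j, y_j, _ = data[j]
    lam = rng.betavariate(alpha, alpha)  # mixup coefficient in (0, 1)
    x_mix = interpolate(x_i, x_j, lam)
    y_mix = interpolate(one_hot(y_i, num_classes), one_hot(y_j, num_classes), lam)
    return x_mix, y_mix


# Toy data: (features, label, domain)
toy = [
    ([0.0, 0.0], 0, 0),
    ([1.0, 1.0], 0, 1),
    ([2.0, 0.0], 1, 0),
    ([3.0, 1.0], 1, 1),
]
x_mix, y_mix = lisa_augment(toy, 0, num_classes=2, rng=random.Random(0))
```

When the partner shares the label, the mixed label is unchanged and only the domain-specific features are blended; when the partner shares the domain, features and labels are blended together, as in standard mixup.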

Preprint, 2022, arXiv

Scalable Method for Eliminating Residual $ZZ$ Interaction between Superconducting Qubits

Unwanted $ZZ$ interaction is a quantum-mechanical crosstalk phenomenon which correlates qubit dynamics and is ubiquitous in superconducting qubit systems. It adversely affects the quality of quantum operations and can be detrimental in scalable quantum information processing. Here we propose and experimentally demonstrate a practically extensible approach for complete cancellation of residual $ZZ$ interaction between fixed-frequency transmon qubits, which are known for long coherence and simple control. We apply to the intermediate coupler that connects the qubits a weak microwave drive at a properly chosen frequency in order to noninvasively induce an ac Stark shift for $ZZ$ cancellation. We verify the cancellation performance by measuring vanishing two-qubit entangling phases and $ZZ$ correlations. In addition, we implement a randomized benchmarking experiment to extract the idling gate fidelity which shows good agreement with the coherence limit, demonstrating the effectiveness of $ZZ$ cancellation. Our method allows independent addressability of each qubit-qubit connection, and is applicable to both nontunable and tunable couplers, promising better compatibility with future large-scale quantum processors.

Preprint, 2021, arXiv

Inference for high-dimensional linear mixed-effects models: A quasi-likelihood approach

Linear mixed-effects models are widely used in analyzing clustered or repeated measures data. We propose a quasi-likelihood approach for estimation and inference of the unknown parameters in linear mixed-effects models with high-dimensional fixed effects. The proposed method is applicable to general settings where the dimension of the random effects and the cluster sizes are possibly large. Regarding the fixed effects, we provide rate optimal estimators and valid inference procedures that do not rely on the structural information of the variance components. We also study the estimation of variance components with high-dimensional fixed effects in general settings. The algorithms are easy to implement and computationally fast. The proposed methods are assessed in various simulation settings and are applied to a real study regarding the associations between body mass index and genetic polymorphic markers in a heterogeneous stock mice population.

Preprint, 2020, arXiv

Fast holonomic quantum computation on superconducting circuits with optimal control

Geometric phases induced in quantum evolutions have built-in noise resilience, and thus can find applications in many robust quantum manipulation tasks. Here, we propose a feasible and fast scheme for universal quantum computation on superconducting circuits with nonadiabatic non-Abelian geometric phases, using the resonant interaction of a three-level quantum system. In our scheme, arbitrary single-qubit quantum gates can be implemented in a single-loop scenario by shaping both the amplitudes and phases of the two driving microwave fields resonantly coupled to a transmon device. Moreover, nontrivial two-qubit gates can also be realized with an auxiliary transmon simultaneously coupled to the two target transmons in an effective resonant way. In particular, our proposal is compatible with various optimal control techniques, which further enhance the robustness of the quantum operations. Therefore, our proposal represents a promising way towards fault-tolerant quantum computation on solid-state quantum circuits.

Preprint, 2020, arXiv

High-fidelity geometric gate for silicon-based spin qubits

High-fidelity manipulation is key to the physical realization of fault-tolerant quantum computation. Here, we present a protocol to realize universal nonadiabatic geometric gates for silicon-based spin qubits. We find that the advantage of geometric gates over dynamical gates depends crucially on the evolution loop for the construction of the geometric phase. Under appropriate evolution loops, both the geometric single-qubit gates and the CNOT gate can outperform their dynamical counterparts for both systematic and detuning noises. We also perform randomized benchmarking using noise amplitudes consistent with experiments in silicon. For the static noise model, the averaged fidelities of geometric gates are around 99.90% or above, while for the time-dependent $1/f$-type noise, the fidelities are around 99.98% when only the detuning noise is present. We also show that the improvement in fidelities of the geometric gates over dynamical ones typically increases with the exponent $α$ of the $1/f$ noise, and the ratio can be as high as 4 when $α\approx 3$. Our results suggest that geometric gates with judiciously chosen evolution loops can be a powerful way to realize high-fidelity quantum gates.

Preprint, 2020, arXiv

Nonadiabatic geometric quantum computation with optimal control on superconducting circuits

Quantum gates, which are the essential building blocks of quantum computers, are very fragile. Thus, realizing robust quantum gates with high fidelity is the ultimate goal of quantum manipulation. Here, we propose a nonadiabatic geometric quantum computation scheme on superconducting circuits to engineer arbitrary quantum gates, which combine the robustness of geometric phases with the capacity to incorporate optimal control techniques that further enhance the gate robustness. Specifically, in our proposal, arbitrary geometric single-qubit gates can be realized on a transmon qubit by driving it with a resonant microwave field whose amplitude and phase are both time-dependent. Meanwhile, nontrivial two-qubit geometric gates can be implemented with two capacitively coupled transmon qubits, with the frequency of one of them modulated to obtain effective resonant coupling between them. Therefore, our scheme provides a promising step towards fault-tolerant solid-state quantum computation.

Preprint, 2020, arXiv

Transfer Learning for High-dimensional Linear Regression: Prediction, Estimation, and Minimax Optimality

This paper considers the estimation and prediction of a high-dimensional linear regression in the setting of transfer learning, using samples from the target model as well as auxiliary samples from different but possibly related regression models. When the set of "informative" auxiliary samples is known, an estimator and a predictor are proposed and their optimality is established. The optimal rates of convergence for prediction and estimation are faster than the corresponding rates without using the auxiliary samples. This implies that knowledge from the informative auxiliary samples can be transferred to improve the learning performance of the target problem. In the case that the set of informative auxiliary samples is unknown, we propose a data-driven procedure for transfer learning, called Trans-Lasso, and reveal its robustness to non-informative auxiliary samples and its efficiency in knowledge transfer. The proposed procedures are demonstrated in numerical studies and are applied to a dataset concerning the associations among gene expressions. It is shown that Trans-Lasso leads to improved performance in gene expression prediction in a target tissue by incorporating the data from multiple different tissues as auxiliary samples.
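
For orientation, a two-step estimator of the kind described for the known-informative-set case can be written as below. This is a hedged sketch under assumed notation, none of which appears in the abstract: $\mathcal{A}$ is the informative auxiliary set, index $0$ denotes the target sample, $n_k$ is the size of sample $k$, $n_{\mathcal{A}}$ is the total informative auxiliary sample size, and $\lambda_w$, $\lambda_\delta$ are tuning parameters. The paper should be consulted for the exact definitions and tuning.

```latex
\begin{align}
\hat{w} &= \arg\min_{w}\; \frac{1}{2(n_{\mathcal{A}} + n_0)}
  \sum_{k \in \mathcal{A} \cup \{0\}} \bigl\| y^{(k)} - X^{(k)} w \bigr\|_2^2
  + \lambda_w \|w\|_1, \\
\hat{\delta} &= \arg\min_{\delta}\; \frac{1}{2 n_0}
  \bigl\| y^{(0)} - X^{(0)} (\hat{w} + \delta) \bigr\|_2^2
  + \lambda_\delta \|\delta\|_1, \\
\hat{\beta} &= \hat{w} + \hat{\delta}.
\end{align}
```

The first step pools the target and informative auxiliary samples to obtain a rough estimate; the second step uses only the target sample to correct the bias of that pooled estimate.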