Researcher profile

Zhengyuan Zhu

Zhengyuan Zhu contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
11 works
0 followers
9 topics
4 close collaborators

Published work

11 published items

preprint · 2026 · arXiv

TabCF: Distributional Control Function Estimation with Tabular Foundation Models

Instrumental variable (IV) and control function (CF) methods are powerful tools for causal effect estimation in the presence of unmeasured confounding, yet most existing approaches target only mean effects and/or demand substantial fitting and tuning effort. In this paper, we introduce a simple method, TabCF, for control function regression using tabular foundation models, which enables accurate, fast, identification-transparent, and tuning-light causal estimation of distributional quantities, such as interventional means and quantiles; we also propose a copula-based approximation for multivariate outcomes. TabCF performs favorably against representative methods across a broad range of small- to medium-sized synthetic and real data scenarios. The central message is two-fold: for practitioners, it highlights that TabCF is an effective tool for distributional causal inference; for researchers, it suggests that the proposed approach could be considered a strong baseline for future method development. Code is available at https://github.com/GepingChen/TabCF.

preprint · 2022 · arXiv

A-Optimal Split Questionnaire Designs for Multivariate Continuous Variables

A split questionnaire design (SQD), an alternative to full questionnaires, can reduce the response burden and improve survey quality. A split questionnaire can itself be designed to minimize the information loss from the missing data that splitting induces. This study develops a methodology for finding an optimal SQD (OSQD) for multivariate continuous variables, applying a probabilistic design and optimality criterion approach. Our method employs previous survey data to compute the Fisher information matrix and the A-optimality criterion to find the OSQD for the current survey study. We derive theoretical findings on the relationship between the correlation structure and the OSQD and on the robustness of the local OSQD. We conduct simulation studies to compare local and two global OSQDs (mini-max OSQD and Bayes OSQD) to baselines. We also apply our method to the 2016 Pet Demographic Survey (PDS) data. In both simulation studies and the real data application, local and global OSQDs outperform the baselines.
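As a rough illustration of the A-optimality criterion mentioned in this abstract, the sketch below scores candidate designs by the trace of the inverse Fisher information matrix and keeps the smallest. It is not the paper's procedure: the function names, the two candidate designs, and their 2x2 information matrices are hypothetical values chosen only to show the arithmetic.

```python
def a_criterion(info):
    """Trace of the inverse of a symmetric 2x2 information matrix."""
    (a, b), (c, d) = info
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("information matrix must be positive definite")
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det,
    # so its trace is (a + d) / det.
    return (a + d) / det


def pick_a_optimal(designs):
    """Return the name of the design with the smallest A-criterion."""
    return min(designs, key=lambda name: a_criterion(designs[name]))


# Hypothetical information matrices, not derived from any survey data.
candidates = {
    "design_A": [[4.0, 1.0], [1.0, 3.0]],
    "design_B": [[5.0, 2.0], [2.0, 2.0]],
}

best = pick_a_optimal(candidates)
```

A smaller trace corresponds to a smaller sum of asymptotic variances of the parameter estimates, which is why the minimum is taken.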

preprint · 2022 · arXiv

NET-FLEET: Achieving Linear Convergence Speedup for Fully Decentralized Federated Learning with Heterogeneous Data

Federated learning (FL) has received a surge of interest in recent years thanks to its benefits in data privacy protection, efficient communication, and parallel data processing. Also, with appropriate algorithmic designs, one can achieve the desirable linear convergence speedup in FL. However, most existing works on FL are limited to systems with i.i.d. data and centralized parameter servers, and results on decentralized FL with heterogeneous datasets remain limited. Moreover, whether or not the linear convergence speedup is achievable under fully decentralized FL with data heterogeneity remains an open question. In this paper, we address these challenges by proposing a new algorithm, called NET-FLEET, for fully decentralized FL systems with data heterogeneity. The key idea of our algorithm is to enhance the local update scheme in FL (originally intended for communication efficiency) by incorporating a recursive gradient correction technique to handle heterogeneous datasets. We show that, under appropriate parameter settings, the proposed NET-FLEET algorithm achieves a linear speedup for convergence. We further conduct extensive numerical experiments to evaluate the performance of the proposed NET-FLEET algorithm and verify our theoretical findings.

preprint · 2021 · arXiv

A Geospatial Functional Model For OCO-2 Data with Application on Imputation and Land Fraction Estimation

Data from NASA's Orbiting Carbon Observatory-2 (OCO-2) satellite is essential to many carbon management strategies. A retrieval algorithm is used to estimate CO2 concentration using the radiance data measured by OCO-2. However, due to factors such as cloud cover and cosmic rays, the spatial coverage of the retrieval algorithm is limited in some areas of critical importance for carbon cycle science. Mixed land/water pixels along the coastline are also not used in the retrieval processing due to the lack of valid ancillary variables, including land fraction. We propose an approach to model spatial spectral data to solve these two problems by radiance imputation and land fraction estimation. The spectral observations are modeled as spatially indexed functional data with footprint-specific parameters and are reduced to much lower dimensions by functional principal component analysis. The principal component scores are modeled as random fields to account for the spatial dependence, and the missing spectral observations are imputed by kriging the principal component scores. The proposed method is shown to impute spectral radiance with high accuracy for observations over the Pacific Ocean. An unmixing approach based on this model provides much more accurate land fraction estimates in our validation study along the coastline of Greece.

preprint · 2020 · arXiv

Private and Communication-Efficient Edge Learning: A Sparse Differential Gaussian-Masking Distributed SGD Approach

With the rise of machine learning (ML) and the proliferation of smart mobile devices, recent years have witnessed a surge of interest in performing ML in wireless edge networks. In this paper, we consider the problem of jointly improving data privacy and communication efficiency of distributed edge learning, both of which are critical performance metrics in wireless edge network computing. Toward this end, we propose a new decentralized stochastic gradient method with sparse differential Gaussian-masked stochastic gradients (SDM-DSGD) for non-convex distributed edge learning. Our main contributions are three-fold: i) we theoretically establish the privacy and communication efficiency performance guarantee of our SDM-DSGD method, which outperforms all existing works; ii) we show that SDM-DSGD improves the fundamental training-privacy trade-off by two orders of magnitude compared with the state of the art; iii) we reveal theoretical insights and offer practical design guidelines for the interactions between privacy preservation and communication efficiency, two conflicting performance goals. We conduct extensive experiments with a variety of learning models on the MNIST and CIFAR-10 datasets to verify our theoretical findings. Collectively, our results contribute to the theory and algorithm design for distributed edge learning.

preprint · 2020 · arXiv

Taming Convergence for Asynchronous Stochastic Gradient Descent with Unbounded Delay in Non-Convex Learning

Understanding the convergence performance of asynchronous stochastic gradient descent methods (Async-SGD) has received increasing attention in recent years due to their foundational role in machine learning. To date, however, most of the existing works are restricted to either bounded gradient delays or convex settings. In this paper, we focus on Async-SGD and its variant Async-SGDI (which uses increasing batch size) for non-convex optimization problems with unbounded gradient delays. We prove an $o(1/\sqrt{k})$ convergence rate for Async-SGD and an $o(1/k)$ rate for Async-SGDI. Also, a unifying sufficient condition for Async-SGD's convergence is established, which includes two major gradient delay models in the literature as special cases and yields a new delay model not considered thus far.

preprint · 2020 · arXiv

Variable Selection in Macroeconomic Forecasting with Many Predictors

In a data-rich environment, using many economic predictors to forecast a few key variables has become a new trend in econometrics. The commonly used approach is the factor augment (FA) approach. In this paper, we pursue another direction, the variable selection (VS) approach, to handle high-dimensional predictors. VS is an active topic in statistics and computer science. However, it has not received as much attention as FA in economics. This paper introduces several cutting-edge VS methods to economic forecasting, including: (1) classical greedy procedures; (2) l1 regularization; (3) gradient descent with sparsification; and (4) meta-heuristic algorithms. Comprehensive simulation studies are conducted to compare their variable selection accuracy and prediction performance under different scenarios. Among the reviewed methods, a meta-heuristic algorithm called the sequential Monte Carlo algorithm performs the best. Surprisingly, the classical forward selection is comparable to it and better than other more sophisticated algorithms. In addition, we apply these VS methods to economic forecasting and compare them with the popular FA approach. It turns out that for employment rate and CPI inflation, some VS methods can achieve considerable improvement over FA, and the selected predictors can be well explained by economic theories.

preprint · 2016 · arXiv

One-dimensional Nonstationary Process Variance Function Estimation

Many spatial processes exhibit nonstationary features. We estimate a variance function from a single process observation where the errors are nonstationary and correlated. We propose a difference-based approach for a one-dimensional nonstationary process and develop a bandwidth selection method for smoothing, taking into account the correlation in the errors. The estimation results are compared to those of a local-likelihood approach proposed by Anderes and Stein (2011). A simulation study shows that our method has a smaller integrated MSE, easily fixes the boundary bias problem, and requires far less computing time than the likelihood-based method.
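To make the idea of a difference-based variance estimate concrete, here is a minimal sketch of the textbook version: half squared first differences are treated as noisy local variance observations and then kernel-smoothed. It assumes uncorrelated errors and a user-supplied bandwidth, so it does not reproduce the paper's correlation-aware estimator or its bandwidth selection; the function name and the Gaussian kernel are illustrative choices.

```python
import math


def difference_based_variance(x, y, bandwidth):
    """Kernel-smoothed half squared first differences, evaluated at each x."""
    if len(x) != len(y) or len(y) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(y)
    # Pseudo-observations located at the midpoints of neighbouring sites.
    # If the mean is smooth and errors are uncorrelated, each has
    # expectation close to the local variance.
    mids = [(x[i] + x[i + 1]) / 2 for i in range(n - 1)]
    half_sq = [(y[i + 1] - y[i]) ** 2 / 2 for i in range(n - 1)]

    def estimate(t):
        # Nadaraya-Watson average with a Gaussian kernel (assumed choice).
        weights = [math.exp(-0.5 * ((t - m) / bandwidth) ** 2) for m in mids]
        return sum(w * d for w, d in zip(weights, half_sq)) / sum(weights)

    return [estimate(t) for t in x]


# Toy input: a sequence alternating between 0 and 2 on a unit grid.
grid = list(range(10))
values = [0.0, 2.0] * 5
variance_curve = difference_based_variance(grid, values, bandwidth=1.5)
```

With correlated errors the half squared differences are biased for the variance, which is the situation the abstract's method is designed to handle.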

preprint · 2015 · arXiv

Request Prediction in Cloud with a Cyclic Window Learning Algorithm

Automatic resource scaling is one advantage of Cloud systems. Cloud systems are able to scale the number of physical machines depending on user requests. Therefore, accurate request prediction brings a great improvement in Cloud systems' performance. If requests can be predicted accurately, the appropriate number of physical machines needed to accommodate the predicted amount of requests can be activated, and Cloud systems will save energy by preventing excessive activation of physical machines. Also, Cloud systems can implement advanced load distribution with accurate request prediction. We propose an algorithm that predicts the probability distribution parameters of requests for each time interval. Maximum Likelihood Estimation (MLE) and Local Linear Regression (LLR) are used to implement this algorithm. The proposed algorithm is evaluated on the Google cluster-trace data. Predictions are made for the number of task arrivals, CPU requests, and memory requests. The accuracy of the predictions is then measured with Mean Absolute Percentage Error (MAPE).
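For readers unfamiliar with the accuracy metric named above, the following is a minimal sketch of MAPE. The function name and the sample numbers are illustrative and are not taken from the paper or the Google cluster-trace data; rejecting zero actual values is an assumed design choice, since the percentage error is undefined there.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average of |actual - predicted| / |actual| over all intervals.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Illustrative request counts per time interval (made-up numbers).
observed = [100, 200]
forecast = [110, 180]
error_percent = mape(observed, forecast)
```

Each interval contributes its relative error equally, so intervals with few requests can dominate the score.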

preprint · 2015 · arXiv

Semiparametric estimation of spectral density function for irregular spatial data

Estimation of the covariance structure of spatial processes is of fundamental importance in spatial statistics. In the literature, several non-parametric and semi-parametric methods have been developed to estimate the covariance structure based on the spectral representation of covariance functions. However, they either ignore the high-frequency properties of the spectral density, which are essential to determine the performance of interpolation procedures such as Kriging, or lack theoretical justification. We propose a new semi-parametric method to estimate spectral densities of isotropic spatial processes with irregular observations. The spectral density function at low frequencies is estimated using a smoothing spline, while a parametric model is used for the spectral density at high frequencies, and the parameters are estimated by a method-of-moments approach based on empirical variograms at small lags. We derive the asymptotic bounds for bias and variance of the proposed estimator. The simulation study shows that our method outperforms the existing non-parametric estimator by several performance criteria.

preprint · 2012 · arXiv

Spatial Multiresolution Cluster Detection Method

A novel multi-resolution cluster detection (MCD) method is proposed to identify irregularly shaped clusters in space. A multi-scale test statistic on a single cell is derived from the likelihood ratio statistic for Bernoulli, Poisson, and Normal sequences. A neighborhood variability measure is defined to select the optimal test threshold. The MCD method is compared, using simulation and fMRI data, with single-scale testing methods that control the false discovery rate and with spatial scan statistics. The MCD method is shown to be more effective for discovering irregularly shaped clusters, and its implementation does not require heavy computation, making it suitable for cluster detection in large spatial data.