Researcher profile

Yinan Shen

Yinan Shen contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
5topics
3close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2026arXiv

Improved performance guarantees for Tukey's median

Is there a natural way to order data in dimension greater than one? The approach based on the notion of data depth, often associated with John Tukey, is among the most popular. Tukey's depth has found applications in robust statistics, graph theory, and the study of elections and social choice. We present improved performance guarantees for empirical Tukey's median, a deepest point associated with a given sample, when the data-generating distribution is elliptically symmetric and possibly anisotropic. Some of our results remain valid in the wider class of affine equivariant estimators. As a corollary of our bounds, we show that the typical diameter of the set of all empirical Tukey's medians scales like $o(n^{-1/2})$ where $n$ is the sample size. Moreover, when the data follow the bivariate normal distribution, we prove that with high probability, the diameter is of order $O(n^{-3/4}\log^{1/2}(n))$. On the technical side, we show how affine equivariance can be leveraged to improve concentration bounds; moreover, we develop sharp strong approximation results for empirical processes indexed by halfspaces that could be of independent interest.

preprint2026arXiv

SGD with Dependent Data: Optimal Estimation, Regret, and Inference

This work investigates the performance of the final iterate produced by stochastic gradient descent (SGD) under temporally dependent data. We consider two complementary sources of dependence: $(i)$ martingale-type dependence in both the covariate and noise processes, which accommodates non-stationary and non-mixing time series data, and $(ii)$ dependence induced by sequential decision making. Our formulation runs in parallel with classical notions of (local) stationarity and strong mixing, while neither framework fully subsumes the other. Remarkably, SGD is shown to automatically accommodate both independent and dependent information under a broad class of stepsize schedules and exploration rate schemes. Non-asymptotically, we show that SGD simultaneously achieves statistically optimal estimation error and regret, extending and improving existing results. In particular, our tail bounds remain sharp even for potentially infinite horizon $T=+\infty$. Asymptotically, the SGD iterates converge to a Gaussian distribution with only an $O_{\PP}(1/\sqrt{t})$ remainder, demonstrating that the supposed estimation-regret trade-off claimed in prior work can in fact be avoided. We further propose a new ``conic'' approximation of the decision region that allows the covariates to have unbounded support. For online sparse regression, we develop a new SGD-based algorithm that uses only $d$ units of storage and requires $O(d)$ flops per iteration, achieving the long term statistical optimality. Intuitively, each incoming observation contributes to estimation accuracy, while aggregated summary statistics guide support recovery.