Source author record

Lijun Ding

Lijun Ding appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning math.OC Artificial Intelligence Information Theory math.IT math.ST Statistics Theory

Catalog footprint

What is connected

5works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

TailedTS: Benchmark Dataset for Heavy-Tailed Time Series Prediction and Periodicity Quantification

We present TailedTS, a large-scale benchmark dataset derived from Wikipedia hourly page view observations throughout 2024, specifically designed to test time series forecasting models under heavy-tailed, zero-inflated, and non-Gaussian conditions. The dataset comprises approximately 24.69 billion data points spanning roughly 3 million unique Wikipedia pages per month, stored in high-efficiency Apache Parquet format. Wikipedia traffic follows a pronounced power-law distribution where roughly 5% of pages account for over 70% of total page views, creating a natural and rigorous testbed for model robustness against extreme volatility that are absent from or underrepresented in existing benchmarks such as M4, M5, and UCI electricity datasets. TailedTS enables several research tasks. First, we introduce a periodicity quantification framework based on sparse autoregression with sparsity and non-negativity constraints, revealing that frequently-viewed pages exhibit significantly weaker periodic structure than their less-viewed counterparts, showing direct implications for server allocation and traffic forecasting on large digital platforms. Second, we provide standardized prediction benchmarks evaluated under a suite of non-Gaussian loss functions, including $\ell_1$-norm, Huber, quantile, and $\ell_p$-norm losses, demonstrating that standard Gaussian-based estimators degrade substantially on high-volume page categories, while robust alternatives provide consistent gains across all traffic scales. TailedTS is publicly available at https://doi.org/10.5281/zenodo.17070469.

preprint2021arXiv

Bundle Method Sketching for Low Rank Semidefinite Programming

In this paper, we show that the bundle method can be applied to solve semidefinite programming problems with a low rank solution without ever constructing a full matrix. To accomplish this, we use recent results from randomly sketching matrix optimization problems and from the analysis of bundle methods. Under strong duality and strict complementarity of SDP, our algorithm produces primal and the dual sequences converging in feasibility at a rate of $\tilde{O}(1/ε)$ and in optimality at a rate of $\tilde{O}(1/ε^2)$. Moreover, our algorithm outputs a low rank representation of its approximate solution with distance to the optimal solution at most $O(\sqrtε)$ within $\tilde{O}(1/ε^2)$ iterations.

preprint2020arXiv

An Optimal-Storage Approach to Semidefinite Programming using Approximate Complementarity

This paper develops a new storage-optimal algorithm that provably solves generic semidefinite programs (SDPs) in standard form. This method is particularly effective for weakly constrained SDPs. The key idea is to formulate an approximate complementarity principle: Given an approximate solution to the dual SDP, the primal SDP has an approximate solution whose range is contained in the eigenspace with small eigenvalues of the dual slack matrix. For weakly constrained SDPs, this eigenspace has very low dimension, so this observation significantly reduces the search space for the primal solution. This result suggests an algorithmic strategy that can be implemented with minimal storage: (1) Solve the dual SDP approximately; (2) compress the primal SDP to the eigenspace with small eigenvalues of the dual slack matrix; (3) solve the compressed primal SDP. The paper also provides numerical experiments showing that this approach is successful for a range of interesting large-scale SDPs.

preprint2020arXiv

Leave-one-out Approach for Matrix Completion: Primal and Dual Analysis

In this paper, we introduce a powerful technique based on Leave-one-out analysis to the study of low-rank matrix completion problems. Using this technique, we develop a general approach for obtaining fine-grained, entrywise bounds for iterative stochastic procedures in the presence of probabilistic dependency. We demonstrate the power of this approach in analyzing two of the most important algorithms for matrix completion: (i) the non-convex approach based on Projected Gradient Descent (PGD) for a rank-constrained formulation, also known as the Singular Value Projection algorithm, and (ii) the convex relaxation approach based on nuclear norm minimization (NNM). Using this approach, we establish the first convergence guarantee for the original form of PGD without regularization or sample splitting}, and in particular shows that it converges linearly in the infinity norm. For NNM, we use this approach to study a fictitious iterative procedure that arises in the dual analysis. Our results show that \NNM recovers an $ d $-by-$ d $ rank-$ r $ matrix with $\mathcal{O}(μr \log(μr) d \log d )$ observed entries. This bound has optimal dependence on the matrix dimension and is independent of the condition number. To the best of our knowledge, this is the first sample complexity result for a tractable matrix completion algorithm that satisfies these two properties simultaneously.

preprint2020arXiv

Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence

We develop a novel variant of the classical Frank-Wolfe algorithm, which we call spectral Frank-Wolfe, for convex optimization over a spectrahedron. The spectral Frank-Wolfe algorithm has a novel ingredient: it computes a few eigenvectors of the gradient and solves a small-scale SDP in each iteration. Such procedure overcomes slow convergence of the classical Frank-Wolfe algorithm due to ignoring eigenvalue coalescence. We demonstrate that strict complementarity of the optimization problem is key to proving linear convergence of various algorithms, such as the spectral Frank-Wolfe algorithm as well as the projected gradient method and its accelerated version.

Lijun Ding

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

TailedTS: Benchmark Dataset for Heavy-Tailed Time Series Prediction and Periodicity Quantification

Bundle Method Sketching for Low Rank Semidefinite Programming

An Optimal-Storage Approach to Semidefinite Programming using Approximate Complementarity

Leave-one-out Approach for Matrix Completion: Primal and Dual Analysis

Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence