Source author record

Yixuan Qiu

Yixuan Qiu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Computation Artificial Intelligence Mathematical Software Methodology Social and Information Networks

Catalog footprint

What is connected

4works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

cuRegOT: A GPU-Accelerated Solver for Entropic-Regularized Optimal Transport

Optimal transport (OT) has emerged as a fundamental tool in modern machine learning, yet its computational cost remains a significant bottleneck for large-scale applications. While harnessing the massive parallelism of modern GPU hardware is critical for efficiency, the de facto standard Sinkhorn algorithm, despite its ease of parallelization, often suffers from slow convergence in challenging problems. More recently, the sparse-plus-low-rank quasi-Newton method offers a balance between convergence rate and per-iteration complexity; however, its efficiency on GPUs is severely hindered by the serial nature of sparse matrix symbolic analysis and irregular memory access patterns. To bridge this gap, we present cuRegOT, a high-performance GPU solver tailored for entropic-regularized OT. We introduce a suite of algorithmic and architectural optimizations, including an amortized symbolic analysis strategy to mitigate CPU bottlenecks, an asynchronous Sinkhorn iterates generation mechanism, and a fused kernel for bandwidth-efficient gradient evaluation. These strategies are backed by rigorous theoretical guarantees ensuring algorithmic convergence. Extensive numerical experiments demonstrate that cuRegOT achieves significant speedups over state-of-the-art GPU-based solvers across a variety of benchmark tasks.

preprint2022arXiv

Learning Multitask Gaussian Bayesian Networks

Major depressive disorder (MDD) requires study of brain functional connectivity alterations for patients, which can be uncovered by resting-state functional magnetic resonance imaging (rs-fMRI) data. We consider the problem of identifying alterations of brain functional connectivity for a single MDD patient. This is particularly difficult since the amount of data collected during an fMRI scan is too limited to provide sufficient information for individual analysis. Additionally, rs-fMRI data usually has the characteristics of incompleteness, sparsity, variability, high dimensionality and high noise. To address these problems, we proposed a multitask Gaussian Bayesian network (MTGBN) framework capable for identifying individual disease-induced alterations for MDD patients. We assume that such disease-induced alterations show some degrees of similarity with the tool to learn such network structures from observations to understanding of how system are structured jointly from related tasks. First, we treat each patient in a class of observation as a task and then learn the Gaussian Bayesian networks (GBNs) of this data class by learning from all tasks that share a default covariance matrix that encodes prior knowledge. This setting can help us to learn more information from limited data. Next, we derive a closed-form formula of the complete likelihood function and use the Monte-Carlo Expectation-Maximization(MCEM) algorithm to search for the approximately best Bayesian network structures efficiently. Finally, we assess the performance of our methods with simulated and real-world rs-fMRI data.

preprint2022arXiv

Randomized spectral co-clustering for large-scale directed networks

Directed networks are broadly used to represent asymmetric relationships among units. Co-clustering aims to cluster the senders and receivers of directed networks simultaneously. In particular, the well-known spectral clustering algorithm could be modified as the spectral co-clustering to co-cluster directed networks. However, large-scale networks pose great computational challenges to it. In this paper, we leverage sketching techniques and derive two randomized spectral co-clustering algorithms, one \emph{random-projection-based} and the other \emph{random-sampling-based}, to accelerate the co-clustering of large-scale directed networks. We theoretically analyze the resulting algorithms under two generative models -- the stochastic co-block model and the degree-corrected stochastic co-block model, and establish their approximation error rates and misclustering error rates, indicating better bounds than the state-of-the-art results of co-clustering literature. Numerically, we design and conduct simulations to support our theoretical results and test the efficiency of the algorithms on real networks with up to millions of nodes. A publicly available R package \textsf{RandClust} is developed for better usability and reproducibility of the proposed methods.

preprint2020arXiv

Stochastic Approximate Gradient Descent via the Langevin Algorithm

We introduce a novel and efficient algorithm called the stochastic approximate gradient descent (SAGD), as an alternative to the stochastic gradient descent for cases where unbiased stochastic gradients cannot be trivially obtained. Traditional methods for such problems rely on general-purpose sampling techniques such as Markov chain Monte Carlo, which typically requires manual intervention for tuning parameters and does not work efficiently in practice. Instead, SAGD makes use of the Langevin algorithm to construct stochastic gradients that are biased in finite steps but accurate asymptotically, enabling us to theoretically establish the convergence guarantee for SAGD. Inspired by our theoretical analysis, we also provide useful guidelines for its practical implementation. Finally, we show that SAGD performs well experimentally in popular statistical and machine learning problems such as the expectation-maximization algorithm and the variational autoencoders.