Source author record

Longxiu Huang

Longxiu Huang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: math.NA, Numerical Analysis, Machine Learning, Information Retrieval, Artificial Intelligence, Digital Libraries, eess.SP, Information Theory, math.CA, math.CV, math.FA, math.IT, math.OC

Catalog footprint

What is connected

8 works

13 topics

4 close collaborators


Preprint, 2022, arXiv

Distributed randomized Kaczmarz for the adversarial workers

Developing large-scale distributed methods that are robust to the presence of adversarial or corrupted workers is an important part of making such methods practical for real-world problems. Here, we propose an iterative approach that is adversary-tolerant for least-squares problems. The algorithm utilizes simple statistics to guarantee convergence and is capable of learning the adversarial distributions. Additionally, the efficiency of the proposed method is shown in simulations in the presence of adversaries. The results demonstrate the great capability of such methods to tolerate different levels of adversary rates and to identify the erroneous workers with high accuracy.
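For orientation, the classical single-machine randomized Kaczmarz iteration can be sketched as follows. This is only a minimal illustration of the base method named in the title, not the adversary-tolerant distributed algorithm of the paper; the function name `randomized_kaczmarz`, the test problem, and the iteration count are assumptions made for this sketch.

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=5000, seed=0):
    """Classical randomized Kaczmarz for a consistent system A x = b.

    Rows are sampled with probability proportional to their squared norms,
    and the iterate is projected onto the hyperplane of the sampled row.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms_sq = np.sum(A**2, axis=1)
    probs = row_norms_sq / row_norms_sq.sum()
    rows = rng.choice(m, size=iters, p=probs)  # pre-sample all row indices
    x = np.zeros(n)
    for i in rows:
        residual = b[i] - A[i] @ x
        x += (residual / row_norms_sq[i]) * A[i]
    return x

# Hypothetical test problem: overdetermined, consistent, Gaussian entries.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
x_true = rng.standard_normal(10)
b = A @ x_true

x_hat = randomized_kaczmarz(A, b)
print("recovery error:", np.linalg.norm(x_hat - x_true))
```

In a distributed setting each worker would return such a row update; how corrupted updates are detected and discarded is the subject of the paper and is not modeled here.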

Preprint, 2022, arXiv

Guided Semi-Supervised Non-negative Matrix Factorization on Legal Documents

Classification and topic modeling are popular techniques in machine learning that extract information from large-scale datasets. By incorporating a priori information such as labels or important features, methods have been developed to perform classification and topic modeling tasks; however, most methods that can perform both do not allow for guidance of the topics or features. In this paper, we propose a method, namely Guided Semi-Supervised Non-negative Matrix Factorization (GSSNMF), that performs both classification and topic modeling by incorporating supervision from both pre-assigned document class labels and user-designed seed words. We test the performance of this method through its application to legal documents provided by the California Innocence Project, a nonprofit that works to free innocent convicted persons and reform the justice system. The results show that our proposed method improves both classification accuracy and topic coherence in comparison to past methods like Semi-Supervised Non-negative Matrix Factorization (SSNMF) and Guided Non-negative Matrix Factorization (Guided NMF).

Preprint, 2021, arXiv

Rapid Robust Principal Component Analysis: CUR Accelerated Inexact Low Rank Estimation

Robust principal component analysis (RPCA) is a widely used tool for dimension reduction. In this work, we propose a novel non-convex algorithm, coined Iterated Robust CUR (IRCUR), for solving RPCA problems, which dramatically improves the computational efficiency in comparison with the existing algorithms. IRCUR achieves this acceleration by employing CUR decomposition when updating the low rank component, which allows us to obtain an accurate low rank approximation via only three small submatrices. Consequently, IRCUR is able to process only the small submatrices and avoid expensive computing on the full matrix throughout the entire algorithm. Numerical experiments establish the computational advantage of IRCUR over the state-of-the-art algorithms on both synthetic and real-world datasets.

Preprint, 2021, arXiv

Sampling the flow of a bandlimited function

We analyze the problem of reconstruction of a bandlimited function $f$ from the space-time samples of its states $f_t=ϕ_t\ast f$ resulting from the convolution with a kernel $ϕ_t$. It is well known that, in natural phenomena, uniform space-time samples of $f$ are not sufficient to reconstruct $f$ in a stable way. To enable stable reconstruction, a space-time sampling with periodic nonuniformly spaced samples must be used as was shown by Lu and Vetterli. We show that the stability of reconstruction, as measured by a condition number, controls the maximal gap between the spatial samples. We provide a quantitative statement of this result. In addition, instead of irregular space-time samples, we show that uniform dynamical samples at sub-Nyquist spatial rate allow one to stably reconstruct the function $\widehat f$ away from certain, explicitly described blind spots. We also consider several classes of finite dimensional subsets of bandlimited functions in which the stable reconstruction is possible, even inside the blind spots. We obtain quantitative estimates for it using Remez-Turán type inequalities. En route, we obtain a Remez-Turán inequality for prolate spheroidal wave functions. To illustrate our results, we present some numerics and explicit estimates for the heat flow problem.

Preprint, 2020, arXiv

COVID-19 Literature Topic-Based Search via Hierarchical NMF

A dataset of COVID-19-related scientific literature is compiled, combining the articles from several online libraries and selecting those with open access and full text available. Then, hierarchical nonnegative matrix factorization is used to organize literature related to the novel coronavirus into a tree structure that allows researchers to search for relevant literature based on detected topics. We discover eight major latent topics and 52 granular subtopics in the body of literature, related to vaccines, genetic structure and modeling of the disease and patient studies, as well as related diseases and virology. In order that our tool may help current researchers, an interactive website is created that organizes available literature using this hierarchical structure.
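As a rough illustration of the building block behind this abstract, the sketch below runs a single level of nonnegative matrix factorization with the standard multiplicative updates and assigns each document to its dominant topic. It is not the paper's pipeline: the toy matrix, the rank, the function name `nmf`, and the idea of recursing on each topic's documents to obtain a hierarchy are all assumptions made here.

```python
import numpy as np

def nmf(X, rank, iters=300, seed=0, eps=1e-10):
    """Factor a nonnegative matrix X (terms x documents) as W @ H.

    Uses the classical multiplicative updates for the Frobenius objective;
    eps guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))   # term-by-topic weights
    H = rng.random((rank, n))   # topic-by-document weights
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy "term-document" matrix with three planted topics.
rng = np.random.default_rng(0)
X = rng.random((40, 3)) @ rng.random((3, 25))

W, H = nmf(X, rank=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
topic_of_doc = H.argmax(axis=0)  # dominant topic per document
print("relative error:", rel_err)
```

One simple way to obtain a tree, under these assumptions, would be to call `nmf` again on the columns of `X` belonging to each value of `topic_of_doc`, yielding subtopics beneath each top-level topic.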

Preprint, 2020, arXiv

Perturbations of CUR Decompositions

The CUR decomposition is a factorization of a low-rank matrix obtained by selecting certain column and row submatrices of it. We perform a thorough investigation of what happens to such decompositions in the presence of noise. Since CUR decompositions are non-uniquely formed, we investigate several variants and give perturbation estimates for each in terms of the magnitude of the noise matrix in a broad class of norms which includes all Schatten $p$-norms. The estimates given here are qualitative and illustrate how the choice of columns and rows affects the quality of the approximation, and additionally we obtain new state-of-the-art bounds for some variants of CUR approximations.
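A minimal sketch of the noise-free case may help fix notation. For a rank-$k$ matrix $A$ with column submatrix $C = A(:, J)$, row submatrix $R = A(I, :)$ and intersection $U = A(I, J)$, one has $A = C U^{\dagger} R$ whenever $\operatorname{rank}(U) = \operatorname{rank}(A)$. The matrix sizes, the uniformly sampled index sets, and the pseudoinverse cutoff `rcond=1e-10` below are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 3
# Hypothetical exactly rank-k matrix (50 x 40).
A = rng.standard_normal((50, k)) @ rng.standard_normal((k, 40))

# Pick a few more rows and columns than the rank, uniformly at random.
rows = rng.choice(50, size=6, replace=False)
cols = rng.choice(40, size=6, replace=False)

C = A[:, cols]              # selected columns
R = A[rows, :]              # selected rows
U = A[np.ix_(rows, cols)]   # intersection of the selected rows and columns

# U is 6 x 6 but has rank 3, so use a pseudoinverse with an explicit cutoff
# that discards singular values at round-off level.
A_cur = C @ np.linalg.pinv(U, rcond=1e-10) @ R

print("reconstruction error:", np.linalg.norm(A - A_cur))
```

With noise added to $A$, the small singular values of $U$ are no longer at round-off level, which is where the choice of variant and of rows and columns discussed in the abstract starts to matter.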

Preprint, 2020, arXiv

Stability of Sampling for CUR Decompositions

This article studies how to form CUR decompositions of low-rank matrices via primarily random sampling, though deterministic methods due to previous works are illustrated as well. The primary problem is to determine when a column submatrix of a rank $k$ matrix also has rank $k$. For random column sampling schemes, there is typically a tradeoff between the number of columns needed to be chosen and the complexity of determining the sampling probabilities. We discuss several sampling methods and their complexities as well as stability of the method under perturbations of both the probabilities and the underlying matrix. As an application, we give a high probability guarantee of the exact solution of the Subspace Clustering Problem via CUR decompositions when columns are sampled according to their Euclidean lengths.
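The column-length sampling scheme mentioned at the end of the abstract can be sketched as below, assuming that "according to their Euclidean lengths" means probabilities proportional to squared column norms (a common convention; the paper may normalize differently). The helper name `sample_columns_by_length`, the matrix sizes, and the number of sampled columns are hypothetical.

```python
import numpy as np

def sample_columns_by_length(A, num_cols, rng):
    """Sample column indices with probability proportional to squared norm."""
    col_norms_sq = np.sum(A**2, axis=0)
    probs = col_norms_sq / col_norms_sq.sum()
    return rng.choice(A.shape[1], size=num_cols, replace=True, p=probs)

rng = np.random.default_rng(0)
k = 4
# Hypothetical rank-k matrix (30 x 60).
A = rng.standard_normal((30, k)) @ rng.standard_normal((k, 60))

cols = sample_columns_by_length(A, num_cols=12, rng=rng)
C = A[:, cols]

# The primary question in the abstract: does the column submatrix keep rank k?
rank_A = np.linalg.matrix_rank(A, tol=1e-8)
rank_C = np.linalg.matrix_rank(C, tol=1e-8)
print("rank(A) =", rank_A, " rank(C) =", rank_C)
```

Computing these probabilities requires one pass over the matrix, which is cheaper than leverage-score sampling; that is one instance of the tradeoff between the number of columns and the cost of the sampling probabilities.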

Preprint, 2020, arXiv

Tensor Completion through Total Variation with Initialization from Weighted HOSVD

We study the tensor completion problem when the sampling pattern is deterministic. We first propose a simple but efficient weighted HOSVD algorithm for recovery from noisy observations. We then use the weighted HOSVD result as an initialization for a total variation method. We establish the accuracy of the weighted HOSVD algorithm from both theoretical and numerical perspectives. Numerical simulations also show that, with the proposed initialization, the total variation algorithm can efficiently fill in the missing data for images and videos.
