Source author record

Foo Hui-Mean

Foo Hui-Mean appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computation, Machine Learning, Applications, Methodology

Catalog footprint

What is connected

3 works

4 topics

1 close collaborator

Preprint, 2026, arXiv

Efficient Data Reduction Via PCA-Guided Quantile Based Sampling

In large-scale statistical modeling, reducing data size through subsampling is essential for balancing computational efficiency and statistical accuracy. We propose a new method, Principal Component Analysis guided Quantile Sampling (PCA-QS), which projects data onto principal components and applies quantile-based sampling to retain representative and diverse subsets. Compared with uniform random sampling, leverage score sampling, and coreset methods, PCA-QS consistently achieves lower mean squared error and better preservation of key data characteristics, while also being computationally efficient. This approach is adaptable to a variety of data scenarios and shows strong potential for broad applications in statistical computing.
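The abstract describes projecting data onto principal components and then sampling by quantile. A minimal sketch of that idea follows, under several assumptions that are not taken from the paper: only the first principal component is used (approximated by power iteration), bins are equal-frequency rank bins, and a fixed number of rows is drawn uniformly from each bin. The names `leading_component` and `pca_quantile_sample` are hypothetical, and this should not be read as the authors' implementation.

```python
import random
from statistics import fmean


def leading_component(rows, iters=200, seed=0):
    """Approximate the first principal direction by power iteration.

    Returns the unit direction and the projection score of every row.
    """
    dim = len(rows[0])
    means = [fmean(r[j] for r in rows) for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(iters):
        # w = X^T (X v), which is proportional to the covariance matrix times v.
        scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # degenerate data: no variance along v
            break
        v = [x / norm for x in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
    return v, scores


def pca_quantile_sample(rows, n_bins=5, per_bin=2, seed=0):
    """Return sorted indices of rows sampled from quantile bins of the PC1 score.

    Assumption for illustration: equal-frequency bins and a fixed draw per bin.
    Indices refer to the original rows, so the original features are kept.
    """
    _, scores = leading_component(rows, seed=seed)
    order = sorted(range(len(rows)), key=lambda i: scores[i])
    rng = random.Random(seed)
    chosen = []
    n = len(order)
    for b in range(n_bins):
        bin_idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        chosen.extend(rng.sample(bin_idx, min(per_bin, len(bin_idx))))
    return sorted(chosen)
```

Calling `pca_quantile_sample(rows, n_bins=10, per_bin=5)` on a list of numeric tuples returns up to 50 row indices spread across the whole range of the first-component score, which is the property that uniform random sampling does not guarantee.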

Preprint, 2026, arXiv

Integrating Multi-Armed Bandit, Active Learning, and Distributed Computing for Scalable Optimization

Modern optimization problems in scientific and engineering domains often rely on expensive black-box evaluations, such as those arising in physical simulations or deep learning pipelines, where gradient information is unavailable or unreliable. In these settings, conventional optimization methods quickly become impractical due to prohibitive computational costs and poor scalability. We propose ALMAB-DC, a unified and modular framework for scalable black-box optimization that integrates active learning, multi-armed bandits, and distributed computing, with optional GPU acceleration. The framework leverages surrogate modeling and information-theoretic acquisition functions to guide informative sample selection, while bandit-based controllers dynamically allocate computational resources across candidate evaluations in a statistically principled manner. These decisions are executed asynchronously within a distributed multi-agent system, enabling high-throughput parallel evaluation. We establish theoretical regret bounds for both UCB-based and Thompson-sampling-based variants and develop a scalability analysis grounded in Amdahl's and Gustafson's laws. Empirical results across synthetic benchmarks, reinforcement learning tasks, and scientific simulation problems demonstrate that ALMAB-DC consistently outperforms state-of-the-art black-box optimizers. By design, ALMAB-DC is modular, uncertainty-aware, and extensible, making it particularly well suited for high-dimensional, resource-intensive optimization challenges.
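Two ingredients named in this abstract have well-known textbook forms: a UCB-style bandit rule for deciding which candidate to evaluate next, and the Amdahl and Gustafson speedup formulas. The sketch below shows the standard UCB1 index and the two standard speedup laws. It is not ALMAB-DC itself: the surrogate model, acquisition function, and asynchronous execution are omitted, and the names `ucb1_allocate`, `pull`, `amdahl_speedup`, and `gustafson_speedup` are hypothetical.

```python
import math


def ucb1_allocate(pull, n_arms, budget, c=2.0):
    """Spend `budget` evaluations across arms using the UCB1 index.

    `pull(arm)` is a caller-supplied function returning a reward in [0, 1].
    Returns how many times each arm was evaluated.
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, budget + 1):
        if t <= n_arms:
            arm = t - 1  # evaluate every arm once before using the index
        else:
            # Empirical mean plus an exploration bonus that shrinks with count.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(c * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts


def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's law: fixed problem size, speedup bounded by the serial part."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def gustafson_speedup(parallel_fraction, workers):
    """Gustafson's law: problem size grows with the number of workers."""
    return (1.0 - parallel_fraction) + parallel_fraction * workers
```

With a 90% parallel fraction and 10 workers, the two laws give roughly 5.3x and 9.1x respectively, which illustrates why a scalability analysis might consider both the fixed-size and scaled-size regimes.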

Preprint, 2026, arXiv

PCA-Guided Quantile Sampling: Preserving Data Structure in Large-Scale Subsampling

We introduce Principal Component Analysis guided Quantile Sampling (PCA-QS), a novel sampling framework designed to preserve both the statistical and geometric structure of large-scale datasets. Unlike conventional PCA, which reduces dimensionality at the cost of interpretability, PCA-QS retains the original feature space while using leading principal components solely to guide a quantile-based stratification scheme. This principled design ensures that sampling remains representative without distorting the underlying data semantics. We establish rigorous theoretical guarantees, deriving convergence rates for empirical quantiles, Kullback-Leibler divergence, and Wasserstein distance, thus quantifying the distributional fidelity of PCA-QS samples. Practical guidelines for selecting the number of principal components, quantile bins, and sampling rates are provided based on these results. Extensive empirical studies on both synthetic and real-world datasets show that PCA-QS consistently outperforms simple random sampling, yielding better structure preservation and improved downstream model performance. Together, these contributions position PCA-QS as a scalable, interpretable, and theoretically grounded solution for efficient data summarization in modern machine learning workflows.
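The abstract measures distributional fidelity partly through Wasserstein distance. As an illustration only, the sketch below computes the one-dimensional Wasserstein-1 distance between two empirical samples by integrating the absolute difference of their empirical CDFs. Applying it feature by feature to a subsample and the full dataset is an assumption made here for illustration, not a procedure stated in the paper, and the name `wasserstein_1d` is hypothetical.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two 1-D empirical distributions.

    Uses the identity W1 = integral of |F(x) - G(x)| dx, where F and G are
    the empirical CDFs of the two samples. Sample sizes may differ.
    """
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    total = 0.0
    i = j = 0
    for left, right in zip(points, points[1:]):
        # Advance the pointers so i/len(xs) and j/len(ys) equal the CDFs at `left`.
        while i < len(xs) and xs[i] <= left:
            i += 1
        while j < len(ys) and ys[j] <= left:
            j += 1
        total += abs(i / len(xs) - j / len(ys)) * (right - left)
    return total
```

A value near zero for a given feature suggests the subsample tracks the full data on that coordinate; shifting every value of a sample by a constant `d` yields a distance of exactly `|d|`, which is a convenient sanity check.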
