Source author record

Edmond Chow

Edmond Chow appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.NA · Numerical Analysis · Distributed, Parallel, and Cluster Computing · cond-mat.mtrl-sci · Machine Learning · math.OC · physics.comp-ph

Catalog footprint

What is connected

7 works

7 topics

4 close collaborators

preprint · 2026 · arXiv

HiGP: A high-performance Python package for Gaussian Process

Gaussian Processes (GPs) are flexible, nonparametric Bayesian models widely used for regression and classification because of their ability to capture complex data patterns and quantify predictive uncertainty. However, the O(n^3) computational cost of kernel matrix operations poses a major obstacle to applying GPs at scale. HiGP is a high-performance Python package designed to overcome these scalability limitations through advanced numerical linear algebra and hierarchical kernel representations. It integrates H^2 matrices to achieve near-linear complexity in both storage and computation for spatial datasets, supports on-the-fly kernel evaluation to avoid explicit storage in large-scale problems, and incorporates a robust Adaptive Factorized Nyström (AFN) preconditioner that accelerates convergence of iterative solvers across a broad range of kernel spectra. These computational kernels are implemented in C++ for maximum performance and exposed through Python interfaces, enabling seamless integration with modern machine learning workflows. HiGP also includes analytically derived gradient computations for efficient hyperparameter optimization, avoiding the inefficiencies of automatic differentiation in iterative solvers. By serving as a reusable numerical engine, HiGP complements existing GP frameworks such as GPJax, KeOps, and GaussianProcesses.jl, providing a reliable and scalable computational backbone for large-scale Gaussian Process regression and classification.
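The cost trade-off described in this abstract can be illustrated with a minimal NumPy sketch. It is not HiGP's API: `rbf_kernel`, `conjugate_gradient`, the noise level, and the synthetic data are all assumptions made for illustration. The sketch solves the regularized kernel system once with a dense direct solve, which costs O(n^3), and once with an unpreconditioned conjugate gradient iteration that only needs matrix-vector products. Matrix-vector products are the operation that hierarchical representations and preconditioners are meant to accelerate.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    # Squared-exponential kernel k(x, y) = exp(-||x - y||^2 / (2 l^2)).
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    # Plain (unpreconditioned) CG for an SPD operator given only as a matvec.
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol * b_norm:
            break
        Ap = matvec(p)
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Synthetic 1D regression data (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

noise = 0.1  # assumed noise variance added to the kernel diagonal
K = rbf_kernel(X, X) + noise * np.eye(200)

alpha_direct = np.linalg.solve(K, y)               # dense solve, O(n^3)
alpha_cg = conjugate_gradient(lambda v: K @ v, y)  # matvec-only solve

print(np.max(np.abs(alpha_cg - alpha_direct)))
```

Here the matrix-vector product is still dense, so each CG step costs O(n^2). Replacing `lambda v: K @ v` with a fast hierarchical product is where a near-linear method would plug in.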

preprint · 2022 · arXiv

Data-driven Construction of Hierarchical Matrices with Nested Bases

Hierarchical matrices provide a powerful representation for significantly reducing the computational complexity associated with dense kernel matrices. For general kernel functions, interpolation-based methods are widely used for the efficient construction of hierarchical matrices. In this paper, we present a fast hierarchical data reduction (HiDR) procedure with $O(n)$ complexity for the memory-efficient construction of hierarchical matrices with nested bases, where $n$ is the number of data points. HiDR aims to reduce the given data in a hierarchical way so as to obtain $O(1)$ representations for all nearfield and farfield interactions. Based on HiDR, a linear complexity $\mathcal{H}^2$ matrix construction algorithm is proposed. The use of data-driven methods enables better efficiency than other general-purpose methods and flexible computation without accessing the kernel function. Experiments demonstrate significantly improved memory efficiency of the proposed data-driven method compared to interpolation-based methods over a wide range of kernels. Though the method is not optimized for any special kernel, benchmark experiments for the Coulomb kernel show that the proposed general-purpose algorithm offers competitive performance for hierarchical matrix construction compared to several state-of-the-art algorithms for the Coulomb kernel.
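Hierarchical matrix formats rely on kernel blocks between well-separated point clusters being numerically low rank. The sketch below illustrates only that property, not the HiDR procedure itself. The one-dimensional clusters, the $1/|x - y|$ kernel, and the $10^{-8}$ truncation tolerance are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Two well-separated 1D clusters (assumed geometry for illustration).
sources = rng.uniform(0.0, 1.0, size=n)
targets = rng.uniform(3.0, 4.0, size=n)

# Farfield block of a Coulomb-like kernel k(x, y) = 1 / |x - y|.
block = 1.0 / np.abs(targets[:, None] - sources[None, :])

# Numerical rank: singular values above a relative tolerance.
U, s, Vt = np.linalg.svd(block, full_matrices=False)
rank = int(np.sum(s > 1e-8 * s[0]))

# Truncated factorization that stores about 2 * n * rank numbers instead of n * n.
approx = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
rel_err = np.linalg.norm(block - approx) / np.linalg.norm(block)

print(rank, "of", n, "relative error", rel_err)
```

A hierarchical format applies this kind of compression to every farfield block in a cluster tree. Nested bases additionally share the factors between parent and child clusters.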

preprint · 2022 · arXiv

Large-Scale Maintenance and Unit Commitment: A Decentralized Subgradient Approach

Unit Commitment (UC) is a fundamental problem in power system operations. When coupled with generation maintenance, the joint optimization problem poses significant computational challenges due to coupling constraints linking maintenance and UC decisions. These challenges grow with the size of the network. With the introduction of sensors for monitoring generator health and condition-based maintenance (CBM), these challenges have been magnified. ADMM-based decentralized methods have shown promise in solving large-scale UC problems, especially in vertically integrated power systems. However, in their current form, these methods fail to deliver similar computational performance and scalability when considering the joint UC and CBM problem. This paper provides a novel decentralized optimization framework for solving large-scale, joint UC and CBM problems. Our approach relies on the novel use of the subgradient method to temporally decouple various subproblems of the ADMM-based formulation of the joint problem along the maintenance horizon. By effectively utilizing multithreading, our decentralized subgradient approach delivers superior computational performance and eliminates the need to move sensor data, thereby alleviating privacy and security concerns. Using experiments on large-scale test cases, we show that our framework can provide a speedup of up to 50x compared to various state-of-the-art benchmarks without compromising solution quality.

preprint · 2021 · arXiv

Efficient construction of an HSS preconditioner for symmetric positive definite $\mathcal{H}^2$ matrices

In an iterative approach for solving linear systems with ill-conditioned, symmetric positive definite (SPD) kernel matrices, both fast matrix-vector products and fast preconditioning operations are required. Fast (linear-scaling) matrix-vector products are available by expressing the kernel matrix in an $\mathcal{H}^2$ representation or an equivalent fast multipole method representation. Preconditioning such matrices, however, requires a structured matrix approximation that is more regular than the $\mathcal{H}^2$ representation, such as the hierarchically semiseparable (HSS) matrix representation, which provides fast solve operations. Previously, an algorithm was presented to construct an HSS approximation to an SPD kernel matrix that is guaranteed to be SPD. However, this algorithm has quadratic cost and was only designed for recursive binary partitionings of the points defining the kernel matrix. This paper presents a general algorithm for constructing an SPD HSS approximation. Importantly, the algorithm uses the $\mathcal{H}^2$ representation of the SPD matrix to reduce its computational complexity from quadratic to quasilinear. Numerical experiments illustrate how this SPD HSS approximation performs as a preconditioner for solving linear systems arising from a range of kernel functions.

preprint · 2020 · arXiv

Asynchronous One-Level and Two-Level Domain Decomposition Solvers

Parallel implementations of linear iterative solvers generally alternate between phases of data exchange and phases of local computation. Increasingly large problem sizes on more heterogeneous systems make load balancing and network layout very challenging tasks. In particular, global communication patterns such as inner products become increasingly limiting at scale. We explore the use of asynchronous communication based on one-sided MPI primitives in a multitude of domain decomposition solvers. In particular, a scalable asynchronous two-level method is presented. We discuss practical issues encountered in the development of a scalable solver and show experimental results obtained on state-of-the-art supercomputer systems that illustrate the benefits of asynchronous solvers in load-balanced as well as load-imbalanced scenarios. Using the novel method, we observe speed-ups of up to 4x over its classical synchronous equivalent.

preprint · 2020 · arXiv

Asynchronous Richardson iterations

We consider asynchronous versions of the first and second order Richardson methods for solving linear systems of equations. These methods depend on parameters whose values are chosen a priori. We explore the parameter values that can be proven to give convergence of the asynchronous methods. This is the first such analysis for asynchronous second order methods. We find that for the first order method, the optimal parameter value for the synchronous case also gives an asynchronously convergent method. For the second order method, the parameter ranges for which we can prove asynchronous convergence do not contain the optimal parameter values for the synchronous iteration. In practice, however, the asynchronous second order iterations may still converge using the optimal parameter values, or parameter values close to the optimal ones, despite this result. We explore this behavior with a multithreaded parallel implementation of the asynchronous methods.
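For reference, the synchronous first and second order Richardson iterations can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's asynchronous implementation. The function names, the 1D Laplacian test matrix, and the iteration count are assumptions. The parameter formulas are the classical optimal choices for a symmetric positive definite matrix with extreme eigenvalues $\lambda_{\min}$ and $\lambda_{\max}$, and the paper's own parametrization may differ.

```python
import numpy as np

def richardson_first_order(A, b, omega, num_iters):
    # x_{k+1} = x_k + omega * (b - A x_k)
    x = np.zeros_like(b)
    for _ in range(num_iters):
        x = x + omega * (b - A @ x)
    return x

def richardson_second_order(A, b, alpha, beta, num_iters):
    # x_{k+1} = x_k + alpha * (b - A x_k) + beta * (x_k - x_{k-1})
    x_prev = np.zeros_like(b)
    x = np.zeros_like(b)
    for _ in range(num_iters):
        x_next = x + alpha * (b - A @ x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# SPD test problem: 1D Laplacian (assumed for illustration).
n = 20
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

eigenvalues = np.linalg.eigvalsh(A)
lam_min, lam_max = eigenvalues[0], eigenvalues[-1]

# Classical optimal parameters for the synchronous iterations.
omega = 2.0 / (lam_min + lam_max)
sqrt_min, sqrt_max = np.sqrt(lam_min), np.sqrt(lam_max)
alpha = 4.0 / (sqrt_max + sqrt_min) ** 2
beta = ((sqrt_max - sqrt_min) / (sqrt_max + sqrt_min)) ** 2

x_exact = np.linalg.solve(A, b)
err1 = np.linalg.norm(richardson_first_order(A, b, omega, 200) - x_exact)
err2 = np.linalg.norm(richardson_second_order(A, b, alpha, beta, 200) - x_exact)

print("first order error:", err1)
print("second order error:", err2)
```

On this problem the second order iteration converges far faster for the same number of matrix-vector products. This is why the behavior of its optimal parameters under asynchronous execution is of practical interest.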

preprint · 2020 · arXiv

SPARC: Simulation Package for Ab-initio Real-space Calculations

We present SPARC: Simulation Package for Ab-initio Real-space Calculations. SPARC can perform Kohn-Sham density functional theory calculations for isolated systems such as molecules as well as extended systems such as crystals and surfaces, in both static and dynamic settings. It is straightforward to install/use and highly competitive with state-of-the-art planewave codes, demonstrating comparable performance on a small number of processors and increasing advantages as the number of processors grows. Notably, SPARC brings solution times down to a few seconds for systems with $\mathcal{O}(100-500)$ atoms on large-scale parallel computers, outperforming planewave counterparts by an order of magnitude and more.
