Researcher profile

Esmond Ng

Esmond Ng contributes to research discovery and scholarly infrastructure.

Published work

4 published items

Preprint, 2016, arXiv

An Asynchronous Task-based Fan-Both Sparse Cholesky Solver

Systems of linear equations arise at the heart of many scientific and engineering applications. Many of these linear systems are sparse; i.e., most of the elements in the coefficient matrix are zero. Direct methods based on matrix factorizations are sometimes needed to ensure accurate solutions. For example, accurate solution of sparse linear systems is needed in shift-invert Lanczos to compute interior eigenvalues. The performance and resource usage of sparse matrix factorizations are critical to time-to-solution and maximum problem size solvable on a given platform. In many applications, the coefficient matrices are symmetric, and exploiting symmetry will reduce both the amount of work and storage cost required for factorization. When the factorization is performed on large-scale distributed memory platforms, communication cost is critical to the performance of the algorithm. At the same time, network topologies have become increasingly complex, so that modern platforms exhibit a high level of performance variability. This makes scheduling of computations an intricate and performance-critical task. In this paper, we investigate the use of an asynchronous task paradigm, one-sided communication and dynamic scheduling in implementing sparse Cholesky factorization (symPACK) on large-scale distributed memory platforms. Our solver symPACK relies on efficient and flexible communication primitives provided by the UPC++ library. Performance evaluation shows good scalability and that symPACK outperforms state-of-the-art parallel distributed memory factorization packages, validating our approach on practical cases.

Preprint, 2015, arXiv

Ab Initio No Core Shell Model - Recent Results and Further Prospects

There has been significant recent progress in solving the long-standing problems of how nuclear shell structure and collective motion emerge from underlying microscopic inter-nucleon interactions. We review a selection of recent significant results within the ab initio No Core Shell Model (NCSM) closely tied to three major factors enabling this progress: (1) improved nuclear interactions that accurately describe the experimental two-nucleon and three-nucleon interaction data; (2) advances in algorithms to simulate the quantum many-body problem with strong interactions; and (3) continued rapid development of high-performance computers now capable of performing $20 \times 10^{15}$ floating point operations per second. We also comment on prospects for further developments.

Preprint, 2013, arXiv

On the minimum FLOPs problem in the sparse Cholesky factorization

Prior to computing the Cholesky factorization of a sparse, symmetric positive definite matrix, a reordering of the rows and columns is computed so as to reduce both the number of fill elements in the Cholesky factor and the number of arithmetic operations (FLOPs) in the numerical factorization. These two metrics are clearly related, and yet it is suspected that the two minimization problems are different. However, no rigorous theoretical treatment of the relation between these two problems seems to have been given yet. In this paper we show by means of an explicit, scalable construction that the two problems are different in a very strict sense. In our construction, no ordering that is optimal for the fill is optimal with respect to the number of FLOPs, and vice versa. Further, it is commonly believed that minimizing the number of FLOPs is no easier than minimizing the fill (in the complexity sense), but so far no proof appears to be known. We give a reduction chain that shows the NP-hardness of minimizing the number of arithmetic operations in the Cholesky factorization.
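
As an illustration of the two metrics named in this abstract, the following minimal sketch counts fill and FLOPs for a given elimination ordering by running the standard elimination game on the matrix's adjacency graph. It is not the construction from the paper and does not reproduce its result; it only shows that both quantities depend on the ordering. The function name `symbolic_cholesky_cost`, the example graph `STAR_EDGES`, and the FLOP model (one square root, c divisions, and c(c+1)/2 multiply-subtract pairs per column with c off-diagonal nonzeros, i.e. (c+1)^2 operations) are assumptions chosen for this example.

```python
from itertools import combinations


def symbolic_cholesky_cost(n, edges, order):
    """Return (fill, flops) for eliminating vertices 0..n-1 in `order`.

    `edges` lists the off-diagonal nonzeros (i, j) of a symmetric matrix.
    FLOP model (an assumption): a column with c off-diagonal nonzeros
    costs 1 sqrt + c divisions + c*(c+1) multiply/subtract operations,
    which sums to (c + 1)**2.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    fill = 0
    flops = 0
    for v in order:
        # Remaining neighbours of v form the off-diagonal pattern of column v.
        nbrs = sorted(adj[v])
        c = len(nbrs)
        flops += (c + 1) ** 2
        # Eliminating v makes its remaining neighbours pairwise adjacent;
        # every edge created here is one fill element.
        for a, b in combinations(nbrs, 2):
            if b not in adj[a]:
                adj[a].add(b)
                adj[b].add(a)
                fill += 1
        # Remove v from the graph.
        for a in nbrs:
            adj[a].discard(v)
        adj[v].clear()
    return fill, flops


# Hypothetical example: a 5x5 "arrow" matrix, i.e. a star graph centred on 0.
STAR_EDGES = [(0, 1), (0, 2), (0, 3), (0, 4)]

center_first = symbolic_cholesky_cost(5, STAR_EDGES, [0, 1, 2, 3, 4])
center_last = symbolic_cholesky_cost(5, STAR_EDGES, [1, 2, 3, 4, 0])
print(center_first)  # (6, 55): the factor becomes dense
print(center_last)   # (0, 17): no fill at all
```

In this small case the ordering that minimizes fill also minimizes the FLOP count; the abstract's claim is that instances exist where the two optima never coincide.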

Preprint, 2009, arXiv

Ab initio nuclear structure - the large sparse matrix eigenvalue problem

The structure and reactions of light nuclei represent fundamental and formidable challenges for microscopic theory based on realistic strong interaction potentials. Several ab initio methods have now emerged that provide nearly exact solutions for some nuclear properties. The ab initio no core shell model (NCSM) and the no core full configuration (NCFC) method frame this quantum many-particle problem as a large sparse matrix eigenvalue problem where one evaluates the Hamiltonian matrix in a basis space consisting of many-fermion Slater determinants and then solves for a set of the lowest eigenvalues and their associated eigenvectors. The resulting eigenvectors are employed to evaluate a set of experimental quantities to test the underlying potential. For fundamental problems of interest, the matrix dimension often exceeds $10^{10}$ and the number of nonzero matrix elements may saturate available storage on present-day leadership class facilities. We survey recent results and advances in solving this large sparse matrix eigenvalue problem. We also outline the challenges that lie ahead for achieving further breakthroughs in fundamental nuclear theory using these ab initio approaches.