Source author record

Michael E. Wall

Michael E. Wall appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: cond-mat.mtrl-sci; physics.chem-ph; Biological Physics; Biomolecules; Distributed, Parallel, and Cluster Computing; Molecular Networks; Performance; physics.comp-ph; Quantitative Methods

Catalog footprint

What is connected

6 works

9 topics

4 close collaborators

preprint, 2022, arXiv

Accelerating X-Ray Tracing for Exascale Systems using Kokkos

The upcoming exascale computing systems Frontier and Aurora will draw much of their computing power from GPU accelerators. The hardware for these systems will be provided by AMD and Intel, respectively, each supporting their own GPU programming model. The challenge for applications that harness one of these exascale systems will be to avoid lock-in and to preserve performance portability. We report here on our results of using Kokkos to accelerate a real-world application on NERSC's Perlmutter Phase 1 (using NVIDIA A100 accelerators) and the testbed system for OLCF's Frontier (using AMD MI250X). By porting to Kokkos, we were able to successfully run the same X-ray tracing code on both systems and achieved speed-ups between 13% and 66% compared to the original CUDA code. These results are a highly encouraging demonstration of using Kokkos to accelerate production science code.

preprint, 2021, arXiv

Performance Optimizations of Recursive Electronic Structure Solvers targeting Multi-Core Architectures (LA-UR-20-26665)

As we rapidly approach the frontiers of ultra-large computing resources, software optimization is becoming of paramount interest to scientific application developers who want to use all available on-node computing capabilities efficiently and thereby improve a requisite science-per-watt metric. The scientific application of interest here is the Basic Math Library (BML), which provides a single interface for linear algebra operations frequently used in the Quantum Molecular Dynamics (QMD) community. A single interface implies an abstraction layer, which in turn suggests commonalities in the code base; any optimization or tuning introduced in the core of the code base can therefore improve the performance of the library as a whole. With that in mind, we survey the entire BML code base and extract common snippets of code in the form of micro-kernels. We introduce several optimization strategies into these micro-kernels: (1) strength reduction, (2) memory alignment for large arrays, (3) non-uniform memory access (NUMA)-aware allocations to enforce data locality, and (4) appropriate thread affinity and bindings to enhance overall multi-threaded performance. After introducing these optimizations, we benchmark the micro-kernels and compare the run time before and after optimization for several target architectures. Finally, we use the results as a guide for propagating the optimization strategies into the BML code base. As a demonstration, we test the efficacy of these optimization strategies by comparing the benchmark and optimized versions of the code.

preprint, 2016, arXiv

Graph-based linear scaling electronic structure theory

We show how graph theory can be combined with quantum theory to calculate the electronic structure of large complex systems. The graph formalism is general and applicable to a broad range of electronic structure methods and materials, including challenging systems such as biomolecules. The methodology combines well-controlled accuracy, low computational cost, and natural low-communication parallelism. This combination addresses substantial shortcomings of linear scaling electronic structure theory, in particular with respect to quantum-based molecular dynamics simulations.

preprint, 2016, arXiv

Quantum crystallographic charge density of urea

Standard X-ray crystallography methods use free-atom models to calculate mean unit cell charge densities. Real molecules, however, have shared charge that is not captured accurately using free-atom models. To address this limitation, a charge density model of crystalline urea was calculated using high-level quantum theory and was refined against publicly available ultra high-resolution experimental Bragg data, including the effects of atomic displacement parameters. The resulting quantum crystallographic model was compared to models obtained using spherical atom or multipole methods. Despite using only the same number of free parameters as the spherical atom model, the agreement of the quantum model with the data is comparable to the multipole model. The static, theoretical crystalline charge density of the quantum model is distinct from the multipole model, indicating the quantum model provides substantially new information. Hydrogen thermal ellipsoids in the quantum model were very similar to those obtained using neutron crystallography, indicating that quantum crystallography can increase the accuracy of the X-ray crystallographic atomic displacement parameters. The results demonstrate the feasibility and benefits of integrating fully periodic quantum charge density calculations into ultra high-resolution X-ray crystallographic model building and refinement.

preprint, 2015, arXiv

Of fishes and birthdays: Efficient estimation of polymer configurational entropies

We present an algorithm to estimate the configurational entropy $S$ of a polymer. The algorithm uses the statistics of coincidences among random samples of configurations and is related to the catch-tag-release method for estimation of population sizes, and to the classic "birthday paradox". Bias in the entropy estimation is decreased by grouping configurations in nearly equiprobable partitions based on their energies, and estimating entropies separately within each partition. Whereas most entropy estimation algorithms require $N\sim 2^{S}$ samples to achieve small bias, our approach typically needs only $N\sim \sqrt{2^{S}}$. Thus the algorithm can be applied to estimate protein free energies with increased accuracy and decreased computational cost.
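The coincidence-counting idea behind this abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes configurations are hashable labels drawn from a uniform distribution, and it omits the energy-based partitioning the abstract describes. The function names (`count_coincident_pairs`, `collision_entropy_bits`, `demo`) are hypothetical. For a uniform distribution over $2^{S}$ states, the probability that two independent samples coincide is $2^{-S}$, so the ratio of sample pairs to coincident pairs estimates $2^{S}$.

```python
import math
import random
from collections import Counter


def count_coincident_pairs(samples):
    """Count unordered pairs of samples that landed on the same configuration."""
    counts = Counter(samples)
    return sum(c * (c - 1) // 2 for c in counts.values())


def collision_entropy_bits(samples):
    """Estimate entropy in bits from the fraction of coincident sample pairs.

    Assumption for this sketch: configurations are (nearly) equiprobable,
    so the pairwise coincidence probability is about 2**(-S).
    """
    n = len(samples)
    total_pairs = n * (n - 1) // 2
    coincidences = count_coincident_pairs(samples)
    if coincidences == 0:
        raise ValueError("no coincidences observed; draw more samples")
    return math.log2(total_pairs / coincidences)


def demo(num_states=2**16, num_samples=4096, seed=0):
    """Draw uniform samples from num_states labels and estimate log2(num_states)."""
    rng = random.Random(seed)
    samples = [rng.randrange(num_states) for _ in range(num_samples)]
    return collision_entropy_bits(samples)


if __name__ == "__main__":
    print(f"estimated entropy: {demo():.2f} bits (true value: 16 bits)")
```

The demo draws 4096 samples from 65536 states, far fewer samples than states, and still expects on the order of a hundred coincident pairs. That is the birthday-paradox effect the abstract refers to. For non-uniform distributions this simple estimator measures collision (Rényi order-2) entropy, which is at most the Shannon entropy. That bias is presumably what grouping configurations into nearly equiprobable partitions is meant to reduce.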

preprint, 2009, arXiv

Model of Transcriptional Activation by MarA in Escherichia coli

We have developed a mathematical model of transcriptional activation by MarA in Escherichia coli, and used the model to analyze measurements of MarA-dependent activity of the marRAB, sodA, and micF promoters in mar-rob- cells. The model rationalizes an unexpected poor correlation between the mid-point of in vivo promoter activity profiles and in vitro equilibrium constants for MarA binding to promoter sequences. Analysis of the promoter activity data using the model yielded the following predictions regarding activation mechanisms: (1) MarA activation of the marRAB, sodA, and micF promoters involves a net acceleration of the kinetics of transitions after RNA polymerase binding, up to and including promoter escape and message elongation; (2) RNA polymerase binds to these promoters with nearly unit occupancy in the absence of MarA, making recruitment of polymerase an insignificant factor in activation of these promoters; and (3) instead of recruitment, activation of the micF promoter might involve a repulsion of polymerase combined with a large acceleration of the kinetics of polymerase activity. These predictions are consistent with published chromatin immunoprecipitation assays of interactions between polymerase and the E. coli chromosome. A lack of recruitment in transcriptional activation represents an exception to the textbook description of activation of bacterial sigma-70 promoters. However, use of accelerated polymerase kinetics instead of recruitment might confer a competitive advantage to E. coli by decreasing latency in gene regulation.