Source author record

Alan Gray

Alan Gray appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

hep-lat; hep-ph; Distributed, Parallel, and Cluster Computing; Hardware Architecture; physics.comp-ph

Catalog footprint

What is connected

6 works

5 topics

4 close collaborators

preprint | 2016 | arXiv

A Lightweight Approach to Performance Portability with targetDP

Leading HPC systems achieve their status through use of highly parallel devices such as NVIDIA GPUs or Intel Xeon Phi many-core CPUs. The concept of performance portability across such architectures, as well as traditional CPUs, is vital for the application programmer. In this paper we describe targetDP, a lightweight abstraction layer which allows grid-based applications to target data parallel hardware in a platform agnostic manner. We demonstrate the effectiveness of our pragmatic approach by presenting performance results for a complex fluid application (with which the model was co-designed), plus a separate lattice QCD particle physics code. For each application, a single source code base is seen to achieve portable performance, as assessed within the context of the Roofline model. TargetDP can be combined with MPI to allow use on systems containing multiple nodes: we demonstrate this through provision of scaling results on traditional and GPU-accelerated large scale supercomputers.
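The Roofline model named in this abstract bounds attainable performance by the lesser of peak compute throughput and memory bandwidth multiplied by arithmetic intensity. The following is a minimal sketch of that bound only; the function name and the device figures are hypothetical illustrations, not values from the paper.

```python
def roofline_gflops(peak_gflops, bandwidth_gb_per_s, flops_per_byte):
    """Attainable GFLOP/s under the Roofline model.

    flops_per_byte is the kernel's arithmetic intensity.
    """
    memory_bound = bandwidth_gb_per_s * flops_per_byte
    return min(peak_gflops, memory_bound)


# Hypothetical device: 1000 GFLOP/s peak, 200 GB/s memory bandwidth.
low = roofline_gflops(1000.0, 200.0, 0.5)    # low intensity: bandwidth-limited
high = roofline_gflops(1000.0, 200.0, 10.0)  # high intensity: compute-limited
ridge = 1000.0 / 200.0                       # intensity where the two limits meet
```

A kernel whose measured performance sits close to this bound on each architecture is what "portable performance" means in the Roofline sense.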

preprint | 2014 | arXiv

targetDP: an Abstraction of Lattice Based Parallelism with Portable Performance

To achieve high performance on modern computers, it is vital to map algorithmic parallelism to that inherent in the hardware. From an application developer's perspective, it is also important that code can be maintained in a portable manner across a range of hardware. Here we present targetDP (target Data Parallel), a lightweight programming layer that allows the abstraction of data parallelism for applications that employ structured grids. A single source code may be used to target both thread level parallelism (TLP) and instruction level parallelism (ILP) on either SIMD multi-core CPUs or GPU-accelerated platforms. targetDP is implemented via standard C preprocessor macros and library functions, can be added to existing applications incrementally, and can be combined with higher-level paradigms such as MPI. We present CPU and GPU performance results for a benchmark taken from the lattice Boltzmann application that motivated this work. These demonstrate not only performance portability, but also the optimisation resulting from the intelligent exposure of ILP.
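One way to picture the TLP/ILP split described in this abstract is strip-mining: the loop over lattice sites becomes an outer loop over chunks and a short inner loop within each chunk. The sketch below shows only that loop structure in Python. It is not targetDP's API, which the abstract says is built from C preprocessor macros and library functions, and the names `VVL`, `scale_sites_plain` and `scale_sites_strip_mined` are hypothetical.

```python
VVL = 4  # hypothetical chunk length: sites handled per inner loop


def scale_sites_plain(field, a):
    """Reference version: one flat loop over all lattice sites."""
    return [a * x for x in field]


def scale_sites_strip_mined(field, a, vvl=VVL):
    """Same operation with the site loop split into two levels.

    The outer loop over chunks is the part that could be shared among
    threads (TLP); the short inner loop is the part a compiler could
    map onto vector instructions (ILP).
    """
    n = len(field)
    out = [0.0] * n
    for base in range(0, n, vvl):                # outer: one chunk per step
        for lane in range(min(vvl, n - base)):   # inner: at most vvl sites
            out[base + lane] = a * field[base + lane]
    return out


field = [float(i) for i in range(10)]
result = scale_sites_strip_mined(field, 2.0)
```

Both versions produce identical output; the split changes only how the work is exposed to the hardware.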

preprint | 2007 | arXiv

B Meson Semileptonic Form Factors from Unquenched Lattice QCD

The semileptonic process $B \to \pi l \nu$ is studied via full QCD lattice simulations. We use unquenched gauge configurations generated by the MILC collaboration. These include the effect of vacuum polarization from three quark flavors: the $s$ quark and two very light flavors ($u/d$) of variable mass allowing extrapolations to the physical chiral limit. We employ Nonrelativistic QCD to simulate the $b$ quark and a highly improved staggered quark action for the light sea and valence quarks. We calculate the form factors $f_+(q^2)$ and $f_0(q^2)$ in the chiral limit for the range 16 GeV$^2 \leq q^2 < q^2_{max}$ and obtain $\int^{q^2_{max}}_{16 GeV^2} [dΓ/dq^2] dq^2 / |V_{ub}|^2 = 1.46(35) ps^{-1}$. Combining this with a preliminary average by the Heavy Flavor Averaging Group (HFAG'05) of recent branching fraction data for exclusive B semileptonic decays from the BaBar, Belle and CLEO collaborations, leads to $|V_{ub}| = 4.22(30)(51) \times 10^{-3}$.

PLEASE NOTE APPENDIX B with an ERRATUM, to appear in Physical Review D, to the published version of this e-print (Phys.Rev.D 73, 074502 (2006)). Results for the form factor $f_+(q^2)$ in the chiral limit have changed significantly. The last two sentences in this abstract should now read: "We calculate the form factors $f_+(q^2)$ and $f_0(q^2)$ in the chiral limit for the range 16 GeV$^2 \leq q^2 < q^2_{max}$ and obtain $\int^{q^2_{max}}_{16 GeV^2} [dΓ/dq^2] dq^2 / |V_{ub}|^2 = 2.07(57) ps^{-1}$. Combining this with a preliminary average by the Heavy Flavor Averaging Group (HFAG'05) of recent branching fraction data for exclusive B semileptonic decays from the BaBar, Belle and CLEO collaborations, leads to $|V_{ub}| = 3.55(25)(50) \times 10^{-3}$."
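The revised $|V_{ub}|$ can be cross-checked against the original value. Assuming the same measured branching fraction and B lifetime enter both determinations (an assumption; the abstract does not list those inputs), $|V_{ub}|$ scales as the inverse square root of the integrated rate divided by $|V_{ub}|^2$. The helper name below is hypothetical.

```python
import math


def rescale_vub(vub_old, integral_old, integral_new):
    """Rescale |V_ub| when only the lattice integral changes.

    Assumes |V_ub|^2 * integral is held fixed by the experimental input.
    """
    return vub_old * math.sqrt(integral_old / integral_new)


# Central value and first quoted uncertainty, both in units of 1e-3.
vub_new = rescale_vub(4.22, 1.46, 2.07)
err_new = rescale_vub(0.30, 1.46, 2.07)
```

Under that assumption the central value comes out at roughly 3.54 and the first uncertainty at roughly 0.25, consistent within rounding with the erratum's $3.55(25)$.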

preprint | 2007 | arXiv

picoArray Technology: The Tool's Story

This paper briefly describes the picoArray architecture, and in particular the deterministic internal communication fabric. The methods that have been developed for debugging and verifying systems using devices from the picoArray family are explained. In order to maximize the computational ability of these devices, hardware debugging support has been kept to a minimum, and the methods and tools have been developed to take this into account.

preprint | 2005 | arXiv

The B Meson Decay Constant from Unquenched Lattice QCD

We present determinations of the B meson decay constant f_B and of the ratio f_{B_s}/f_B using the MILC collaboration unquenched gauge configurations which include three flavors of light sea quarks. The mass of one of the sea quarks is kept around the strange quark mass, and we explore a range in masses for the two lighter sea quarks down to m_s/8. The heavy b quark is simulated using Nonrelativistic QCD, and both the valence and sea light quarks are represented by the highly improved (AsqTad) staggered quark action. The good chiral properties of the latter action allow for a much smoother chiral extrapolation to physical up and down quarks than has been possible in the past. We find f_B = 216(9)(19)(4)(6) MeV and f_{B_s}/f_B = 1.20(3)(1).

preprint | 2003 | arXiv

The B_s and D_s decay constants in 3 flavor lattice QCD

Capitalizing on recent advances in lattice QCD, we present a calculation of the leptonic decay constants f_{B_s} and f_{D_s} that includes effects of one strange sea quark and two light sea quarks. The discretization errors of improved staggered fermion actions are small enough to simulate with 3 dynamical flavors on lattices with spacings around 0.1 fm using present computer resources. By shedding the quenched approximation and the associated lattice scale ambiguity, lattice QCD greatly increases its predictive power. NRQCD is used to simulate heavy quarks with masses between 1.5 m_c and m_b. We arrive at the following results: f_{B_s} = 260 \pm 7 \pm 26 \pm 8 \pm 5 MeV and f_{D_s} = 290 \pm 20 \pm 29 \pm 29 \pm 6 MeV. The first quoted error is the statistical uncertainty, and the rest estimate the sizes of higher order terms neglected in this calculation. All of these uncertainties are systematically improvable by including another order in the weak coupling expansion, the nonrelativistic expansion, or the Symanzik improvement program.
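To get a single overall figure from the four quoted uncertainties, a common convention is to add them in quadrature. The abstract does not say the authors do this, and the convention assumes the error sources are independent, so the sketch below is illustrative only; the function name is hypothetical.

```python
import math


def combine_in_quadrature(errors):
    """Square root of the sum of squares of independent uncertainties."""
    return math.sqrt(sum(e * e for e in errors))


# Uncertainties in MeV, in the order quoted in the abstract.
f_bs_total = combine_in_quadrature([7, 26, 8, 5])     # for f_{B_s} = 260 MeV
f_ds_total = combine_in_quadrature([20, 29, 29, 6])   # for f_{D_s} = 290 MeV
```

On that convention the totals are roughly 29 MeV and 46 MeV, each dominated by the largest systematic term rather than by the statistical error.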