Source author record

Meifeng Lin

Meifeng Lin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Topics: hep-lat; hep-ph; hep-ex; physics.comp-ph; Distributed, Parallel, and Cluster Computing; physics.data-an; physics.ins-det; quant-ph

Catalog footprint

What is connected

16 works

8 topics

4 close collaborators

preprint, 2023, arXiv

OpenMP Advisor

With the increasing diversity of heterogeneous architectures in the HPC industry, porting a legacy application to run on different architectures is a tough challenge. In this paper, we present OpenMP Advisor, a first-of-its-kind compiler tool that enables code offloading to a GPU with OpenMP using Machine Learning. Although the tool is currently limited to GPUs, it can be extended to support other OpenMP-capable devices. The tool has two modes: Training mode and Prediction mode. The training mode must be executed on the target hardware. It takes benchmark codes as input, generates and executes every variant of the code that could possibly run on the target device, and then collects data from all of the executed codes to train an ML-based cost model for use in prediction mode. However, in prediction mode the tool does not need any interaction with the target device. It accepts a C code as input and returns the best code variant that can be used to offload the code to the specified device. The tool can determine the kernels that are best suited for offloading by predicting their runtime using a machine learning-based cost model. The main objective behind this tool is to maintain the portability aspect of OpenMP. Using our Advisor, we were able to generate code for multiple applications for seven different architectures, and correctly predict the top ten best variants for each application on every architecture. Preliminary findings indicate that this tool can assist compiler developers and HPC application researchers in porting their legacy HPC codes to the upcoming heterogeneous computing environment.
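The prediction-mode step described in this abstract, ranking candidate code variants by predicted runtime, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `Variant` record, its two features, and the toy linear `predict_runtime` function merely stand in for the tool's trained ML cost model, whose actual features and interface the abstract does not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical description of one offloading variant of a kernel."""
    name: str
    flops: float        # arithmetic work in the kernel (assumed feature)
    bytes_moved: float  # host-device data traffic (assumed feature)


def predict_runtime(v: Variant) -> float:
    """Toy stand-in for a trained cost model; the coefficients are made up."""
    return 1e-9 * v.flops + 5e-9 * v.bytes_moved


def rank_variants(variants, top_k=10):
    """Return up to top_k variants, lowest predicted runtime first."""
    return sorted(variants, key=predict_runtime)[:top_k]


# Illustrative candidates only; the names and numbers are not from the paper.
candidates = [
    Variant("target_teams_distribute", flops=2e9, bytes_moved=4e8),
    Variant("target_teams_distribute_parallel_for", flops=2e9, bytes_moved=1e8),
    Variant("host_only", flops=8e9, bytes_moved=0.0),
]
best = rank_variants(candidates)[0]
print(best.name)
```

In this sketch no device is touched at ranking time, which mirrors the abstract's statement that prediction mode needs no interaction with the target hardware.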

preprint, 2022, arXiv

DUNE Software and High Performance Computing

DUNE, like other HEP experiments, faces a challenge related to matching execution patterns of our production simulation and data processing software to the limitations imposed by modern high-performance computing facilities. In order to efficiently exploit these new architectures, particularly those with high CPU core counts and GPU accelerators, our existing software execution models require adaptation. In addition, the large size of individual units of raw data from the far detector modules poses an additional challenge somewhat unique to DUNE. Here we describe some of these problems and how we begin to solve them today with existing software frameworks and toolkits. We also describe ways we may leverage these existing software architectures to attack remaining problems going forward. This whitepaper is a contribution to the Computational Frontier of Snowmass21.

preprint, 2022, arXiv

Lattice QCD and the Computational Frontier

The search for new physics requires a joint experimental and theoretical effort. Lattice QCD is already an essential tool for obtaining precise model-free theoretical predictions of the hadronic processes underlying many key experimental searches, such as those involving heavy flavor physics, the anomalous magnetic moment of the muon, nucleon-neutrino scattering, and rare, second-order electroweak processes. As experimental measurements become more precise over the next decade, lattice QCD will play an increasing role in providing the needed matching theoretical precision. Achieving the needed precision requires simulations with lattices with substantially increased resolution. As we push to finer lattice spacing we encounter an array of new challenges. They include algorithmic and software-engineering challenges, challenges in computer technology and design, and challenges in maintaining the necessary human resources. In this white paper we describe those challenges and discuss ways they are being dealt with. Overcoming them is key to supporting the community effort required to deliver the needed theoretical support for experiments in the coming decade.

preprint, 2022, arXiv

Methods and Results for Quantum Optimal Pulse Control on Superconducting Qubit Systems

The effective use of current Noisy Intermediate-Scale Quantum (NISQ) devices is often limited by the noise which is caused by interaction with the environment and affects the fidelity of quantum gates. In transmon qubit systems, the quantum gate fidelity can be improved by applying control pulses that can minimize the effects of the environmental noise. In this work, we employ physics-guided quantum optimal control strategies to design optimal pulses driving quantum gates on superconducting qubit systems. We test our results by conducting experiments on the IBM Q hardware using their OpenPulse API. We compare the performance of our pulse-optimized quantum gates against the default quantum gates and show that the optimized pulses improve the fidelity of the quantum gates, in particular the single-qubit gates. We discuss the challenges we encountered in our work and point to possible future improvements.

preprint, 2022, arXiv

Portability: A Necessary Approach for Future Scientific Software

Today's world of scientific software for High Energy Physics (HEP) is powered by x86 code, while the future will be much more reliant on accelerators like GPUs and FPGAs. The portable parallelization strategies (PPS) project of the High Energy Physics Center for Computational Excellence (HEP/CCE) is investigating solutions for portability techniques that will allow the coding of an algorithm once, and the ability to execute it on a variety of hardware products from many vendors, especially including accelerators. We think without these solutions, the scientific success of our experiments and endeavors is in danger, as software development could be expert driven and costly to be able to run on available hardware infrastructure. We think the best solution for the community would be an extension to the C++ standard with a very low entry bar for users, supporting all hardware forms and vendors. We are very far from that ideal though. We argue that in the future, as a community, we need to request and work on portability solutions and strive to reach this ideal.

preprint, 2022, arXiv

Solving Simulation Systematics in and with AI/ML

Training an AI/ML system on simulated data while using that system to infer on data from real detectors introduces a systematic error which is difficult to estimate and in many analyses is simply not confronted. It is crucial to minimize and to quantitatively estimate the uncertainties in such analysis and do so with a precision and accuracy that matches those that AI/ML techniques bring. Here we highlight the need to confront this class of systematic error, discuss conventional ways to estimate it and describe ways to quantify and to minimize the uncertainty using methods which are themselves based on the power of AI/ML. We also describe methods to introduce a simulation into an AI/ML network to allow for training of its semantically meaningful parameters. This whitepaper is a contribution to the Computational Frontier of Snowmass21.

preprint, 2021, arXiv

Porting HEP Parameterized Calorimeter Simulation Code to GPUs

The High Energy Physics (HEP) experiments, such as those at the Large Hadron Collider (LHC), traditionally consume large amounts of CPU cycles for detector simulations and data analysis, but rarely use compute accelerators such as GPUs. As the LHC is upgraded to allow for higher luminosity, resulting in much higher data rates, purely relying on CPUs may not provide enough computing power to support the simulation and data analysis needs. As a proof of concept, we investigate the feasibility of porting a HEP parameterized calorimeter simulation code to GPUs. We have chosen to use FastCaloSim, the ATLAS fast parametrized calorimeter simulation. While FastCaloSim is sufficiently fast such that it does not impose a bottleneck in detector simulations overall, significant speed-ups in the processing of large samples can be achieved from GPU parallelization at both the particle (intra-event) and event levels; this is especially beneficial in conditions expected at the high-luminosity LHC, where extremely high per-event particle multiplicities will result from the many simultaneous proton-proton collisions. We report our experience with porting FastCaloSim to NVIDIA GPUs using CUDA. A preliminary Kokkos implementation of FastCaloSim for portability to other parallel architectures is also described.

preprint, 2020, arXiv

Nucleon mass and isovector couplings in 2+1-flavor dynamical domain-wall lattice QCD near physical mass

We report nucleon mass, isovector vector and axial-vector charges, and tensor and scalar couplings, calculated using two recent 2+1-flavor dynamical domain-wall fermions lattice-QCD ensembles generated jointly by the RIKEN-BNL-Columbia and UKQCD collaborations. These ensembles were generated with Iwasaki $\times$ dislocation-suppressing-determinant-ratio gauge action at inverse lattice spacing of 1.378(7) GeV and pion mass values of 249.4(3) and 172.3(3) MeV. The nucleon mass extrapolates to a value $m_N = 0.950(5)$ GeV at the physical point. The isovector vector charge renormalizes to unity in the chiral limit, narrowly constraining excited-state contamination in the calculation. The ratio of the isovector axial-vector to vector charges shows a deficit of about ten percent. The tensor coupling no longer depends on mass and extrapolates to 1.04(5) in $\overline {\rm MS}$ 2-GeV renormalization at the physical point, in good agreement with the value obtained at the lightest mass in our previous calculations and other calculations that followed. The scalar charge, though noisier, does not show mass dependence and is in agreement with other calculations.

preprint, 2015, arXiv

Optimizing the domain wall fermion Dirac operator using the R-Stream source-to-source compiler

The application of the Dirac operator on a spinor field, the Dslash operation, is the most computation-intensive part of lattice QCD simulations. It is often the key kernel to optimize to achieve maximum performance on various platforms. Here we report on a project to optimize the domain wall fermion Dirac operator in the Columbia Physics System (CPS) using the R-Stream source-to-source compiler. Our initial target platform is Intel PC clusters. We discuss the optimization strategies involved before and after the automatic code generation with R-Stream and present some preliminary benchmark results.

preprint, 2014, arXiv

Composite bosonic baryon dark matter on the lattice: SU(4) baryon spectrum and the effective Higgs interaction

We present the spectrum of baryons in a new SU(4) gauge theory with fundamental fermion constituents. The spectrum of these bosonic baryons is of significant interest for composite dark matter theories. Here, we compare the spectrum and properties of SU(3) and SU(4) baryons, and then compute the dark-matter direct detection cross section via Higgs boson exchange for TeV-scale composite dark matter arising from a confining SU(4) gauge sector. Comparison with the latest LUX results leads to tight bounds on the fraction of the constituent-fermion mass that may arise from electroweak symmetry breaking. Lattice calculations of the dark matter mass spectrum and the Higgs-dark matter coupling are performed on quenched $16^{3} \times 32$, $32^{3} \times 64$, $48^{3} \times 96$, and $64^{3} \times128$ lattices with three different lattice spacings, using Wilson fermions with moderate to heavy pseudoscalar meson masses. Our results lay a foundation for future analytic and numerical study of composite baryonic dark matter.

preprint, 2014, arXiv

Nucleon Form Factors with 2+1 Flavors of Domain Wall Fermions and All-Mode-Averaging

We report recent progress in the calculations of the isovector nucleon electromagnetic form factors using 2+1 flavors of domain wall fermions at pion masses of 170 MeV and 250 MeV. The lattice size is fixed at $32^3\times64$ with a lattice cutoff scale of 1.37(1) GeV. For the calculations with $M_\pi = 170$ MeV, we employed the All-Mode-Averaging (AMA) technique, which led to roughly a factor of 20 improvement in computational efficiency and has reduced the statistical errors in our results significantly. We were also able to do calculations at two different source-sink separations, at roughly 1.3 fm and 1.0 fm, without much additional cost by reusing the low eigenmodes stored for the AMA calculations. We will present results for the isovector form factors and their derived quantities, including the Dirac and Pauli radii and the anomalous magnetic moment, and discuss the effects of possible excited-state contaminations. Connected contributions to the isoscalar Dirac and Pauli form factors will also be shown.
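The All-Mode-Averaging estimator mentioned in this abstract is commonly written as an exact-minus-approximate correction measured on a few source positions plus the average of the cheap approximate measurement over many positions, $O^{\rm AMA} = \langle O^{\rm exact} - O^{\rm approx} \rangle_{\rm few} + \langle O^{\rm approx} \rangle_{\rm many}$. The Python sketch below shows only this bookkeeping, with made-up numbers; it is not the authors' code, and the function and variable names are illustrative assumptions.

```python
from statistics import fmean


def ama_estimate(exact, approx_paired, approx_all):
    """Combine a few exact measurements with many cheap approximate ones.

    exact         : exact measurements on a few source positions
    approx_paired : approximate measurements on those same positions
    approx_all    : approximate measurements on all source positions
    """
    if len(exact) != len(approx_paired):
        raise ValueError("exact and approx_paired must be paired one-to-one")
    # Bias correction, measured only where an exact solve is available.
    correction = fmean([e - a for e, a in zip(exact, approx_paired)])
    # Most of the statistics come from the cheap approximate measurements.
    return correction + fmean(approx_all)


# Synthetic illustration values (not lattice data).
exact = [1.0, 1.2]
approx_paired = [0.9, 1.1]
approx_all = [0.9, 1.1, 1.0, 1.0]
estimate = ama_estimate(exact, approx_paired, approx_all)
print(round(estimate, 6))
```

The gain comes from the approximate measurements being far cheaper than exact ones while remaining strongly correlated with them, so the correction term stays small and has low variance.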

preprint, 2012, arXiv

Approaching Conformality with Ten Flavors

We present first results for lattice simulations, on a single volume, of the low-lying spectrum of an SU(3) Yang-Mills gauge theory with ten light fermions in the fundamental representation. Fits to the fermion mass dependence of various observables are found to be globally consistent with the hypothesis that this theory is within or just outside the strongly-coupled edge of the conformal window, with mass anomalous dimension consistent with 1 over the range of scales simulated. We stress that we cannot rule out the possibility of spontaneous chiral-symmetry breaking at scales well below our infrared cutoff. We discuss important systematic effects, including finite-volume corrections, and consider directions for future improvement.

preprint, 2012, arXiv

WW Scattering Parameters via Pseudoscalar Phase Shifts

Using domain-wall lattice simulations, we study pseudoscalar-pseudoscalar scattering in the maximal isospin channel for an SU(3) gauge theory with two and six fermion flavors in the fundamental representation. This calculation of the S-wave scattering length is related to the next-to-leading order corrections to WW scattering through the low-energy coefficients of the chiral Lagrangian. While two and six flavor scattering lengths are similar for a fixed ratio of the pseudoscalar mass to its decay constant, six-flavor scattering shows a somewhat less repulsive next-to-leading order interaction than its two-flavor counterpart. Estimates are made for the WW scattering parameters and the plausibility of detection is discussed.

preprint, 2011, arXiv

Lattice Simulations and Infrared Conformality

We examine several recent lattice-simulation data sets, asking whether they are consistent with infrared conformality. We observe, in particular, that for an SU(3) gauge theory with 12 Dirac fermions in the fundamental representation, recent simulation data can be described assuming infrared conformality. Lattice simulations include a fermion mass m which is then extrapolated to zero, and we note that this data can be fit by a small-m expansion, allowing a controlled extrapolation. We also note that the conformal hypothesis does not work well for two theories that are known or expected to be confining and chirally broken, and that it does work well for another theory expected to be infrared conformal.

preprint, 2010, arXiv

Parity Doubling and the S Parameter Below the Conformal Window

We describe a lattice simulation of the masses and decay constants of the lowest-lying vector and axial resonances, and the electroweak S parameter, in an SU(3) gauge theory with $N_f = 2$ and 6 fermions in the fundamental representation. The spectrum becomes more parity doubled and the S parameter per electroweak doublet decreases when $N_f$ is increased from 2 to 6, motivating study of these trends as $N_f$ is increased further, toward the critical value for transition from confinement to infrared conformality.

preprint, 2005, arXiv

Probing the chiral limit of M_pi and f_pi in 2+1 flavor QCD with domain wall fermions from QCDOC

We present results for the pseudoscalar meson masses and decay constants on 2+1 flavor DWF configurations with different sea quark masses and an inverse lattice spacing of 1.6(1) GeV, with a focus on chiral fits at small quark masses. The calculation is done on $16^3 \times 32 \times 8$ lattices generated with the DBW2 gauge action.
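For orientation, an inverse lattice spacing quoted in GeV converts to a spacing in femtometres through $a = \hbar c / a^{-1}$, with $\hbar c \approx 0.1973$ GeV fm. The short Python sketch below applies this to the central value of 1.6 GeV and the spatial extent of 16 sites quoted in this abstract. The function names are illustrative, and the quoted uncertainty on the inverse spacing is ignored.

```python
HBAR_C_GEV_FM = 0.1973269804  # hbar * c in GeV fm


def lattice_spacing_fm(inverse_spacing_gev):
    """Convert an inverse lattice spacing in GeV to a spacing in fm."""
    return HBAR_C_GEV_FM / inverse_spacing_gev


def box_length_fm(sites, inverse_spacing_gev):
    """Physical extent of one lattice direction with the given site count."""
    return sites * lattice_spacing_fm(inverse_spacing_gev)


a = lattice_spacing_fm(1.6)   # central value from the abstract
L = box_length_fm(16, 1.6)    # spatial extent of the 16^3 x 32 lattice
print(f"a = {a:.3f} fm, L = {L:.2f} fm")
```

With these inputs the spacing comes out at roughly 0.12 fm and the spatial box at roughly 2 fm.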
