Source author record

Alan R. Bishop

Alan R. Bishop appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Distributed, Parallel, and Cluster Computing; Mathematical Software; Performance; quant-ph; cond-mat.str-el; physics.comp-ph; Biological Physics; cond-mat; cond-mat.soft; math-ph; math.MP; nlin.CD; nlin.PS; Numerical Analysis

Catalog footprint

What is connected

12 works

14 topics

4 close collaborators

preprint, 2022, arXiv

Patterns and Stability of Coupled Multi-Stable Nonlinear Oscillators

Nonlinear isolated and coupled oscillators are extensively studied as prototypical nonlinear dynamics models. Much attention has been devoted to oscillator synchronization or the lack thereof. Here, we study the synchronization and stability of coupled driven-damped Helmholtz-Duffing oscillators in bi-stability regimes. We find that despite the fact that the system parameters and the driving force are identical, the stability of the two states to spatially non-uniform perturbations is very different. Moreover, the final stable states, resulting from these spatial perturbations, are not solely dictated by the wavelength of the perturbing mode and take different spatial configurations in terms of the coupled oscillator phases.

preprint, 2020, arXiv

Understanding HPC Benchmark Performance on Intel Broadwell and Cascade Lake Processors

Hardware platforms in high performance computing are constantly getting more complex to handle even when considering multicore CPUs alone. Numerous features and configuration options in the hardware and the software environment that are relevant for performance are not even known to most application users or developers. Microbenchmarks, i.e., simple codes that fathom a particular aspect of the hardware, can help to shed light on such issues, but only if they are well understood and if the results can be reconciled with known facts or performance models. The insight gained from microbenchmarks may then be applied to real applications for performance analysis or optimization. In this paper we investigate two modern Intel x86 server CPU architectures in depth: Broadwell EP and Cascade Lake SP. We highlight relevant hardware configuration settings that can have a decisive impact on code performance and show how to properly measure on-chip and off-chip data transfer bandwidths. The new victim L3 cache of Cascade Lake and its advanced replacement policy receive due attention. Finally we use DGEMM, sparse matrix-vector multiplication, and the HPCG benchmark to make a connection to relevant application scenarios.

preprint, 2018, arXiv

Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs

Chebyshev filter diagonalization is well established in quantum chemistry and quantum physics to compute bulks of eigenvalues of large sparse matrices. Choosing a block vector implementation, we investigate optimization opportunities on the new class of high-performance compute devices featuring both high-bandwidth and low-bandwidth memory. We focus on the transparent access to the full address space supported by both architectures under consideration: Intel Xeon Phi "Knights Landing" and Nvidia "Pascal." We propose two optimizations: (1) Subspace blocking is applied for improved performance and data access efficiency. We also show that it allows transparently handling problems much larger than the high-bandwidth memory without significant performance penalties. (2) Pipelining of communication and computation phases of successive subspaces is implemented to hide communication costs without extra memory traffic. As an application scenario we use filter diagonalization studies on topological insulator materials. Performance numbers on up to 512 nodes of the OakForest-PACS and Piz Daint supercomputers are presented, achieving beyond 100 Tflop/s for computing 100 inner eigenvalues of sparse matrices of dimension one billion.
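The core of a Chebyshev filter can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes the matrix spectrum has already been scaled into [-1, 1], uses the plain Chebyshev series of an indicator window without kernel damping, and leaves out the subspace blocking and pipelining described in the abstract. The function names `window_coefficients` and `chebyshev_filter` are hypothetical.

```python
import math

def window_coefficients(a, b, degree):
    """Chebyshev coefficients of the indicator function of [a, b], -1 < a < b < 1."""
    ta, tb = math.acos(a), math.acos(b)   # acos is decreasing, so ta > tb
    coeffs = [(ta - tb) / math.pi]
    for k in range(1, degree + 1):
        coeffs.append(2.0 * (math.sin(k * ta) - math.sin(k * tb)) / (k * math.pi))
    return coeffs

def chebyshev_filter(matvec, v, coeffs):
    """Return sum_k coeffs[k] * T_k(H) v using only matrix-vector products.

    Assumes len(coeffs) >= 2 and that the spectrum of H lies in [-1, 1].
    """
    t_prev = list(v)          # T_0(H) v = v
    t_curr = matvec(v)        # T_1(H) v = H v
    out = [coeffs[0] * p + coeffs[1] * c for p, c in zip(t_prev, t_curr)]
    for ck in coeffs[2:]:
        hv = matvec(t_curr)
        # Three-term recurrence: T_{k+1}(H) v = 2 H T_k(H) v - T_{k-1}(H) v
        t_next = [2.0 * h - p for h, p in zip(hv, t_prev)]
        out = [o + ck * t for o, t in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out

# Toy example: a diagonal "matrix" whose eigenvalues are known in advance.
eigs = [-0.8, -0.3, 0.0, 0.4, 0.9]
matvec = lambda v: [e * x for e, x in zip(eigs, v)]
filtered = chebyshev_filter(matvec, [1.0] * 5, window_coefficients(-0.15, 0.15, 200))
print(filtered)  # the component at eigenvalue 0.0 stays near 1, the others are damped toward 0
```

Because the filter touches the matrix only through `matvec`, its cost is dominated by sparse matrix-vector products, which is why block-vector implementations and data-access optimizations matter for this method.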

preprint, 2014, arXiv

A unified sparse matrix data format for efficient general sparse matrix-vector multiply on modern processors with wide SIMD units

Sparse matrix-vector multiplication (spMVM) is the most time-consuming kernel in many numerical algorithms and has been studied extensively on all modern processor and accelerator architectures. However, the optimal sparse matrix data storage format is highly hardware-specific, which could become an obstacle when using heterogeneous systems. Also, it is as yet unclear how the wide single instruction multiple data (SIMD) units in current multi- and many-core processors should be used most efficiently if there is no structure in the sparsity pattern of the matrix. We suggest SELL-C-sigma, a variant of Sliced ELLPACK, as a SIMD-friendly data format which combines long-standing ideas from General Purpose Graphics Processing Units (GPGPUs) and vector computer programming. We discuss the advantages of SELL-C-sigma compared to established formats like Compressed Row Storage (CRS) and ELLPACK and show its suitability on a variety of hardware platforms (Intel Sandy Bridge, Intel Xeon Phi and Nvidia Tesla K20) for a wide range of test matrices from different application areas. Using appropriate performance models we develop deep insight into the data transfer properties of the SELL-C-sigma spMVM kernel. SELL-C-sigma comes with two tuning parameters whose performance impact across the range of test matrices is studied and for which reasonable choices are proposed. This leads to a hardware-independent ("catch-all") sparse matrix format, which achieves very high efficiency for all test matrices across all hardware platforms.
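To make the roles of the two tuning parameters concrete, here is a small pure-Python sketch of a SELL-C-sigma-style layout and the matching spMVM loop. It is an illustration under stated assumptions rather than the paper's kernel: rows are sorted by length inside windows of `sigma` rows, grouped into chunks of `C` rows, padded to the longest row of each chunk, and stored column-major within the chunk. The function names, the dictionary layout, and the choice of column 0 with value 0.0 for padding are all hypothetical.

```python
def to_sell_c_sigma(rows, C=2, sigma=4):
    """Build a SELL-C-sigma-style layout.

    rows: list of rows, each a list of (column, value) pairs.
    C: chunk height in rows; sigma: sorting window in rows.
    """
    n = len(rows)
    # Sort rows by descending length, but only inside windows of sigma rows.
    perm = []
    for start in range(0, n, sigma):
        window = sorted(range(start, min(start + sigma, n)),
                        key=lambda r: len(rows[r]), reverse=True)
        perm.extend(window)
    cols, vals, chunk_ptr, chunk_len = [], [], [0], []
    for start in range(0, n, C):
        chunk = perm[start:start + C]
        width = max(len(rows[r]) for r in chunk)   # pad to longest row in chunk
        chunk_len.append(width)
        for j in range(width):                     # column-major inside the chunk
            for lane in range(C):
                if lane < len(chunk) and j < len(rows[chunk[lane]]):
                    c, v = rows[chunk[lane]][j]
                else:
                    c, v = 0, 0.0                  # padding entry (assumed convention)
                cols.append(c)
                vals.append(v)
        chunk_ptr.append(len(vals))
    return {"C": C, "perm": perm, "cols": cols, "vals": vals,
            "chunk_ptr": chunk_ptr, "chunk_len": chunk_len}

def spmv(m, x):
    """Compute y = A x; the lane loop is what a SIMD unit would run in lockstep."""
    C, perm = m["C"], m["perm"]
    y = [0.0] * len(perm)
    for k, width in enumerate(m["chunk_len"]):
        base = m["chunk_ptr"][k]
        for lane in range(C):
            pos = k * C + lane
            if pos >= len(perm):
                break                              # last chunk may be partially filled
            acc = 0.0
            for j in range(width):
                idx = base + j * C + lane
                acc += m["vals"][idx] * x[m["cols"][idx]]
            y[perm[pos]] = acc                     # undo the row permutation
    return y

rows = [
    [(0, 2.0)],
    [(0, 1.0), (1, 3.0), (4, 1.0)],
    [(2, 4.0)],
    [(1, 1.0), (3, 5.0)],
    [],
]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
m = to_sell_c_sigma(rows, C=2, sigma=4)
y = spmv(m, x)
print(y)               # → [2.0, 12.0, 12.0, 22.0, 0.0]
print(len(m["vals"]))  # → 8 stored entries for 7 nonzeros
```

In this sketch a larger `sigma` groups rows of similar length into the same chunk and so reduces padding, while `C` sets how many rows are processed together.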

preprint, 2013, arXiv

Non-Hermitian Quantum Annealing in the Antiferromagnetic Ising Chain

A non-Hermitian quantum optimization algorithm is created and used to find the ground state of an antiferromagnetic Ising chain. We demonstrate analytically and numerically (for up to N=1024 spins) that our approach significantly reduces the annealing time, which becomes proportional to $\ln N$, much less than the time (proportional to $N^2$) required for quantum annealing based on the corresponding Hermitian algorithm. We propose to use this approach to achieve a similar speed-up for NP-complete problems by using classical computers in combination with quantum algorithms.

preprint, 2012, arXiv

Non-Hermitian approach for modeling of noise-assisted quantum electron transfer in photosynthetic complexes

We model the quantum electron transfer (ET) in the photosynthetic reaction center (RC), using a non-Hermitian Hamiltonian approach. Our model includes (i) two protein cofactors, donor and acceptor, with discrete energy levels and (ii) a third protein pigment (sink) which has a continuous energy spectrum. Interactions are introduced between the donor and acceptor, and between the acceptor and the sink, with noise acting between the donor and acceptor. The noise is considered classically (as an external random force), and it is described by an ensemble of two-level systems (random fluctuators). Each fluctuator has two independent parameters, an amplitude and a switching rate. We represent the noise by a set of fluctuators with fitting parameters (boundaries of switching rates), which allows us to build a desired spectral density of noise in a wide range of frequencies. We analyze the quantum dynamics and the efficiency of the ET as a function of (i) the energy gap between the donor and acceptor, (ii) the strength of the interaction with the continuum, and (iii) noise parameters. As an example, numerical results are presented for the ET through the active pathway in a quinone-type photosystem II RC.

preprint, 2012, arXiv

Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation

Sparse matrix-vector multiplication (spMVM) is the dominant operation in many sparse solvers. We investigate performance properties of spMVM with matrices of various sparsity patterns on the nVidia "Fermi" class of GPGPUs. A new "padded jagged diagonals storage" (pJDS) format is proposed which may substantially reduce the memory overhead intrinsic to the widespread ELLPACK-R scheme. In our test scenarios the pJDS format cuts the overall spMVM memory footprint on the GPGPU by up to 70%, and achieves 95% to 130% of the ELLPACK-R performance. Using a suitable performance model we identify performance bottlenecks on the node level that invalidate some types of matrix structures for efficient multi-GPGPU parallelization. For appropriate sparsity patterns we extend previous work on distributed-memory parallel spMVM to demonstrate a scalable hybrid MPI-GPGPU code, achieving efficient overlap of communication and computation.

preprint, 2010, arXiv

Non-Hermitian description of a superconducting phase qubit measurement

We present an approach based on a non-Hermitian Hamiltonian to describe the process of measurement by tunneling of a phase qubit state. We derive simple analytical expressions which describe the dynamics of measurement, and compare our results with those experimentally available.

preprint, 2010, arXiv

Stability of Quantum Critical Points in the Presence of Competing Orders

We investigate the stability of Quantum Critical Points (QCPs) in the presence of two competing phases. These phases near QCPs are assumed to be either classical or quantum and to interact repulsively via square-square interactions. We find that, for any dynamical exponents and any dimensionality, a sufficiently strong interaction renders QCPs unstable and drives the transitions to become first order. We propose that this instability and the onset of first-order transitions lead to spatially inhomogeneous states in practical materials near putative QCPs. Our analysis also leads us to suggest that there is a breakdown of Conformal Field Theory (CFT) scaling in the Anti de Sitter models, and that these models in fact contain first-order transitions in the strong coupling limit.

preprint, 2009, arXiv

Numerical approaches to time evolution of complex quantum systems

We examine several numerical techniques for the calculation of the dynamics of quantum systems. In particular, we single out an iterative method which is based on expanding the time evolution operator into a finite series of Chebyshev polynomials. The Chebyshev approach benefits from two advantages over the standard time-integration Crank-Nicolson scheme: speedup and efficiency. Potential competitors are semiclassical methods such as the Wigner-Moyal or quantum tomographic approaches. We outline the basic concepts of these techniques and benchmark their performance against the Chebyshev approach by monitoring the time evolution of a Gaussian wave packet in restricted one-dimensional (1D) geometries. Thereby the focus is on tunnelling processes and the motion in anharmonic potentials. Finally we apply the prominent Chebyshev technique to two highly non-trivial problems of current interest: (i) the injection of a particle in a disordered 2D graphene nanoribbon and (ii) the spatiotemporal evolution of polaron states in finite quantum systems. Here, depending on the disorder/electron-phonon coupling strength and the device dimensions, we observe transmission or localisation of the matter wave.
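For orientation, the textbook form of such a Chebyshev expansion is given below. The abstract does not spell it out, so the notation is introduced here as an assumption: $\hbar = 1$, the spectrum of $H$ lies in $[E_{\min}, E_{\max}]$, $a = (E_{\max} - E_{\min})/2$, $b = (E_{\max} + E_{\min})/2$, $\tilde{H} = (H - b)/a$, $J_k$ is the Bessel function of the first kind, and $M$ is the truncation order.

$$
U(t) = e^{-iHt} \approx e^{-ibt} \left[ J_0(at) + 2 \sum_{k=1}^{M} (-i)^k J_k(at)\, T_k(\tilde{H}) \right],
\qquad
T_{k+1}(\tilde{H}) = 2 \tilde{H}\, T_k(\tilde{H}) - T_{k-1}(\tilde{H}),
$$

with $T_0(\tilde{H}) = 1$ and $T_1(\tilde{H}) = \tilde{H}$. Since $J_k(at)$ decays rapidly once $k$ exceeds $at$, a truncation order $M$ somewhat larger than $at$ is typically sufficient, and each term costs one application of $\tilde{H}$ to a state vector.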

preprint, 2002, arXiv

Glassy behavior in systems with Kac-type step-function interaction

We study a system with a weak, long-range repulsive Kac-type step-function interaction within the framework of a replicated effective $\phi^4$-theory. The occurrence of extensive configurational entropy, or an exponentially large number of metastable minima in the free energy (characteristic of a glassy state), is demonstrated. The underlying mechanism of mesoscopic patterning and defect organizations is discussed.

preprint, 1997, arXiv

Theory of colossal magnetoresistance

The history of, and recent developments in, the study of (colossal) magnetoresistance in perovskite manganese oxides are reviewed. We emphasize the growing evidence for strongly coupled spin, charge and lattice degrees of freedom. Together with disorder, these provide the microscopic driving forces for local and inhomogeneous textures. The modeling of, and experimental probes for, localized charge-spin-lattice (polaron) structures and their multiscale ordering are discussed in terms of a growing synergy of solid state physics and materials science perspectives.