Researcher profile

Christopher A. Pattison

Christopher A. Pattison contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2022arXiv

Fast Arbitrary Precision Floating Point on FPGA

Numerical codes that require arbitrary precision floating point (APFP) numbers for their core computation are dominated by elementary arithmetic operations due to the super-linear complexity of multiplication in the number of mantissa bits. APFP computations on conventional software-based architectures are made exceedingly expensive by the lack of native hardware support, requiring elementary operations to be emulated using instructions operating on machine-word-sized blocks. In this work, we show how APFP multiplication on compile-time fixed-precision operands can be implemented as deep FPGA pipelines with a recursively defined Karatsuba decomposition on top of native DSP multiplication. When comparing our design implemented on an Alveo U250 accelerator to a dual-socket 36-core Xeon node running the GNU Multiple Precision Floating-Point Reliable (MPFR) library, we achieve a 9.8x speedup at 4.8 GOp/s for 512-bit multiplication, and a 5.3x speedup at 1.2 GOp/s for 1024-bit multiplication, corresponding to the throughput of more than 351x and 191x CPU cores, respectively. We apply this architecture to general matrix-matrix multiplication, yielding a 10x speedup at 2.0 GOp/s over the Xeon node, equivalent to more than 375x CPU cores, effectively allowing a single FPGA to replace a small CPU cluster. Due to the significant dependence of some numerical codes on APFP, such as semidefinite program solvers, we expect these gains to translate into real-world speedups. Our configurable and flexible HLS-based code provides as high-level software interface for plug-and-play acceleration, published as an open source project.

preprint2020arXiv

A Scalable Decoder Micro-architecture for Fault-Tolerant Quantum Computing

Quantum computation promises significant computational advantages over classical computation for some problems. However, quantum hardware suffers from much higher error rates than in classical hardware. As a result, extensive quantum error correction is required to execute a useful quantum algorithm. The decoder is a key component of the error correction scheme whose role is to identify errors faster than they accumulate in the quantum computer and that must be implemented with minimum hardware resources in order to scale to the regime of practical applications. In this work, we consider surface code error correction, which is the most popular family of error correcting codes for quantum computing, and we design a decoder micro-architecture for the Union-Find decoding algorithm. We propose a three-stage fully pipelined hardware implementation of the decoder that significantly speeds up the decoder. Then, we optimize the amount of decoding hardware required to perform error correction simultaneously over all the logical qubits of the quantum computer. By sharing resources between logical qubits, we obtain a 67% reduction of the number of hardware units and the memory capacity is reduced by 70%. Moreover, we reduce the bandwidth required for the decoding process by a factor at least 30x using low-overhead compression algorithms. Finally, we provide numerical evidence that our optimized micro-architecture can be executed fast enough to correct errors in a quantum computer.

preprint2020arXiv

The Wishart planted ensemble: A tunably-rugged pairwise Ising model with a first-order phase transition

We propose the Wishart planted ensemble, a class of zero-field Ising models with tunable algorithmic hardness and specifiable (or planted) ground state. The problem class arises from a simple procedure for generating a family of random integer programming problems with specific statistical symmetry properties, but turns out to have intimate connections to a sign-inverted variant of the Hopfield model. The Hamiltonian contains only 2-spin interactions, with the coupler matrix following a type of Wishart distribution. The class exhibits a classical first-order phase transition in temperature. For some parameter settings the model has a locally-stable paramagnetic state, a feature which correlates strongly with difficulty in finding the ground state and suggests an extremely rugged energy landscape. We analytically probe the ensemble thermodynamic properties by deriving the Thouless-Anderson-Palmer equations and free energy and corroborate the results with a replica and annealed approximation analysis; extensive Monte Carlo simulations confirm our predictions of the first-order transition temperature. The class exhibits a wide variation in algorithmic hardness as a generation parameter is varied, with a pronounced easy-hard-easy profile and peak in solution time towering many orders of magnitude over that of the easy regimes. By deriving the ensemble-averaged energy distribution and taking into account finite-precision representation, we propose an analytical expression for the location of the hardness peak and show that at fixed precision, the number of constraints in the integer program must increase with system size to yield truly hard problems. The Wishart planted ensemble is interesting for its peculiar physical properties and provides a useful and analytically-transparent set of problems for benchmarking optimization algorithms.