Researcher profile

Xin-Chuan Wu

Xin-Chuan Wu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2022arXiv

An LLVM-based C++ Compiler Toolchain for Variational Hybrid Quantum-Classical Algorithms and Quantum Accelerators

Variational algorithms are a representative class of quantum computing workloads that combine quantum and classical computing. This paper presents an LLVM-based C++ compiler toolchain to efficiently execute variational hybrid quantum-classical algorithms on a computational system in which the quantum device acts as an accelerator. We introduce a set of extensions to the C++ language for programming these algorithms. We define a novel Executable and Linking Format (ELF) for Quantum and create a quantum device compiler component in the LLVM framework to compile the quantum part of the C++ source and reuse the host compiler in the LLVM framework to compile the classical computing part of the C++ source. A variational algorithm runs a quantum circuit repeatedly, each time with different gate parameters. We add to the quantum runtime the capability to execute dynamically a quantum circuit with different parameters. Thus, programmers can call quantum routines the same way as classical routines. With these capabilities, a variational hybrid quantum-classical algorithm can be specified in a single-source code and only needs to be compiled once for all iterations. The single compilation significantly reduces the execution latency of variational algorithms. We evaluate the framework's performance by running quantum circuits that prepare Thermofield Double (TFD) states, a quantum-classical variational algorithm.

preprint2022arXiv

QGo: Scalable Quantum Circuit Optimization Using Automated Synthesis

The current phase of quantum computing is in the Noisy Intermediate-Scale Quantum (NISQ) era. On NISQ devices, two-qubit gates such as CNOTs are much noisier than single-qubit gates, so it is essential to minimize their count. Quantum circuit synthesis is a process of decomposing an arbitrary unitary into a sequence of quantum gates, and can be used as an optimization tool to produce shorter circuits to improve overall circuit fidelity. However, the time-to-solution of synthesis grows exponentially with the number of qubits. As a result, synthesis is intractable for circuits on a large qubit scale. In this paper, we propose a hierarchical, block-by-block optimization framework, QGo, for quantum circuit optimization. Our approach allows an exponential cost optimization to scale to large circuits. QGo uses a combination of partitioning and synthesis: 1) partition the circuit into a sequence of independent circuit blocks; 2) re-generate and optimize each block using quantum synthesis; and 3) re-compose the final circuit by stitching all the blocks together. We perform our analysis and show the fidelity improvements in three different regimes: small-size circuits on real devices, medium-size circuits on noise simulations, and large-size circuits on analytical models. Using a set of NISQ benchmarks, we show that QGo can reduce the number of CNOT gates by 29.9% on average and up to 50% when compared with industrial compilers such as t|ket>. When executed on the IBM Athens system, shorter depth leads to higher circuit fidelity. We also demonstrate the scalability of our QGo technique to optimize circuits of 60+ qubits. Our technique is the first demonstration of successfully employing and scaling synthesis in the compilation toolchain for large circuits. Overall, our approach is robust for direct incorporation in production compiler toolchains.

preprint2022arXiv

SupermarQ: A Scalable Quantum Benchmark Suite

The emergence of quantum computers as a new computational paradigm has been accompanied by speculation concerning the scope and timeline of their anticipated revolutionary changes. While quantum computing is still in its infancy, the variety of different architectures used to implement quantum computations make it difficult to reliably measure and compare performance. This problem motivates our introduction of SupermarQ, a scalable, hardware-agnostic quantum benchmark suite which uses application-level metrics to measure performance. SupermarQ is the first attempt to systematically apply techniques from classical benchmarking methodology to the quantum domain. We define a set of feature vectors to quantify coverage, select applications from a variety of domains to ensure the suite is representative of real workloads, and collect benchmark results from the IBM, IonQ, and AQT@LBNL platforms. Looking forward, we envision that quantum benchmarking will encompass a large cross-community effort built on open source, constantly evolving benchmark suites. We introduce SupermarQ as an important step in this direction.

preprint2020arXiv

Full-State Quantum Circuit Simulation by Using Data Compression

Quantum circuit simulations are critical for evaluating quantum algorithms and machines. However, the number of state amplitudes required for full simulation increases exponentially with the number of qubits. In this study, we leverage data compression to reduce memory requirements, trading computation time and fidelity for memory space. Specifically, we develop a hybrid solution by combining the lossless compression and our tailored lossy compression method with adaptive error bounds at each timestep of the simulation. Our approach optimizes for compression speed and makes sure that errors due to lossy compression are uncorrelated, an important property for comparing simulation output with physical machines. Experiments show that our approach reduces the memory requirement of simulating the 61-qubit Grover's search algorithm from 32 exabytes to 768 terabytes of memory on Argonne's Theta supercomputer using 4,096 nodes. The results suggest that our techniques can increase the simulation size by 2 to 16 qubits for general quantum circuits.

preprint2020arXiv

SQUARE: Strategic Quantum Ancilla Reuse for Modular Quantum Programs via Cost-Effective Uncomputation

Compiling high-level quantum programs to machines that are size constrained (i.e. limited number of quantum bits) and time constrained (i.e. limited number of quantum operations) is challenging. In this paper, we present SQUARE (Strategic QUantum Ancilla REuse), a compilation infrastructure that tackles allocation and reclamation of scratch qubits (called ancilla) in modular quantum programs. At its core, SQUARE strategically performs uncomputation to create opportunities for qubit reuse. Current Noisy Intermediate-Scale Quantum (NISQ) computers and forward-looking Fault-Tolerant (FT) quantum computers have fundamentally different constraints such as data locality, instruction parallelism, and communication overhead. Our heuristic-based ancilla-reuse algorithm balances these considerations and fits computations into resource-constrained NISQ or FT quantum machines, throttling parallelism when necessary. To precisely capture the workload of a program, we propose an improved metric, the "active quantum volume," and use this metric to evaluate the effectiveness of our algorithm. Our results show that SQUARE improves the average success rate of NISQ applications by 1.47X. Surprisingly, the additional gates for uncomputation create ancilla with better locality, and result in substantially fewer swap gates and less gate noise overall. SQUARE also achieves an average reduction of 1.5X (and up to 9.6X) in active quantum volume for FT machines.