Researcher profile

Yulun Wang

Yulun Wang contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 17 - Unverified; Verification L1; Unclaimed author
4 works
0 followers
4 topics
4 close collaborators

Published work

4 published items

preprint, 2026, arXiv

Collective Communication for 100k+ GPUs

The increasing scale of large language models (LLMs) necessitates highly efficient collective communication frameworks, particularly as training workloads extend to hundreds of thousands of GPUs. Traditional communication methods face significant throughput and latency limitations at this scale, hindering both the development and deployment of state-of-the-art models. This paper presents the NCCLX collective communication framework, developed at Meta, engineered to optimize performance across the full LLM lifecycle, from the synchronous demands of large-scale training to the low-latency requirements of inference. The framework is designed to support complex workloads on clusters exceeding 100,000 GPUs, ensuring reliable, high-throughput, and low-latency data exchange. Empirical evaluation on the Llama4 model demonstrates substantial improvements in communication efficiency. This research contributes a robust solution for enabling the next generation of LLMs to operate at unprecedented scales.

preprint, 2023, arXiv

Collisional S-Matrix for the Vibrational Dynamics of H+H2 by Quantum Computing

An algorithm and a system of quantum circuits are developed and applied to accurately compute the S matrix for transitions between vibrational states of H2 in collisions with H. The algorithm was applied at 100 eV laboratory collision energy on a quantum circuit simulator. The effects of the discretized dissociative continuum on the transition cross sections are carefully studied, and the accuracy and convergence of the results with respect to the chosen parameters of the algorithm and the collision system are verified by comparison with a solution of the time-dependent Schrödinger equation obtained with the classical algorithm, as well as with a few results from the literature.

preprint, 2022, arXiv

Multistate Transition Dynamics by Strong Time-Dependent Perturbation in NISQ era

We develop a quantum computing scheme that uses the McLachlan variational principle in a hybrid quantum-classical algorithm to accurately calculate the transition dynamics of a closed quantum system with many excited states subject to a strong time-dependent perturbation. A systematic approach for optimal construction of a general N-state ansatz with unary N-qubit encoding is refined. We also use qubit-efficient encoding in the McLachlan variational quantum algorithm to reduce the number of qubits to log2 N, simultaneously reducing the depths of the quantum circuits. A significant reduction in the number of time steps is achieved by use of the second-order marching method. Adaptations of the circuits to include a time-dependent global phase correction are instrumental in obtaining high accuracy. We illustrated, tested and optimized our quantum computing algorithm on a set of 16 bound hydrogenic eigenstates exposed to a strong attosecond laser pulse. Results for transition probabilities are obtained with accuracy better than 1%, as established by comparison to the benchmark data. Use of the interaction representation of the Hamiltonian reduces the effect of both NISQ noise and the accumulation of sampling errors while the quantum system evolves in time.
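
The qubit counts mentioned in the abstract above (N qubits for unary encoding, log2 N for qubit-efficient encoding) can be illustrated with a small sketch. This is not the authors' code: the function names are hypothetical, the one-hot and binary labelings are assumed conventions, and rounding up for N that is not a power of two is also an assumption.

```python
def unary_qubits(n_states: int) -> int:
    """Unary (one-hot) encoding: one qubit per basis state."""
    return n_states


def compact_qubits(n_states: int) -> int:
    """Qubit-efficient encoding: ceil(log2 N) qubits (assumed rounding), at least 1."""
    return max(1, (n_states - 1).bit_length())


def unary_bitstring(k: int, n_states: int) -> str:
    """Label of state k in unary encoding: only position k is set (assumed bit order)."""
    return "".join("1" if i == k else "0" for i in range(n_states))


def compact_bitstring(k: int, n_states: int) -> str:
    """Label of state k in compact encoding: k written in binary, zero-padded."""
    return format(k, f"0{compact_qubits(n_states)}b")


# The 16-state case from the abstract: 16 qubits in unary, 4 in compact encoding.
n = 16
print(unary_qubits(n), compact_qubits(n))
print(unary_bitstring(3, n), compact_bitstring(3, n))
```

For the 16 hydrogenic states in the abstract, this gives 16 qubits in unary encoding and 4 in the compact encoding.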

preprint, 2021, arXiv

Variational Quantum Linear Solver with Dynamic Ansatz

Variational quantum algorithms have found success in the NISQ era owing to their hybrid quantum-classical approach, which mitigates the problems of noise in quantum computers. In our study we introduce the dynamic ansatz in the Variational Quantum Linear Solver for a system of linear algebraic equations. In this improved algorithm, the number of layers in the hardware-efficient ansatz circuit is evolved, starting from a small number and gradually increasing until convergence of the solution is reached. We demonstrate the algorithm's advantage over the standard, static ansatz: it uses fewer quantum resources and has a smaller quantum depth on average, in the presence and absence of quantum noise, and in cases when the number of qubits or the condition number of the system matrix is increased. The numbers of iterations and layers can be altered by a switching parameter. The performance of the algorithm in using quantum resources is quantified by a newly defined metric.
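
The layer-growing idea in the abstract above can be sketched as a simple control loop. This is a classical toy, not the paper's algorithm: `grow_ansatz`, `optimize_with_layers`, and `toy_cost` are hypothetical names, the halving cost is a stand-in for a real variational optimization, and the switching parameter mentioned in the abstract is not modeled.

```python
from typing import Callable, Tuple


def grow_ansatz(optimize_with_layers: Callable[[int], float],
                tol: float = 1e-3,
                start_layers: int = 1,
                max_layers: int = 20) -> Tuple[int, float]:
    """Add ansatz layers one at a time until the optimized cost drops to tol.

    optimize_with_layers(n) is assumed to run the variational optimization
    with n layers and return the final cost for that depth.
    """
    layers = start_layers
    cost = optimize_with_layers(layers)
    while cost > tol and layers < max_layers:
        layers += 1  # deepen the circuit only when the current depth is not enough
        cost = optimize_with_layers(layers)
    return layers, cost


def toy_cost(layers: int) -> float:
    """Stand-in for a real optimizer: the cost halves with each added layer."""
    return 0.5 ** layers


layers_used, final_cost = grow_ansatz(toy_cost, tol=1e-3)
print(layers_used, final_cost)
```

A static ansatz would fix the layer count up front; the loop above instead stops at the first depth that meets the tolerance, which is the source of the resource savings the abstract describes.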