Source author record

Dongyang Wang

Dongyang Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

physics.optics; Machine Learning; quant-ph; Artificial Intelligence; Computation and Language; cond-mat.mes-hall; Distributed, Parallel, and Cluster Computing; Emerging Technologies; physics.comp-ph

Catalog footprint

What is connected

9 works

9 topics

4 close collaborators

preprint, 2026, arXiv

DisagMoE: Computation-Communication overlapped MoE Training via Disaggregated AF-Pipe Parallelism

Mixture-of-experts (MoE) architectures enable trillion-parameter LLMs with sparsely activated experts. Expert parallelism (EP) is a widely adopted MoE training strategy, but it suffers from severe all-to-all communication bottlenecks, which are exacerbated by the limited inter-node network bandwidth as growing model sizes require distributing experts across GPU nodes. Prior work focused on overlapping these all-to-all communications with feed-forward network (FFN) and self-attention computations, which often leaves residual network-bound stalls due to the inherent imbalance in the computation-communication ratios of attention and FFN layers. We present DisagMoE, a disaggregated MoE training system that jointly optimizes model placement and scheduling for maximal efficiency. DisagMoE separates attention and FFN layers into disjoint GPU groups, introduces a multi-stage pipeline with uni-directional, many-to-many communications, and employs a computation-communication roofline model to balance GPU and network bandwidth allocation among the attention and FFN groups. DisagMoE is implemented on Megatron-LM, and evaluation shows that DisagMoE improves training efficiency across multiple MoE models with up to 1.8x speedup on 16-node 8xH800 clusters.
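The roofline-style balancing mentioned in this abstract can be illustrated with a toy model. The sketch below is not DisagMoE's actual model or code: the function names, the single shared `bytes_moved` figure, and the assumption that a group's time is the larger of its compute time and its communication time are all hypothetical simplifications chosen for illustration.

```python
def stage_time(flops, bytes_moved, n_gpus, peak_flops, net_bw):
    """Roofline-style time for one GPU group (hypothetical model):
    the group is bound by whichever of compute or communication is slower."""
    compute = flops / (n_gpus * peak_flops)
    comm = bytes_moved / (n_gpus * net_bw)
    return max(compute, comm)


def best_split(total_gpus, attn_flops, ffn_flops, bytes_moved, peak_flops, net_bw):
    """Try every split of GPUs between the attention and FFN groups and
    return (n_attn, bottleneck_time) for the split with the fastest slow stage."""
    best = None
    for n_attn in range(1, total_gpus):
        n_ffn = total_gpus - n_attn
        t_attn = stage_time(attn_flops, bytes_moved, n_attn, peak_flops, net_bw)
        t_ffn = stage_time(ffn_flops, bytes_moved, n_ffn, peak_flops, net_bw)
        bottleneck = max(t_attn, t_ffn)  # the pipeline runs at the slower stage
        if best is None or bottleneck < best[1]:
            best = (n_attn, bottleneck)
    return best
```

With made-up numbers where the FFN group has three times the attention workload, `best_split(8, 1.0, 3.0, 0.0, 1.0, 1.0)` assigns 2 of 8 GPUs to attention, so both stages take the same time.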

preprint, 2022, arXiv

A Personalized Dialogue Generator with Implicit User Persona Detection

Current works in the generation of personalized dialogue primarily contribute to the agent presenting a consistent personality and driving a more informative response. However, we found that the generated responses from most previous models tend to be self-centered, with little care for the user in the dialogue. Moreover, we consider that human-like conversation is essentially built based on inferring information about the persona of the other party. Motivated by this, we propose a novel personalized dialogue generator by detecting an implicit user persona. Because it is hard to collect a large number of detailed personas for each user, we attempted to model the user's potential persona and its representation from dialogue history, with no external knowledge. The perception and fader variables were conceived using conditional variational inference. The two latent variables simulate the process of people being aware of each other's persona and producing a corresponding expression in conversation. Finally, posterior-discriminated regularization was presented to enhance the training procedure. Empirical studies demonstrate that, compared to state-of-the-art methods, our approach is more concerned with the user's persona and achieves a considerable boost across the evaluations.

preprint, 2022, arXiv

Large-scale full-programmable quantum walk and its applications

With photonics, the quantum computational advantage has been demonstrated on the task of boson sampling. Next, developing quantum-enhanced approaches for practical problems becomes one of the top priorities for photonic systems. Quantum walks are powerful kernels for developing new and useful quantum algorithms. Here we realize large-scale quantum walks using a fully programmable photonic quantum computing system. The system integrates a silicon quantum photonic chip, enabling the simulation of quantum walk dynamics on graphs with up to 400 vertices and possessing full programmability over quantum walk parameters, including the particle property, initial state, graph structure, and evolution time. In the 400-dimensional Hilbert space, the average fidelity of random entangled quantum states after the whole on-chip circuit evolution reaches as high as 94.29 ± 1.28%. With the system, we demonstrated exponentially faster hitting and quadratically faster mixing performance of quantum walks over classical random walks, achieving more than two orders of magnitude of enhancement in the experimental hitting efficiency and a reduction of almost half in the experimental evolution time for mixing. We utilize the system to implement a series of quantum applications, including measuring the centrality of scale-free networks, searching targets on Erdős-Rényi networks, distinguishing non-isomorphic graph pairs, and simulating the topological phase of higher-order topological insulators. Our work shows one feasible path for quantum photonics to address applications of practical interest in the near future.
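As a rough illustration of the quantum walk parameters listed in this abstract (initial state, graph structure, evolution time), the sketch below simulates a single-particle continuous-time quantum walk classically. It assumes the walk Hamiltonian is the graph's adjacency matrix, which the abstract does not state, and it uses a plain Taylor series for the matrix exponential, which is only suitable for small graphs and short times. All function names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm_taylor(m, terms=40):
    """Approximate exp(m) by a truncated Taylor series (small matrices only)."""
    n = len(m)
    result = [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term m^k / k!, starts at identity
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def quantum_walk_probs(adjacency, start, t):
    """Probability of finding the walker at each vertex after time t,
    assuming U(t) = exp(-i * A * t) with A the adjacency matrix."""
    n = len(adjacency)
    generator = [[-1j * t * adjacency[i][j] for j in range(n)] for i in range(n)]
    u = expm_taylor(generator)
    # Column `start` of U holds the amplitudes for a walker launched at `start`.
    return [abs(u[v][start]) ** 2 for v in range(n)]
```

For a two-vertex graph with one edge, the walker moves fully to the other vertex at `t = pi / 2`, and the probabilities sum to one at any time because the evolution is unitary.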

preprint, 2022, arXiv

Topological membrane devices for terahertz on-chip photonics

Terahertz waves offer a profound platform for next-generation sensing, imaging, and information communications. However, all conventional terahertz components and systems suffer from a bulky design, sensitivity to imperfections, and transmission losses. Here, we propose and experimentally demonstrate on-chip integration and miniaturization of topological devices which may address many existing drawbacks of the terahertz technology. We design and fabricate topological devices based on valley-Hall photonic structures that can be employed for various integrated components of on-chip terahertz systems. More specifically, we demonstrate the valley-locked asymmetric energy flow and mode conversion with topological straight waveguide, multi-port couplers, wave division, and whispering gallery mode resonators. Our devices are based on topological membrane metasurfaces which are of great importance for developing on-chip photonics and bringing many novel features into terahertz devices.

preprint, 2021, arXiv

Experimental observation of non-Abelian earring nodal links in phononic crystals

Nodal lines are symmetry-protected one-dimensional band degeneracies in momentum space, which can appear in numerous topological configurations such as nodal rings, chains, links, and knots. Very recently, non-Abelian topological physics has been proposed in space-time inversion (PT) symmetric systems and has attracted widespread attention. One of the most special configurations in non-Abelian systems is the earring nodal link, composed of a nodal chain linked with an isolated nodal line, which is a signature of non-Abelian topology and cannot be elucidated using Abelian topological classifications. However, earring nodal links have not yet been observed in a real system. Here we design phononic crystals with earring nodal links and verify their non-Abelian topological charge in full-wave simulations. Moreover, we experimentally observed two different kinds of earring nodal links by measuring the band structures of two phononic crystals. Specifically, we found that the order of the nodal chain and line can switch after band inversion but their link cannot be severed. Our work provides experimental evidence for phenomena unique to non-Abelian band topology, and our simple acoustic system provides a convenient platform for studying non-Abelian charges.

preprint, 2020, arXiv

Sample caching Markov chain Monte Carlo approach to boson sampling simulation

Boson sampling is a promising candidate for quantum supremacy. It requires sampling from a complicated distribution and is believed to be intractable on classical computers. Among the various classical sampling methods, the Markov chain Monte Carlo method is an important approach to the simulation and validation of boson sampling. This method, however, suffers from a severe sample loss issue caused by the autocorrelation of the sample sequence. To address this, we propose the sample caching Markov chain Monte Carlo method, which eliminates the correlations among the samples and prevents the sample loss at the same time, allowing more efficient simulation of boson sampling. Moreover, our method can be used as a general sampling framework that can benefit a wide range of sampling tasks, and is particularly suitable for applications where a large number of samples are taken.
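The sample loss problem named in this abstract can be seen in a generic Metropolis chain, where successive states are correlated and a common remedy is to discard a burn-in prefix and keep only every k-th state. The sketch below shows that baseline only. It is not the authors' sample caching method, which the abstract does not describe in detail, and the small list of positive `weights` is a hypothetical stand-in for boson sampling probabilities, which would require matrix permanents.

```python
import random


def metropolis_chain(weights, n_steps, seed=0):
    """Metropolis sampler over indices 0..len(weights)-1 with a uniform
    (symmetric) proposal; `weights` are unnormalised positive target weights."""
    rng = random.Random(seed)
    n = len(weights)
    state = rng.randrange(n)
    chain = []
    for _ in range(n_steps):
        proposal = rng.randrange(n)
        # Accept with probability min(1, w[proposal] / w[state]).
        if rng.random() < min(1.0, weights[proposal] / weights[state]):
            state = proposal
        chain.append(state)  # rejected moves repeat the old state (autocorrelation)
    return chain


def thin(chain, burn_in, interval):
    """Drop the first `burn_in` states, then keep every `interval`-th state."""
    return chain[burn_in::interval]
```

With a 1000-step chain, `thin(chain, 100, 10)` keeps only 90 states, so more than 90% of the computed steps are thrown away. This is the kind of loss the abstract refers to.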

preprint, 2020, arXiv

Topological one-way large-area waveguide states in magnetic photonic crystals

We have theoretically and experimentally achieved large-area one-way transport by using heterostructures consisting of a domain of an ordinary photonic crystal (PC) sandwiched between two domains of magnetic PCs. The non-magnetized domain carries two orthogonal one-way waveguide states whose amplitudes are uniformly distributed over a large area. These two waveguide states support unidirectional transport even though the medium of propagation is not magnetized. We show both experimentally and numerically that such one-way waveguide states can be utilized to abruptly narrow the beam width of an extended state to concentrate energy. Such extended waveguide modes are robust to different kinds of defects, such as voids and PEC barriers. They are also immune to Anderson-type localization when large randomness is introduced.

preprint, 2020, arXiv

Variational Quantum Circuits for Quantum State Tomography

Quantum state tomography is a key process in most quantum experiments. In this work, we employ quantum machine learning for state tomography. An unknown quantum state can be learned by maximizing the fidelity between the output of a variational quantum circuit and this state. The number of parameters of the variational quantum circuit grows linearly with the number of qubits and the circuit depth, so that only polynomially many measurements are required, even for highly entangled states. A classical circuit simulator is then used to transform the information about the target quantum state from the variational quantum circuit into a familiar format. We demonstrate our method by performing numerical simulations for the tomography of the ground state of a one-dimensional quantum spin chain, using a variational quantum circuit simulator. Our method is suitable for near-term quantum computing platforms, and could be used for relatively large-scale quantum state tomography for experimentally relevant quantum states.
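The idea of learning a state by maximizing fidelity with a parameterized circuit can be sketched for a single qubit. The example below is a classical toy under stated assumptions, not the paper's circuit or optimizer: the ansatz Rz(phi) Ry(theta) applied to |0>, the finite-difference gradient ascent, and all function names are hypothetical. The fidelity is also computed directly from state vectors, whereas an experiment would estimate it from measurements.

```python
import cmath
import math


def circuit_state(theta, phi):
    """State vector of Rz(phi) Ry(theta) |0> for one qubit."""
    return [cmath.exp(-0.5j * phi) * math.cos(theta / 2),
            cmath.exp(0.5j * phi) * math.sin(theta / 2)]


def fidelity(target, state):
    """Squared overlap |<target|state>|^2 between two pure states."""
    overlap = sum(t.conjugate() * s for t, s in zip(target, state))
    return abs(overlap) ** 2


def fit_state(target, theta=0.5, phi=0.5, lr=0.5, steps=500, eps=1e-6):
    """Gradient ascent on the fidelity using central finite differences."""
    for _ in range(steps):
        g_theta = (fidelity(target, circuit_state(theta + eps, phi))
                   - fidelity(target, circuit_state(theta - eps, phi))) / (2 * eps)
        g_phi = (fidelity(target, circuit_state(theta, phi + eps))
                 - fidelity(target, circuit_state(theta, phi - eps))) / (2 * eps)
        theta += lr * g_theta
        phi += lr * g_phi
    return theta, phi
```

Calling `fit_state(circuit_state(1.0, 2.0))` returns parameters whose circuit output closely matches the target state. The learned pair `(theta, phi)` then serves as a compact classical description of that state.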

preprint, 2016, arXiv

Pancharatnam-Berry phase induced spin-selective transmission in herringbone dielectric metamaterials

Manipulating the polarisation of light is crucial for sensing and imaging applications. One such aspect in particular is selective transmission of one circular polarisation (spin) when light is transmitted through a medium or a device. However, most present methods of achieving this have relatively low efficiency and selectivity, whilst high selectivity examples rely on lossy and complex three-dimensional helical or multilayer structures. Here, we propose a dielectric metamaterial approach for achieving spin-selective transmission of electromagnetic waves, utilizing spin-controlled constructive or destructive interference between two Pancharatnam-Berry (PB) phases in conjunction with propagative dynamic phase. The dielectric metamaterial, consisting of monolithic silicon herringbone structures, exhibits a broadband operation in the terahertz regime whilst obtaining a spin-selective efficiency upwards of 60%. Such a device is robust and is not easily degraded by errors in fabrication.
