Source author record

Xiaoze Liu

Xiaoze Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence, Machine Learning, Computation and Language, cond-mat.mes-hall, cond-mat.mtrl-sci, Databases, physics.optics

Catalog footprint

What is connected

6 works

7 topics

4 close collaborators

Preprint · 2026 · arXiv

Experience Sharing in Mutual Reinforcement Learning for Heterogeneous Language Models

We introduce Mutual Reinforcement Learning, a framework for concurrent RL post-training in which heterogeneous LLM policies exchange typed experience while keeping separate parameters, objectives, and tokenizers. The framework combines a Shared Experience Exchange (SEE), Multi-Worker Resource Allocation (MWRA), and a Tokenizer Heterogeneity Layer (THL) that retokenizes text and aligns token-level traces across incompatible vocabularies. This substrate makes the experience-sharing design question operational across model families. We instantiate three controlled probes on top of GRPO: data-level rollout sharing via Peer Rollout Pooling (PRP), value-level advantage sharing via Cross-Policy GRPO Advantage Sharing (XGRPO), and outcome-level success transfer via Success-Gated Transfer (SGT). A contextual-bandit analysis characterizes their structural positions on a stability-support trade-off: PRP pays density-ratio variance and THL residual costs, XGRPO preserves on-policy actor support while changing scalar baselines, and SGT supplies a rescue-set score direction toward verified peer successes. In the evaluated regime, outcome-level sharing occupies the favorable point of this trade-off.
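The three probes are described as built on top of GRPO, whose core step is normalizing each rollout's reward against the other rollouts sampled for the same prompt. A minimal sketch of that group-relative advantage follows. It is not the paper's code: the function name, the `eps` constant, and the use of the population standard deviation are assumptions for illustration, and implementations differ on these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one prompt's rollout group (GRPO-style).

    `rewards` holds one scalar reward per rollout of the same prompt.
    `eps` (an assumed constant) avoids division by zero when every
    rollout in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some code uses the sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts of one prompt scored by a binary verifier.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print([round(a, 3) for a in advantages])
```

Successful rollouts receive positive advantages and failed ones negative. When all rollouts in a group share the same reward, every advantage is zero and the group contributes no gradient signal, which is the kind of case where signals from a peer policy could matter.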

Preprint · 2026 · arXiv

Multi-Rollout On-Policy Distillation via Peer Successes and Failures

Large language models are often post-trained with sparse verifier rewards, which indicate whether a sampled trajectory succeeds but provide limited guidance about where reasoning succeeds or fails. On-policy distillation (OPD) offers denser token-level supervision by training on student-generated trajectories, yet existing methods typically distill each rollout independently and ignore the other attempts sampled for the same prompt. We introduce Multi-Rollout On-Policy Distillation (MOPD), a peer-conditioned distillation framework that uses the student's local rollout group to construct more informative teacher signals. MOPD conditions the teacher on both successful and failed peer rollouts: successes provide positive evidence for valid reasoning patterns, while failures provide structured negative evidence about plausible mistakes to avoid. We study two peer-context constructions: positive peer imitation and contrastive success-failure conditioning. Experiments on competitive programming, mathematical reasoning, scientific question answering, and tool-use benchmarks show that MOPD consistently improves over standard on-policy baselines. Further teacher-signal analysis shows that mixed success-failure contexts better align teacher scores with verifier rewards, indicating that the gains arise from more faithful, instance-adaptive supervision. These results indicate that effective on-policy distillation should exploit the student's multi-rollout trial-and-error behavior rather than treating rollouts as isolated samples.

Preprint · 2022 · arXiv

ClusterEA: Scalable Entity Alignment with Stochastic Training and Normalized Mini-batch Similarities

Entity alignment (EA) aims at finding equivalent entities in different knowledge graphs (KGs). Embedding-based approaches have dominated the EA task in recent years. Those methods face problems that come from the geometric properties of embedding vectors, including hubness and isolation. To solve these geometric problems, many normalization approaches have been adopted for EA. However, the increasing scale of KGs renders it hard for EA models to adopt the normalization processes, thus limiting their usage in real-world applications. To tackle this challenge, we present ClusterEA, a general framework that is capable of scaling up EA models and enhancing their results by leveraging normalization methods on mini-batches with a high entity equivalent rate. ClusterEA contains three components to align entities between large-scale KGs, including stochastic training, ClusterSampler, and SparseFusion. It first trains a large-scale Siamese GNN for EA in a stochastic fashion to produce entity embeddings. Based on the embeddings, a novel ClusterSampler strategy is proposed for sampling highly overlapped mini-batches. Finally, ClusterEA incorporates SparseFusion, which normalizes local and global similarity and then fuses all similarity matrices to obtain the final similarity matrix. Extensive experiments with real-life datasets on EA benchmarks offer insight into the proposed framework, and suggest that it is capable of outperforming the state-of-the-art scalable EA framework by up to 8 times in terms of Hits@1.
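To make the hubness problem and the Hits@1 metric concrete, the sketch below applies one well-known normalization, cross-domain similarity local scaling (CSLS), to a toy similarity matrix. CSLS is used here only as a familiar example of similarity normalization; the abstract does not say which normalization ClusterEA applies inside SparseFusion. The matrix values and function names are made up for illustration.

```python
def hits_at_1(sim, gold):
    """Fraction of source entities whose top-scoring target is the gold one."""
    correct = 0
    for i, row in enumerate(sim):
        best = max(range(len(row)), key=row.__getitem__)
        correct += int(best == gold[i])
    return correct / len(sim)


def csls(sim, k=1):
    """CSLS: 2*sim[i][j] minus the mean top-k similarity of row i and column j."""
    n_src, n_tgt = len(sim), len(sim[0])
    row_mean = [sum(sorted(row, reverse=True)[:k]) / k for row in sim]
    cols = [[sim[i][j] for i in range(n_src)] for j in range(n_tgt)]
    col_mean = [sum(sorted(col, reverse=True)[:k]) / k for col in cols]
    return [
        [2 * sim[i][j] - row_mean[i] - col_mean[j] for j in range(n_tgt)]
        for i in range(n_src)
    ]


# Toy similarities: target 0 is a "hub" that scores high for every source.
sim = [
    [0.90, 0.30, 0.20],
    [0.80, 0.75, 0.10],
    [0.70, 0.20, 0.65],
]
gold = [0, 1, 2]  # source i should align with target i

print(hits_at_1(sim, gold))           # raw similarities
print(hits_at_1(csls(sim, k=1), gold))  # after CSLS normalization
```

On the raw matrix every source entity picks the hub, so only one of three alignments is correct; after normalization the hub's scores are penalized and all three alignments are recovered. Normalizations of this kind need neighborhood statistics over the similarity matrix, which is why applying them per mini-batch matters at large scale.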

Preprint · 2022 · arXiv

Electrically pumped polarized exciton-polaritons in a halide perovskite microcavity

Exciton polaritons, hybrid quasiparticles with part-light part-matter nature in semiconductor microcavities, are extensively investigated for striking phenomena such as polariton condensation and quantum emulation. These phenomena have recently been discovered in emerging lead halide perovskites at elevated temperatures up to room temperature. For advancing these discoveries into practical applications, one critical requirement is the realization of electrically pumped exciton-polaritons. However, electrically pumped polariton light-emitting devices with perovskites have not yet been achieved experimentally. Here, we devise a new method to combine the device with the microcavity and report the first halide perovskite polariton light-emitting device. Specifically, the device is based on a CsPbBr3 capacitive structure, which injects electrons and holes from the same electrode, promoting the formation of excitons while maintaining the high quality of the microcavity. In addition, highly polarization-selective polariton emission is demonstrated, owing to the optical birefringence of the CsPbBr3 microplate. This work paves the way for realizing practical polaritonic devices such as high-speed light-emitting devices for information communications and inversionless electrically pumped lasers based on perovskites.

Preprint · 2019 · arXiv

Observation of Rydberg exciton polaritons and their condensate in a perovskite cavity

The condensation of half-light half-matter exciton polaritons in semiconductor optical cavities is a striking example of macroscopic quantum coherence in a solid-state platform. Quantum coherence is possible only when there are strong interactions between the exciton polaritons provided by their excitonic constituents. Rydberg excitons with high principal quantum number exhibit strong dipole-dipole interactions in cold atoms. However, polaritons with the excitonic constituent that is an excited state, namely Rydberg exciton polaritons (REPs), have not yet been experimentally observed. Here, for the first time, we observe the formation of REPs in a single-crystal CsPbBr3 perovskite cavity without any external fields. These polaritons exhibit strong nonlinear behavior that leads to a coherent polariton condensate with a prominent blue shift. Furthermore, the REPs in CsPbBr3 are highly anisotropic and have a large extinction ratio, arising from the perovskite's orthorhombic crystal structure. Our observation not only sheds light on the importance of many-body physics in coherent polariton systems involving higher-order excited states, but also paves the way for exploring these coherent interactions for solid-state quantum optical information processing.

Preprint · 2014 · arXiv

Strong light-matter coupling in two-dimensional atomic crystals

Two-dimensional (2D) atomic crystals of graphene and transition metal dichalcogenides have emerged as a class of materials that show strong light-matter interaction. This interaction can be further controlled by embedding such materials into optical microcavities. When the interaction is engineered to be stronger than the dissipation of light and matter entities, one approaches the strong coupling regime resulting in the formation of half-light half-matter bosonic quasiparticles called microcavity polaritons. Here we report the evidence of strong light-matter coupling and formation of microcavity polaritons in a two-dimensional atomic crystal of molybdenum disulphide (MoS2) embedded inside a dielectric microcavity at room temperature. A Rabi splitting of 46 meV and highly directional emission are observed from the MoS2 microcavity owing to the coupling between the 2D excitons and the cavity photons. Realizing strong coupling effects at room temperature in a disorder-free potential landscape is central to the development of practical polaritonic circuits and switches.
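The reported Rabi splitting can be read through the standard lossless two-coupled-oscillator model, in which the polariton energies are E± = (E_c + E_x)/2 ± ½·sqrt(δ² + (ħΩ_R)²), with detuning δ = E_c − E_x. The sketch below evaluates that textbook formula. Only the 46 meV splitting comes from the abstract; the exciton energy is an assumed illustrative value, linewidths are ignored, and the function name is hypothetical.

```python
import math


def polariton_branches(e_cavity, e_exciton, rabi_splitting):
    """Return (lower, upper) polariton energies in eV, lossless model."""
    detuning = e_cavity - e_exciton
    mean_energy = 0.5 * (e_cavity + e_exciton)
    half_gap = 0.5 * math.sqrt(detuning ** 2 + rabi_splitting ** 2)
    return mean_energy - half_gap, mean_energy + half_gap


RABI_SPLITTING_EV = 0.046  # 46 meV, as reported in the abstract
E_EXCITON_EV = 1.87        # assumed illustrative exciton energy, not from the source

# Zero detuning: the branch separation equals the Rabi splitting.
lower, upper = polariton_branches(E_EXCITON_EV, E_EXCITON_EV, RABI_SPLITTING_EV)
print(f"gap at resonance: {1000 * (upper - lower):.1f} meV")

# Detuned cavity: the branches move apart and approach the uncoupled modes.
lower_d, upper_d = polariton_branches(E_EXCITON_EV + 0.1, E_EXCITON_EV, RABI_SPLITTING_EV)
print(f"gap at 100 meV detuning: {1000 * (upper_d - lower_d):.1f} meV")
```

The minimum separation between the two branches occurs at zero detuning and equals the Rabi splitting, which is how an anticrossing in angle-resolved spectra is typically converted into a splitting value.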

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint