Source author record

Jie Lou

Jie Lou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

cond-mat.str-el, Computation and Language, cond-mat.stat-mech, Artificial Intelligence, Machine Learning, quant-ph, Computer Vision, cond-mat.mtrl-sci, cond-mat.quant-gas, cond-mat.supr-con

Catalog footprint

What is connected

15 works

10 topics

4 close collaborators

preprint, 2026, arXiv

Learning from Failures: Correction-Oriented Policy Optimization with Verifiable Rewards

Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as an effective paradigm for improving the reasoning capabilities of large language models. However, RLVR training is often hindered by sparse binary rewards and weak credit assignment, resulting in ambiguous optimization signals and underutilization of the useful information embedded in failed trajectories. To address this challenge, we propose Correction-Oriented Policy Optimization (CIPO), a simple and effective extension to RLVR that converts on-policy failed trajectories into correction-oriented supervision, without relying on any external signals. By jointly optimizing correction samples derived from the model's own failed attempts together with the standard RLVR objective, CIPO improves learning effectiveness while explicitly enhancing the model's ability to correct its own errors. Extensive experiments across 11 benchmarks spanning mathematical reasoning and code generation demonstrate that CIPO consistently and significantly outperforms strong baselines in both reasoning and correction performance. Moreover, CIPO yields stronger pass@K gains, indicating that it improves the model's intrinsic reasoning capacity rather than merely redistributing probability mass over existing correct answers.

preprint, 2026, arXiv

Scalable Oversight for Superhuman AI via Recursive Self-Critiquing

As AI capabilities increasingly surpass human proficiency in complex tasks, current alignment techniques, including SFT and RLHF, face fundamental challenges in ensuring reliable oversight. These methods rely on direct human assessment and become impractical when AI outputs exceed human cognitive thresholds. In response to this challenge, we explore two hypotheses: (1) Critique of critique can be easier than critique itself, extending the widely-accepted observation that verification is easier than generation to the critique domain, as critique itself is a specialized form of generation; (2) This difficulty relationship holds recursively, suggesting that when direct evaluation is infeasible, performing higher-order critiques (e.g., critique of critique of critique) offers a more tractable supervision pathway. We conduct Human-Human, Human-AI, and AI-AI experiments to investigate the potential of recursive self-critiquing for AI supervision. Our results highlight recursive critique as a promising approach for scalable AI oversight.

preprint, 2026, arXiv

Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation

Multimodal Large Language Models (MLLMs) still struggle with fine-grained visual understanding, where answers often depend on small but decisive evidence in the full image. We observe a regional-to-global perception gap: the same MLLM answers fine-grained questions more accurately when conditioned on evidence-centered crops than on the corresponding full images, suggesting that many failures stem from difficulty focusing on relevant evidence rather than insufficient local recognition ability. Motivated by this observation, we propose Vision-OPD (Vision On-Policy Distillation), a regional-to-global self-distillation framework that transfers the model's own privileged regional perception to its full-image policy. Vision-OPD instantiates two conditional policies from the same MLLM: a crop-conditioned teacher and a full-image-conditioned student. The student generates on-policy rollouts, and Vision-OPD minimizes token-level divergence between the teacher and student next-token distributions along these rollouts. This enables the model to internalize the benefit of visual zooming without external teacher models, ground-truth labels, reward verifiers, or inference-time tool use. Experiments on multiple fine-grained visual understanding benchmarks show that Vision-OPD models achieve competitive or superior performance against much larger open-source, closed-source, and "Thinking-with-Images" agentic models.

preprint, 2023, arXiv

Universal Information Extraction as Unified Semantic Matching

The challenge of information extraction (IE) lies in the diversity of label schemas and the heterogeneity of structures. Traditional methods require task-specific model design and rely heavily on expensive supervision, making them difficult to generalize to new schemas. In this paper, we decouple IE into two basic abilities, structuring and conceptualizing, which are shared by different tasks and schemas. Based on this paradigm, we propose to universally model various IE tasks with the Unified Semantic Matching (USM) framework, which introduces three unified token linking operations to model the abilities of structuring and conceptualizing. In this way, USM can jointly encode schema and input text, uniformly extract substructures in parallel, and controllably decode target structures on demand. Empirical evaluation on 4 IE tasks shows that the proposed method achieves state-of-the-art performance in the supervised experiments and shows strong generalization ability in zero/few-shot transfer settings.

preprint, 2021, arXiv

Enhancement of boson superfluidity in a one-dimensional Bose-Fermi mixture

We examine the effect of boson-fermion interaction in a one-dimensional Bose-Fermi mixture by using the density matrix renormalization group method. We show that the boson superfluidity is enhanced by fermions for a weak boson-fermion coupling at an approximate integer boson filling factor (e.g., $0.935 \le \rho_b \le 1.0$), and this enhancement is produced both in a fermion metallic state and in a fermion insulating state. A metal-insulator phase transition of fermions induced by boson-fermion interaction is observed even though there is no fermion-fermion interaction in the parent Hamiltonian. Furthermore, we find that the boson superfluid order and density wave order can coexist in a deep fermion Mott region. All these features could be measured in future experiments and open up the possibility of detecting the new physical effect in the Bose-Fermi mixture.

preprint, 2020, arXiv

Effective p-wave Fermi-Fermi Interaction Induced by Bosonic Superfluids

We study the two-dimensional Bose-Fermi mixture on a square lattice at finite temperature by using the determinant quantum Monte Carlo method within the weakly interacting regime. Here we consider the attractive Bose-Hubbard model and free spinless fermions. In the absence of boson-fermion interactions, we obtain the boundary of the collapsed state of the attractive bosons. In the presence of boson-fermion interactions, an effective p-wave interaction between fermions is induced as long as the bosons are in a superfluid state. Moreover, we find the emergence of composite fermion pairs at low temperatures.

preprint, 2015, arXiv

Combining Grassmann algebra with entanglement renormalization method

By combining the Grassmann algebra with the multi-scale entanglement renormalization ansatz (MERA), we introduce a new unbiased and effective numerical method for simulating 2D strongly correlated electronic systems. The new GMERA method inherits all the advantages of MERA, which constructs the variational wave function based on a complicated tensor network. In addition, it can handle the fermionic properties of the system through local tensor contractions, thanks to the Grassmann algebra. This general method can treat different tensor network structures in a universal way. We show several benchmark calculations of the GMERA method, including the free fermion model, the tight binding model, and the t-J model with hole doping.

preprint, 2015, arXiv

Global Phase Diagram of the Extended Kitaev-Heisenberg Model on Honeycomb Lattice

We study the extended Kitaev-Heisenberg (EKH) quantum spin model by adding a bond-dependent off-diagonal Heisenberg term to the original KH model, which was recently proposed to describe the honeycomb iridates. A rigorous mathematical mapping of spin operators reveals the intrinsic symmetry of the model Hamiltonian. By employing an unbiased numerical entanglement renormalization method based on a tensor network ansatz, we obtain the global phase diagram containing eight distinct quantum phases. By using the dual mapping of spin operators, each individual magnetic phase in the global phase diagram can be clearly understood. Finally, we show that a valence solid state emerges as the ground state in the quadro-critical region where multiple magnetic phases compete most intensively.

preprint, 2015, arXiv

SU(N) Heisenberg model with multi-column representations

The $\mathrm{SU}(N)$ symmetric antiferromagnetic Heisenberg model with multi-column representations on the two-dimensional square lattice is investigated by quantum Monte Carlo simulations. For the representation of Young diagram with two columns, we confirm that a valence-bond solid order appears as soon as the Néel order disappears at $N = 10$, indicating no intermediate phase. In the case of the representation with three columns, there is no evidence for either Néel or valence-bond solid ordering for $N\ge 15$. This is actually consistent with the large-$N$ theory, which predicts that the VBS state immediately follows the Néel state, because the expected spontaneous order is too weak to be detected.

preprint, 2013, arXiv

Possibility of Deconfined Criticality in SU(N) Heisenberg Models at Small N

To examine the validity of the scenario of the deconfined critical phenomena, we carry out a quantum Monte Carlo simulation for the SU($N$) generalization of the Heisenberg model with four-body and six-body interactions. The quantum phase transition between the SU($N$) Néel and valence-bond solid phases is characterized for $N=2,3,$ and $4$ on the square and honeycomb lattices. While finite-size scaling analysis works well up to the maximum lattice size ($L=256$) and indicates the continuous nature of the phase transition, a clear systematic change towards the first-order transition is observed in the estimates of the critical exponent $y \equiv 1/\nu$ as the system size increases. We also confirm the relevance of a squared valence-bond solid field $\Psi^2$ for the SU(3) model.

preprint, 2012, arXiv

Correlated valence-bond states

We study generalizations of the singlet-sector amplitude-product (AP) states in the valence-bond basis of S=1/2 quantum spin systems. In the standard AP states, the weight of a tiling of the system into valence bonds (singlets of two spins) is a product of amplitudes depending on the length of the bonds. We here introduce correlated AP (CAP) states, in which the amplitude product is further multiplied by factors depending on two bonds connected to a pair of sites (here nearest neighbors). While the standard AP states can describe a phase transition between an antiferromagnetic (Neel) state and a valence-bond solid (VBS) in one dimension (which we also study here), in two dimensions it cannot describe VBS order. With the CAP states, Neel-VBS transitions are realized as a function of some parameter describing the bond correlations. We here study such phase transitions of CAP wave-functions on the square lattice. We find examples of direct first-order Neel-VBS transitions, as well as cases where there is an extended U(1) spin liquid phase intervening between the Neel and VBS states. In the latter case the transitions are continuous and we extract critical exponents and address the issue of a possible emergent U(1) symmetry in the near-critical VBS. We also consider variationally optimized CAP states for the standard Heisenberg model in one and two dimensions and the J-Q model in two dimensions, with the latter including four-spin interactions (Q) in addition to the Heisenberg exchange (J) and harboring VBS order for large Q/J. The optimized CAP states lead to significantly lower variational energies than the simple AP states for these models.

preprint, 2012, arXiv

Entanglement Spectra of the 2D AKLT Model: VBS/CFT Correspondence

We investigate the entanglement properties of the valence-bond-solid (VBS) state defined on two-dimensional lattices, which is the exact ground state of the Affleck-Kennedy-Lieb-Tasaki model. It is shown that the entanglement entropy obeys an area law and the non-universal prefactor of the leading term is strictly less than $\ln 2$. The analysis of entanglement spectra for various lattices reveals that the reduced density matrix associated with the VBS state is closely related to a thermal density matrix of a holographic spin chain, whose spectrum is reminiscent of that of the spin-1/2 Heisenberg chain. This correspondence is further supported by comparing the entanglement entropy in the holographic spin chain with conformal field theory predictions.

preprint, 2012, arXiv

Study of the Shastry Sutherland Model Using Multi-scale Entanglement Renormalization Ansatz

We performed variational calculations based on the multi-scale entanglement renormalization ansatz for the antiferromagnetic Heisenberg model on a Shastry Sutherland lattice (SSL). Our results show that at coupling ratio J'/J = 0.687(3), the system undergoes a quantum phase transition from the orthogonal dimer order to the plaquette valence bond solid phase, which then transitions into the antiferromagnetic order above J'/J = 0.75. In the presence of an external magnetic field, our calculations show clear evidence of various magnetic plateaux in systems with coupling ratios ranging from 0.5 to 0.69. Our calculations are not limited to the small coupling ratio region, and we are able to show strong evidence of the presence of several supersolid phases, including ones above the 1/2 and 1/3 plateaux. Such supersolid phases, which feature the coexistence of compressible superfluidity and crystalline long range order in triplet excitations, emerge at relatively large coupling ratio (J'/J > 0.5). A schematic phase diagram of the SSL model in the presence of a magnetic field is provided.

preprint, 2008, arXiv

Z4-U(1) crossover of the order parameter symmetry in a two-dimensional valence-bond-solid

We discuss ground-state projector simulations of a modified two-dimensional S=1/2 Heisenberg model in the valence-bond basis. Tuning matrix elements corresponding to the diagonal and off-diagonal terms in the quantum dimer model, we show that there is a quantum phase transition from the antiferromagnet into a columnar valence-bond-solid (VBS). There are no signs of discontinuities, suggesting a continuous or very weakly first-order transition. The Z4-symmetric VBS order parameter exhibits an emergent U(1) symmetry as the phase transition is approached. We extract the associated length-scale governing the U(1)-Z4 cross-over inside the VBS phase.

preprint, 2007, arXiv

Emergence of U(1) symmetry in the 3D XY model with Zq anisotropy

We study the three-dimensional XY model with a Z_q anisotropic term. At temperatures T < Tc this dangerously irrelevant perturbation is relevant only above a length scale Lambda, which diverges as a power of the correlation length; Lambda ~ xi^a_q. Below Lambda the order parameter is U(1) symmetric. We derive the full scaling function controlling the emergence of U(1) symmetry and use Monte Carlo results to extract the exponent a_q for q=4,...,8. We find that a_q = a_4 (q/4)^2, with a_4 only marginally larger than 1. We discuss these results in the context of U(1) symmetry at "deconfined" quantum critical points separating antiferromagnetic and valence-bond-solid states in quantum spin systems.