Researcher profile

Xiaojun Wu

Xiaojun Wu contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
23 works
0 followers
13 topics
4 close collaborators

Published work

23 published item(s)

preprint | 2026 | arXiv

DataArc-SynData-Toolkit: A Unified Closed-Loop Framework for Multi-Path, Multimodal, and Multilingual Data Synthesis

Synthetic data has emerged as a crucial solution to the data scarcity bottleneck in large language models (LLMs), particularly for specialized domains and low-resource languages. However, the broader adoption of existing synthetic data tools is severely hindered by convoluted workflows, fragmented data standards, and limited scalability across modalities. To address these limitations, we develop DataArc-SynData-Toolkit, an open-source framework featuring: (1) a configuration-driven, end-to-end pipeline equipped with an intuitive visual interface and simplified CLI for exceptional usability; (2) a unified, quality-controllable synthesis paradigm that standardizes multi-source data generation to ensure high reusability; and (3) a highly modular architecture designed for seamless multimodal, multilingual, and multi-task adaptation. We apply the toolkit in multiple application scenarios. Experimental results demonstrate that our toolkit achieves an optimal balance between generation efficiency and data quality. By offering an end-to-end and visually interactive pipeline, DataArc-SynData-Toolkit significantly lowers the technical barrier to synthetic data generation and subsequent model training, accelerating its practical deployment in real-world applications.

preprint | 2026 | arXiv

Duality between Bott-Chern and Aeppli Cohomology on Non-Compact Complex Manifolds

In this paper we establish duality theorems relating Bott-Chern and Aeppli cohomology, both with and without compact support, on non-compact complex manifolds under suitable pseudoconvexity assumptions. In particular, on Stein manifolds we obtain a full Bott-Chern-Aeppli duality extending Serre duality for Dolbeault cohomology. We also show that these results fail in general without pseudoconvexity assumptions by constructing explicit counterexamples on non-compact complex surfaces.

preprint | 2026 | arXiv

Riemannian Networks over Full-Rank Correlation Matrices

Representations on the Symmetric Positive Definite (SPD) manifold have garnered significant attention across different applications. In contrast, the manifold of full-rank correlation matrices, a normalized alternative to SPD matrices, remains largely underexplored. This paper introduces Riemannian networks over the correlation manifold, leveraging five recently developed correlation geometries. We systematically extend basic layers, including Multinomial Logistic Regression (MLR), Fully Connected (FC), and convolutional layers, to these geometries. Besides, we present methods for accurate backpropagation for two correlation geometries. Experiments comparing our approach against existing SPD and Grassmannian networks demonstrate its effectiveness.
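As a small illustration of what "a normalized alternative to SPD matrices" means, the sketch below rescales an SPD matrix S to the correlation matrix C = D^{-1/2} S D^{-1/2}, where D is the diagonal of S. This is the standard covariance-to-correlation map, not the paper's network layers or geometries; the function name and the example matrix are assumptions made for illustration.

```python
import math

def spd_to_correlation(s):
    """Rescale an SPD matrix (list of lists) to unit diagonal.

    Implements C[i][j] = S[i][j] / sqrt(S[i][i] * S[j][j]).
    Illustrative helper only; not taken from the paper.
    """
    n = len(s)
    scale = [math.sqrt(s[i][i]) for i in range(n)]
    return [[s[i][j] / (scale[i] * scale[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical 3x3 SPD matrix (all leading principal minors are positive).
S = [[4.0, 2.0, 0.4],
     [2.0, 9.0, 1.5],
     [0.4, 1.5, 1.0]]

C = spd_to_correlation(S)
for row in C:
    print([round(x, 4) for x in row])
```

The result has ones on the diagonal and off-diagonal entries strictly between -1 and 1, so it lies on the manifold of full-rank correlation matrices that the abstract refers to.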

preprint | 2026 | arXiv

Think-on-Graph 3.0: Efficient and Adaptive LLM Reasoning on Heterogeneous Graphs via Multi-Agent Dual-Evolving Context Retrieval

Graph-based Retrieval-Augmented Generation (GraphRAG) has become an important paradigm for enhancing Large Language Models (LLMs) with external knowledge. However, existing approaches are constrained by their reliance on high-quality knowledge graphs: manually built ones are not scalable, while automatically extracted ones are limited by the performance of LLM extractors, especially when using smaller, locally deployed models. To address this, we introduce Think-on-Graph 3.0 (ToG-3), a novel framework featuring a Multi-Agent Context Evolution and Retrieval (MACER) mechanism. Its core contribution is the dynamic construction and iterative refinement of a Chunk-Triplets-Community heterogeneous graph index, powered by a Dual-Evolution process that adaptively evolves both the query and the retrieved sub-graph during reasoning. ToG-3 dynamically builds a targeted graph index tailored to the query, enabling precise evidence retrieval and reasoning even with lightweight LLMs. Extensive experiments demonstrate that ToG-3 outperforms the compared baselines on both deep and broad reasoning benchmarks, and ablation studies confirm the efficacy of the components of the MACER framework. The source code is available at https://github.com/DataArcTech/ToG-3.

preprint | 2023 | arXiv

BusReF: Infrared-Visible images registration and fusion focus on reconstructible area using one set of features

In a scenario where multi-modal cameras are operating together, the problem of working with non-aligned images cannot be avoided. Yet, existing image fusion algorithms rely heavily on strictly registered input image pairs to produce more precise fusion results, as a way to improve the performance of downstream high-level vision tasks. In order to relax this assumption, one can attempt to register images first. However, the existing methods for registering multiple modalities have limitations, such as complex structures and reliance on significant semantic information. This paper aims to address the problem of image registration and fusion in a single framework, called BusReF. We focus on the Infrared-Visible image registration and fusion (IVRF) task. In this framework, the input unaligned image pairs pass through three stages: coarse registration, fine registration and fusion. It will be shown that the unified approach enables more robust IVRF. We also propose a novel training and evaluation strategy, involving the use of masks to reduce the influence of non-reconstructible regions on the loss functions, which greatly improves the accuracy and robustness of the fusion task. Last but not least, a gradient-aware fusion network is designed to preserve the complementary information. The advanced performance of this algorithm is demonstrated by

preprint | 2022 | arXiv

Interfacial Charge-transfer Excitonic Insulator in a Two-dimensional Organic-inorganic Superlattice

Excitonic insulators are long-sought-after quantum materials predicted to spontaneously open a gap by the Bose condensation of bound electron-hole pairs, namely, excitons, in their ground state. Since the theoretical conjecture, extensive efforts have been devoted to pursuing excitonic insulator platforms for exploring macroscopic quantum phenomena in real materials. Reliable evidence of excitonic character has been obtained in layered chalcogenides as promising candidates. However, owing to the interference of intrinsic lattice instabilities, it is still debatable whether those features, such as charge density wave and gap opening, are primarily driven by the excitonic effect or by the lattice transition. Herein, we develop a novel charge-transfer excitonic insulator in organic-inorganic superlattice interfaces, which serves as an ideal platform to decouple the excitonic effect from the lattice effect. In this system, we observe the narrow gap opening and the formation of a charge density wave without periodic lattice distortion, providing visualized evidence of exciton condensation occurring in thermal equilibrium. Our findings identify spontaneous interfacial charge transfer as a new strategy for developing novel excitonic insulators and investigating their correlated many-body physics.

preprint | 2022 | arXiv

PPT Fusion: Pyramid Patch Transformer for a Case Study in Image Fusion

The Transformer architecture has witnessed a rapid development in recent years, outperforming the CNN architectures in many computer vision tasks, as exemplified by the Vision Transformers (ViT) for image classification. However, existing visual transformer models aim to extract semantic information for high-level tasks, such as classification and detection. These methods ignore the importance of the spatial resolution of the input image, thus sacrificing the local correlation information of neighboring pixels. In this paper, we propose a Patch Pyramid Transformer (PPT) to effectively address the above issues. Specifically, we first design a Patch Transformer to transform the image into a sequence of patches, where transformer encoding is performed for each patch to extract local representations. In addition, we construct a Pyramid Transformer to effectively extract the non-local information from the entire image. After obtaining a set of multi-scale, multi-dimensional, and multi-angle features of the original image, we design the image reconstruction network to ensure that the features can be reconstructed into the original input. To validate the effectiveness, we apply the proposed Patch Pyramid Transformer to image fusion tasks. The experimental results demonstrate its superior performance, compared to the state-of-the-art fusion approaches, achieving the best results on several evaluation indicators. Thanks to the underlying representational capacity of the PPT network, it can directly be applied to different image fusion tasks without redesigning or retraining the network.

preprint | 2022 | arXiv

Pseudo-effective and numerically flat reflexive sheaves

In this note, we discuss the concept of pseudoeffective vector bundle and also introduce pseudoeffective torsion-free sheaves over compact Kähler manifolds. We show that a pseudoeffective reflexive sheaf over a compact Kähler manifold with vanishing first Chern class is in fact a numerically flat vector bundle. A proof is obtained through a natural construction of positive currents representing the Segre classes of pseudoeffective vector bundles.

preprint | 2022 | arXiv

Remark on nefness in higher codimension

In this work, following the fundamental work of Boucksom, we construct the nef cone of a compact complex manifold in higher codimension and give explicit examples where these cones are different. In the third section, we give two versions of Kawamata-Viehweg vanishing theorems in terms of nefness in higher codimension and numerical dimensions. We also show by examples the optimality of the divisorial Zariski decomposition defined by Boucksom.

preprint | 2022 | arXiv

Ultra-High Lithium Storage Capacity of Al2C Monolayer under Restricted Multilayered Growth Mechanism

Designing anode materials with high lithium specific capacity is crucial to the development of high energy-density lithium ion batteries. Herein, a distinctive lithium growth mechanism, namely, the restricted multilayered growth for lithium, and a strategy for lithium storage are proposed to achieve the balance between the ultra-high specific capacity and the need to avert uncontrolled dendritic growth of lithium. In particular, based on first-principles computation, we show that the Al2C monolayer with planar tetracoordinate carbon structure can be an ideal platform for realizing the restricted multilayered growth mechanism as a 2D anode material. Furthermore, the Al2C monolayer exhibits an ultra-high lithium specific capacity of 4059 mAh/g, yet with a low diffusion barrier of 0.039-0.17 eV as well as a low open circuit voltage in the range of 0.002-0.34 V. These novel properties make the Al2C monolayer a promising anode material for future lithium ion batteries. Our study offers a new way to design promising 2D anode materials with high specific capacity, fast lithium-ion diffusion, and a safe lithium storage mechanism.
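To put the quoted 4059 mAh/g in context, the sketch below applies the standard gravimetric-capacity relation C = nF / (3.6 M), with F the Faraday constant and M the molar mass of the host formula unit. The assumptions here are mine, not the abstract's: that the capacity is normalized to the mass of Al2C alone, and that each stored Li transfers one electron. The inferred number of Li atoms per formula unit is therefore an estimate, not a figure reported by the authors.

```python
FARADAY = 96485.332        # Faraday constant, C/mol
M_AL, M_C = 26.982, 12.011  # standard atomic masses, g/mol

def specific_capacity_mah_per_g(n_li, molar_mass):
    """Gravimetric capacity for n_li one-electron Li per formula unit.

    The factor 3.6 converts coulombs to mAh (1 mAh = 3.6 C).
    """
    return n_li * FARADAY / (3.6 * molar_mass)

def li_per_formula_unit(capacity_mah_per_g, molar_mass):
    """Invert the relation above to estimate Li atoms per formula unit."""
    return capacity_mah_per_g * 3.6 * molar_mass / FARADAY

m_al2c = 2 * M_AL + M_C  # molar mass of one Al2C formula unit, g/mol
n_est = li_per_formula_unit(4059.0, m_al2c)
cap_10 = specific_capacity_mah_per_g(10, m_al2c)

print(f"M(Al2C) = {m_al2c:.3f} g/mol")
print(f"estimated Li per Al2C: {n_est:.2f}")
print(f"capacity for 10 Li per Al2C: {cap_10:.0f} mAh/g")
```

Under these assumptions the quoted capacity corresponds to roughly ten Li atoms per Al2C formula unit, which is consistent with the multilayered storage picture the abstract describes.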

preprint | 2021 | arXiv

Mechanical Properties of Atomically Thin Tungsten Dichalcogenides: WS$_2$, WSe$_2$ and WTe$_2$

Two-dimensional (2D) tungsten disulfide (WS$_2$), tungsten diselenide (WSe$_2$), and tungsten ditelluride (WTe$_2$) draw increasing attention due to their attractive properties deriving from the heavy tungsten and chalcogenide atoms, but their mechanical properties are still mostly unknown. Here, we determine the intrinsic and air-aged mechanical properties of mono-, bi-, and trilayer (1-3L) WS$_2$, WSe$_2$ and WTe$_2$ using a complementary suite of experiments and theoretical calculations. High-quality 1L WS$_2$ has the highest Young's modulus (302.4 $\pm$ 24.1 GPa) and strength (47.0 $\pm$ 8.6 GPa) of the entire family, surpassing those of 1L WSe$_2$ (258.6 $\pm$ 38.3 and 38.0 $\pm$ 6.0 GPa, respectively) and WTe$_2$ (149.1 $\pm$ 9.4 and 6.4 $\pm$ 3.3 GPa, respectively). However, the elasticity and strength of WS$_2$ decrease most dramatically with increased thickness among the three materials. We interpret the phenomenon by the different tendencies for interlayer sliding in equilibrium state and under in-plane strain and out-of-plane compression conditions in the indentation process, revealed by finite element method (FEM) and density functional theory (DFT) calculations including van der Waals (vdW) interactions. We also demonstrate that the mechanical properties of the high-quality 1-3L WS$_2$ and WSe$_2$ are largely stable in the air for up to 20 weeks. Intriguingly, the 1-3L WSe$_2$ shows increased modulus and strength values with aging in the air. This is ascribed to oxygen doping, which reinforces the structure. The present study will facilitate the design and use of 2D tungsten dichalcogenides in applications, such as strain engineering and flexible field-effect transistors (FETs).

preprint | 2021 | arXiv

Two-Dimensional Bipolar Magnetic Semiconductor with High Curie Temperature and Electrically Controllable Spin Polarization Realized in Exfoliated Cr(pyrazine)$_2$ Monolayer

Exploring two-dimensional (2D) magnetic semiconductors with room temperature magnetic ordering and electrically controllable spin polarization is a highly desirable but challenging task for nanospintronics. Here, through first-principles calculations, we propose to realize such a material by exfoliating the recently synthesized organometallic layered crystal Li$_{0.7}$[Cr(pyz)$_2$]Cl$_{0.7}$$\cdot$0.25(THF) (pyz = pyrazine, THF = tetrahydrofuran) [Science 370, 587 (2020)]. The feasibility of exfoliation is confirmed by the rather low exfoliation energy of 0.27 J/m$^2$, even smaller than that of graphite. In exfoliated Cr(pyz)$_2$ monolayer, each pyrazine ring grabs one electron from the Cr atom to become a radical anion, then a strong $d$-$p$ direct exchange magnetic interaction emerges between Cr cations and pyrazine radicals, resulting in room temperature ferrimagnetism with a Curie temperature of 342 K. Moreover, Cr(pyz)$_2$ monolayer is revealed to be an intrinsic bipolar magnetic semiconductor where electrical doping can induce half-metallic conduction with controllable spin-polarization direction.

preprint | 2020 | arXiv

1.4-mJ High Energy Terahertz Radiation from Lithium Niobates

Free-space super-strong terahertz (THz) electromagnetic fields offer multifaceted capabilities for reaching extreme nonlinear THz optics, accelerating and manipulating charged particles, and realizing other fascinating applications. However, the lack of powerful solid-state THz sources with single pulse energy >1 mJ is impeding the proliferation of extreme THz applications. The fundamental challenge is that high efficiency is hard to achieve because of crystal damage caused by high-intensity pumping, the short effective interaction length induced by linear absorption and nonlinear distortion, and so on. Here, through cryogenically cooling the crystals, delicately tailoring the pump laser spectra, chirping the pump pulses, and magnifying the laser energies, we realized for the first time the generation of 1.4-mJ THz pulses from lithium niobates under the excitation of 214-mJ femtosecond laser pulses via the tilted pulse front technique. The 800 nm-to-THz energy conversion efficiency reached 0.7%, and the free-space THz peak electric and magnetic fields reached 6.3 MV/cm and 2.1 Tesla, respectively. Our numerical simulations based on a frequency-domain second-order nonlinear wave equation under the slowly varying envelope approximation reproduced the experimental optimization processes. To show the capability of this super-strong THz source, nonlinear absorption due to the field-induced intervalley scattering effect in highly conductive silicon induced by a strong THz electric field was demonstrated. Such a high energy THz source with a relatively low peak frequency is very appropriate not only for electron acceleration towards table-top X-ray sources but also for extreme THz science and nonlinear applications.
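The headline numbers in this abstract can be cross-checked with two lines of arithmetic: the conversion efficiency is the ratio of THz pulse energy to pump pulse energy, and for a free-space plane wave the peak magnetic field follows from the peak electric field as B = E / c. The sketch below is only a consistency check under that plane-wave assumption; the function names are illustrative and nothing here comes from the authors' simulation code.

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s (exact SI value)

def conversion_efficiency(thz_energy_mj, pump_energy_mj):
    """Energy conversion efficiency as a plain ratio (not in percent)."""
    return thz_energy_mj / pump_energy_mj

def magnetic_field_tesla(e_field_mv_per_cm):
    """Peak B field of a free-space plane wave, B = E / c.

    Converts MV/cm to V/m first (1 MV/cm = 1e8 V/m).
    """
    e_field_v_per_m = e_field_mv_per_cm * 1e6 * 100.0
    return e_field_v_per_m / C_LIGHT

efficiency = conversion_efficiency(1.4, 214.0)
b_peak = magnetic_field_tesla(6.3)

print(f"conversion efficiency: {efficiency * 100:.2f} %")
print(f"peak magnetic field:   {b_peak:.2f} T")
```

The ratio comes out at about 0.65%, which rounds to the quoted 0.7%, and 6.3 MV/cm corresponds to about 2.1 T, matching the stated magnetic field.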

preprint | 2020 | arXiv

Generation and manipulation of chiral terahertz waves emitted from the three-dimensional topological insulator Bi2Te3

Arbitrary manipulation of broadband terahertz waves with flexible polarization shaping at the source has great potential in expanding real applications such as imaging, information encryption, and all-optically coherent control of terahertz nonlinear phenomena. Topological insulators featuring unique spin-momentum locked surface states have already exhibited very promising prospects in terahertz emission, detection and modulation, which may lay a foundation for future on-chip topological insulator-based terahertz systems. However, polarization shaped terahertz emission with prescribed manners of arbitrarily manipulated temporal evolution of the amplitude and electric-field vector direction based on topological insulators has not yet been explored. Here we systematically investigated the terahertz radiation from topological insulator Bi2Te3 nanofilms driven by femtosecond laser pulses, and successfully realized the generation of efficient chiral terahertz waves with controllable chirality, ellipticity, and principal axis. The convenient engineering of the chiral terahertz waves was interpreted by photogalvanic effect induced photocurrent, while the linearly polarized terahertz waves originated from linear photogalvanic effect induced shift currents. We believe our work not only helps further the understanding of femtosecond coherent control of ultrafast spin currents in light-matter interaction but also provides an effective way to generate spin-polarized terahertz waves and accelerate the proliferation of twisting the terahertz waves at the source.

preprint | 2020 | arXiv

On the hard Lefschetz theorem for pseudoeffective line bundles

In this note, we obtain a number of results related to the hard Lefschetz theorem for pseudoeffective line bundles, due to Demailly, Peternell and Schneider. Our first result states that the holomorphic sections produced by the theorem are in fact parallel, when viewed as currents with respect to the singular Chern connection associated with the metric. Our proof is based on a control of the covariant derivative in the approximation process used in the construction of the section. Then we show that we have an isomorphism between such parallel sections and higher degree cohomology. As an application, we show that the closedness of such sections induces a linear subspace structure on the tangent bundle. Finally, we discuss some questions related to the optimality of the hard Lefschetz theorem.

preprint | 2020 | arXiv

Peeking into occluded joints: A novel framework for crowd pose estimation

Although occlusion widely exists in nature and remains a fundamental challenge for pose estimation, existing heatmap-based approaches suffer serious degradation on occlusions. Their intrinsic problem is that they directly localize the joints based on visual information; however, the invisible joints lack such information. In contrast to localization, our framework estimates the invisible joints from an inference perspective by proposing an Image-Guided Progressive GCN module which provides a comprehensive understanding of both image context and pose structure. Moreover, existing benchmarks contain limited occlusions for evaluation. Therefore, we thoroughly pursue this problem and propose a novel OPEC-Net framework together with a new Occluded Pose (OCPose) dataset with 9k annotated images. Extensive quantitative and qualitative evaluations on benchmarks demonstrate that OPEC-Net achieves significant improvements over recent leading works. Notably, our OCPose is the most complex occlusion dataset with respect to average IoU between adjacent instances. Source code and OCPose will be publicly available.

preprint | 2020 | arXiv

Terahertz Strong-Field Physics in Light-Emitting Diodes for Terahertz Detection and Imaging

Intense terahertz (THz) electromagnetic fields have been utilized to reveal a variety of extremely nonlinear optical effects in many materials through nonperturbative driving of elementary and collective excitations. However, such nonlinear photoresponses have not yet been discovered in light-emitting diodes (LEDs), let alone employed to build fast, cost-effective, compact, and room-temperature-operating THz detectors and cameras. Here we report that ubiquitously available LEDs exhibit gigantic and fast photovoltaic signals with excellent signal-to-noise ratios when illuminated by THz field strengths >50 kV/cm. We also successfully demonstrated THz-LED detectors and camera prototypes. These unorthodox THz detectors exhibited high responsivities (>1 kV/W) with response times shorter than those of pyroelectric detectors by four orders of magnitude. The detection mechanism was attributed to THz-field-induced nonlinear impact ionization and Schottky contact. These findings not only help deepen our understanding of strong THz field-matter interactions but also greatly contribute to the applications of strong-field THz diagnosis.

preprint | 2019 | arXiv

Tunable High-Quality Fano Resonance in Coupled Terahertz Whispering-Gallery-Mode Resonators

Fano resonance is widely discussed in designing novel terahertz components, such as sensors, filters, modulators, and group delay modules. Usually, a high quality (Q) factor and flexible tunability of the Fano resonance are key requirements for these applications. Here, we present a tunable terahertz Fano resonance with a Q factor of 2095 at 0.439 THz in coupled terahertz whispering-gallery-mode resonators (WGMRs). Coupling between a relatively low Q (578) quartz ring and a high Q (2095) silicon ring is employed to generate the Fano resonance. The resonant frequency of the Fano resonance can be actively manipulated by tuning the resonant frequency of the high Q WGMR, which is achieved through an electrical thermo-optic tuning method; meanwhile, the resonance intensity of the Fano resonance can be engineered by adjusting the coupling strength between the two WGMRs. This coupled-WGMR scheme delivers a high Q tunable Fano resonance and may contribute to the design of high-performance configurable terahertz devices.
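For a sense of scale, the quoted Q factor can be turned into a linewidth using the common definition Q = f0 / Δf, where Δf is the full width at half maximum of the resonance. The abstract does not state which linewidth definition was used to extract Q, so treating it as FWHM is an assumption, and the helper name below is illustrative.

```python
def linewidth_hz(center_frequency_hz, q_factor):
    """Resonance linewidth from Q = f0 / delta_f (FWHM assumed)."""
    return center_frequency_hz / q_factor

F0 = 0.439e12  # Fano resonance frequency from the abstract, Hz
Q_FANO = 2095  # Q factor of the Fano resonance from the abstract

delta_f = linewidth_hz(F0, Q_FANO)
print(f"linewidth: {delta_f / 1e6:.1f} MHz")
```

Under this definition, a Q of 2095 at 0.439 THz corresponds to a linewidth of roughly 210 MHz.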

preprint | 2018 | arXiv

A New Two-Dimensional Functional Material with Desirable Bandgap and Ultrahigh Carrier Mobility

Two-dimensional (2D) semiconductors with direct and modest bandgap and ultrahigh carrier mobility are highly desired functional materials for nanoelectronic applications. Herein, we predict that monolayer CaP3 is a new 2D functional material that possesses not only a direct bandgap of 1.15 eV (based on HSE06 computation) but also a very high electron mobility of up to 19930 cm$^2$ V$^{-1}$ s$^{-1}$, comparable to that of monolayer phosphorene. More remarkably, contrary to bilayer phosphorene, which possesses dramatically reduced carrier mobility compared to its monolayer counterpart, the CaP3 bilayer possesses even higher electron mobility (22380 cm$^2$ V$^{-1}$ s$^{-1}$) than its monolayer counterpart. The bandgap of 2D CaP3 can be tuned over a wide range from 1.15 to 0.37 eV (HSE06 values) by controlling the number of stacked CaP3 layers. Besides novel electronic properties, 2D CaP3 also exhibits optical absorption over the entire visible-light range. The combined novel electronic, charge mobility, and optical properties render 2D CaP3 an exciting functional material for future nanoelectronic and optoelectronic applications.