Researcher profile

Pengfei Zhou

Pengfei Zhou contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
12 works
0 followers
13 topics
4 close collaborators

Published work

12 published items

preprint · 2026 · arXiv

Neural-Driven Image Editing

Traditional image editing typically relies on manual prompting, making it labor-intensive and inaccessible to individuals with limited motor control or language abilities. Leveraging recent advances in brain-computer interfaces (BCIs) and generative models, we propose LoongX, a hands-free image editing approach driven by multimodal neurophysiological signals. LoongX utilizes state-of-the-art diffusion models trained on a comprehensive dataset of 23,928 image editing pairs, each paired with synchronized electroencephalography (EEG), functional near-infrared spectroscopy (fNIRS), photoplethysmography (PPG), and head motion signals that capture user intent. To effectively address the heterogeneity of these signals, LoongX integrates two key modules. The cross-scale state space (CS3) module encodes informative modality-specific features. The dynamic gated fusion (DGF) module further aggregates these features into a unified latent space, which is then aligned with edit semantics via fine-tuning on a diffusion transformer (DiT). Additionally, we pre-train the encoders using contrastive learning to align cognitive states with semantic intentions from embedded natural language. Extensive experiments demonstrate that LoongX achieves performance comparable to text-driven methods (CLIP-I: 0.6605 vs. 0.6558; DINO: 0.4812 vs. 0.4636) and outperforms them when neural signals are combined with speech (CLIP-T: 0.2588 vs. 0.2549). These results highlight the promise of neural-driven generative models in enabling accessible, intuitive image editing and open new directions for cognitive-driven creative technologies. The code and dataset are released on the project website: https://loongx1.github.io.

preprint · 2026 · arXiv

TMD-Bench: A Multi-Level Evaluation Paradigm for Music-Dance Co-Generation

Unified audio-visual generation is rapidly gaining industrial and creative relevance, enabling applications in virtual production and interactive media. However, when moving from general audio-video synthesis to music-dance co-generation, the task becomes substantially harder: musical rhythm, phrasing, and accents must drive choreographic motion at fine temporal resolution, and such rhythmic coupling is not captured by unimodal metrics or generic audiovisual consistency scores used in current evaluation practice. We introduce TMD-Bench, a benchmark for text-driven music-dance co-generation that assesses systems across unimodal generation quality, instruction adherence, and cross-modal rhythmic alignment. The benchmark integrates computable physical metrics with perceptual multimodal judgments, and is supported by a curated rhythm-aligned music-dance dataset and a fine-grained Music Captioner for structured music semantics. TMD-Bench further reveals that (i) modern commercial audio-visual models, such as Veo 3 and Sora 2, produce high-quality music and video, while rhythmic coupling remains less consistently optimized and leaves room for improvement, and (ii) our unified baseline RhyJAM trained on rhythm-aligned data achieves competitive beat-level synchronization while maintaining competitive unimodal fidelity. This presents prospects for building next-generation music-dance models that explicitly optimize rhythmic and kinetic coherence.

preprint · 2022 · arXiv

ISDA: Position-Aware Instance Segmentation with Deformable Attention

Most instance segmentation models are not end-to-end trainable because they incorporate either proposal estimation (RPN) as a pre-processing step or non-maximum suppression (NMS) as a post-processing step. Here we propose a novel end-to-end instance segmentation method termed ISDA. It reshapes the task into predicting a set of object masks, which are generated via a traditional convolution operation with learned position-aware kernels and features of objects. Such kernels and features are learned by leveraging a deformable attention network with multi-scale representation. Thanks to the introduced set-prediction mechanism, the proposed method is NMS-free. Empirically, ISDA outperforms Mask R-CNN (the strong baseline) by 2.6 points on MS-COCO, and achieves leading performance compared with recent models. Code will be available soon.

preprint · 2022 · arXiv

Temperature-dependent structure of an intermetallic ErPd$_2$Si$_2$ single crystal: A combined synchrotron and in-house X-ray diffraction study

We have grown intermetallic ErPd$_2$Si$_2$ single crystals employing laser diodes with the floating-zone method. The temperature-dependent crystallography was determined using synchrotron and in-house X-ray powder diffraction measurements from 20 to 500 K. The diffraction patterns fit well with the tetragonal $I$4/$mmm$ space group (No. 139) with two formula units per unit cell. Our synchrotron X-ray powder diffraction study shows that the refined lattice constants are $a$ = 4.10320(2) Å, $c$ = 9.88393(5) Å at 298 K and $a$ = 4.11737(2) Å, $c$ = 9.88143(5) Å at 500 K, resulting in the unit-cell volume $V$ = 166.408(1) Å$^3$ (298 K) and 167.517(2) Å$^3$ (500 K). In the whole studied temperature range, we did not find any structural phase transition. Upon cooling, the lattice constants $a$ and $c$ are shortened and elongated, respectively.
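As a consistency check on the figures quoted above: for a tetragonal cell the volume is V = a²c. The short sketch below recomputes the volume from the quoted lattice constants. The function name is illustrative and does not come from the paper, and the quoted uncertainties are ignored.

```python
def tetragonal_volume(a: float, c: float) -> float:
    """Unit-cell volume (in Å^3) of a tetragonal cell with lattice constants a and c (in Å)."""
    return a * a * c


# Lattice constants quoted in the abstract (uncertainties omitted).
v_298 = tetragonal_volume(4.10320, 9.88393)
v_500 = tetragonal_volume(4.11737, 9.88143)

print(f"V(298 K) = {v_298:.3f} Å^3")
print(f"V(500 K) = {v_500:.3f} Å^3")
```

Both results agree with the quoted volumes, 166.408 Å$^3$ and 167.517 Å$^3$, to three decimal places.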

preprint · 2021 · arXiv

Classical Sampling of Random Quantum Circuits with Bounded Fidelity

Random circuit sampling has become a popular means for demonstrating the superiority of quantum computers over classical supercomputers. While quantum chips are evolving rapidly, classical sampling algorithms are also steadily improving. The major challenge is to generate bitstrings exhibiting an XEB fidelity above that of the quantum chips. Here we present a classical sampling algorithm for producing the probability distribution of any given random quantum circuit, where the fidelity can be rigorously bounded. Specifically, our algorithm performs rejection sampling after the very recently introduced multi-tensor contraction algorithm. We show that the fidelity can be controlled by partially contracting the dominant paths in the tensor network and by adjusting the number of batches used in the rejection sampling. As a demonstration, we classically produced 1 million samples with the fidelity bounded by 0.2%, based on the 20-cycle circuit of the Sycamore 53-qubit quantum chip. Though this task was initially estimated to take 10,000 years on the Summit supercomputer, it took about 14.5 days using our algorithm on a relatively small cluster with 32 GPUs (Tesla V100 16GB). Furthermore, we estimate that for the Zuchongzhi 56-qubit 20-cycle circuit one can produce 1M samples with fidelity 0.066% using the Selene supercomputer with 4480 GPUs (Tesla A100 80GB) in about 4 days.
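For readers unfamiliar with the rejection sampling step mentioned above, here is a minimal textbook sketch over a small, explicitly known distribution. It is not the paper's batched tensor-contraction pipeline: the function name, the uniform proposal, and the toy probabilities are all assumptions made for illustration.

```python
import random


def rejection_sample(probs, n_samples, rng=None):
    """Draw indices distributed according to `probs` using a uniform proposal.

    A candidate index i is proposed uniformly and accepted with
    probability probs[i] / max(probs).
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    p_max = max(probs)
    if p_max <= 0:
        raise ValueError("at least one probability must be positive")
    samples = []
    while len(samples) < n_samples:
        i = rng.randrange(len(probs))          # propose uniformly
        if rng.random() * p_max < probs[i]:    # accept with prob. probs[i] / p_max
            samples.append(i)
    return samples


# Toy distribution over four "bitstrings" (hypothetical values).
toy_probs = [0.5, 0.0, 0.25, 0.25]
toy_samples = rejection_sample(toy_probs, 1000)
```

The acceptance rate of this scheme is 1 / (N · max(probs)) for N outcomes, so it is efficient only when no outcome is far more probable than the uniform value.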

preprint · 2021 · arXiv

Dynamic Virtual Graph Significance Networks for Predicting Influenza

Graph-structured data and their related algorithms have attracted significant attention in many fields, such as influenza prediction in public health. However, variable influenza seasonality, occasional pandemics, and the need for domain knowledge make it challenging to construct an appropriate graph, which can impair the ability of currently popular graph-based algorithms to analyze the data. In this study, we develop a novel method, Dynamic Virtual Graph Significance Networks (DVGSN), which can learn, in a supervised and dynamic manner, from similar "infection situations" at historical timepoints. Representation learning on the dynamic virtual graph can tackle the varied seasonality and pandemics, and therefore improve the performance. Extensive experiments on real-world influenza data demonstrate that DVGSN significantly outperforms the current state-of-the-art methods. To the best of our knowledge, this is the first attempt to learn a dynamic virtual graph in a supervised manner for time-series prediction tasks. Moreover, the proposed method needs less domain knowledge to build a graph in advance and has rich interpretability, which makes the method more acceptable in the fields of public health, life sciences, and so on.

preprint · 2020 · arXiv

Colossal Negative Magnetoresistance Effect in A La$_{1.37}$Sr$_{1.63}$Mn$_2$O$_7$ Single Crystal Grown by Laser-Diode-Heated Floating-Zone Technique

We have grown La$_{1.37}$Sr$_{1.63}$Mn$_2$O$_7$ single crystals with a laser-diode-heated floating-zone furnace and studied the crystallinity, structure, and magnetoresistance (MR) effect by in-house X-ray Laue diffraction, X-ray powder diffraction, and resistance measurements. The La$_{1.37}$Sr$_{1.63}$Mn$_2$O$_7$ single crystal crystallizes into a tetragonal structure with space group \emph{I}4{/}\emph{mmm} at room temperature. At 0 T, the maximum resistance centers around $\sim$166.9 K. Below $\sim$35.8 K, it displays an insulating character with an increase in resistance upon cooling. An applied magnetic field of \emph{B}~=~7~T strongly suppresses the resistance, indicative of a negative MR effect. The minimum MR value equals $-$91.23\% at 7 T and 128.7 K. The magnetic-field-dependent resistance shows distinct features at 1.67, 140, and 322 K, from which we calculated the corresponding MR values. At 14 T and 140 K, the colossal negative MR value is down to $-$94.04(5)\%. We schematically fit the MR values with different models for an ideal description of the interesting features of the MR value versus \emph{B} curves.
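The abstract does not state how the MR value is defined. Assuming the common convention MR = [R(B) − R(0)] / R(0) × 100%, which is consistent with the quoted values staying above −100%, the sketch below converts between an MR percentage and the ratio R(B)/R(0). The function names are illustrative, and the convention is an assumption rather than the paper's stated definition.

```python
def magnetoresistance_percent(r_field: float, r_zero: float) -> float:
    """MR in percent under the assumed convention (R(B) - R(0)) / R(0) * 100."""
    if r_zero == 0:
        raise ValueError("zero-field resistance must be non-zero")
    return (r_field - r_zero) / r_zero * 100.0


def resistance_ratio(mr_percent: float) -> float:
    """Ratio R(B) / R(0) implied by an MR value under the same assumed convention."""
    return 1.0 + mr_percent / 100.0


# MR values quoted in the abstract.
ratio_7t = resistance_ratio(-91.23)    # 7 T, 128.7 K
ratio_14t = resistance_ratio(-94.04)   # 14 T, 140 K
print(f"R(7 T)/R(0)  = {ratio_7t:.4f}")
print(f"R(14 T)/R(0) = {ratio_14t:.4f}")
```

Under this assumed convention, an MR of −91.23% means the resistance at 7 T is roughly 8.8% of its zero-field value, and −94.04% corresponds to roughly 6.0%.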

preprint · 2020 · arXiv

Contracting Arbitrary Tensor Networks: General Approximate Algorithm and Applications in Graphical Models and Quantum Circuit Simulations

We present a general method for approximately contracting tensor networks with an arbitrary connectivity. This enables us to release the computational power of tensor networks for wide use in inference and learning problems defined on general graphs. We show applications of our algorithm in graphical models, specifically on estimating the free energy of spin glasses defined on various graphs, where our method largely outperforms existing algorithms including the mean-field methods and the recently proposed neural-network-based methods. We further apply our method to the simulation of random quantum circuits, and demonstrate that, with a trade-off of negligible truncation errors, our method is able to simulate large quantum circuits that are out of reach of the state-of-the-art simulation methods.

preprint · 2020 · arXiv

Crystalline and magnetic structures, magnetization, heat capacity and anisotropic magnetostriction effect in a yttrium-chromium oxide

We have studied a nearly stoichiometric insulating Y$_{0.97(2)}$Cr$_{0.98(2)}$O$_{3.00(2)}$ single crystal by performing measurements of magnetization, heat capacity, and neutron diffraction. Although the YCrO$_3$ compound behaves like a soft ferromagnet with a coercive force of $\sim$ 0.05 T, there exist strong antiferromagnetic (AFM) interactions between Cr$^{3+}$ spins due to a strongly negative paramagnetic Curie-Weiss temperature, i.e., -433.2(6) K. The coexistence of ferromagnetism and antiferromagnetism may indicate a canted AFM structure. The AFM phase transition occurs at $T_\textrm{N} =$ 141.5(1) K, which increases to $T_\textrm{N}$(5T) = 144.5(1) K at 5 T. Within the accuracy of the present neutron-diffraction studies, we determine a G-type AFM structure with a propagation vector \textbf{k} = (1 1 0) and Cr$^{3+}$ spin directions along the crystallographic \emph{c} axis of the orthorhombic structure with space group \emph{Pnma} below $T_\textrm{N}$. At 12 K, the refined moment size is 2.45(6) $\mu_\textrm{B}$, $\sim$ 82\% of the theoretical saturation value 3 $\mu_\textrm{B}$. The Cr$^{3+}$ spin interactions are probably two-dimensional Ising like within the reciprocal (1 1 0) scattering plane. Below $T_\textrm{N}$, the lattice configuration (\emph{a}, \emph{b}, \emph{c}, and \emph{V}) deviates largely downward from the Grüneisen law, displaying an anisotropic magnetostriction effect and a magnetoelastic effect. In particular, the sample contraction upon cooling is enhanced below the AFM transition temperature. There is evidence to suggest that the actual crystalline symmetry of the YCrO$_3$ compound is probably lower than the currently assumed one. Additionally, we compared the $t_{2\textrm{g}}$ YCrO$_3$ and the $e_\textrm{g}$ La$_{7/8}$Sr$_{1/8}$MnO$_3$ single crystals for a further understanding of the reason for the possible symmetry lowering.

preprint · 2020 · arXiv

Solving Statistical Mechanics on Sparse Graphs with Feedback Set Variational Autoregressive Networks

We propose a method for solving statistical mechanics problems defined on sparse graphs. It extracts a small Feedback Vertex Set (FVS) from the sparse graph, converting the sparse system to a much smaller system with many-body and dense interactions with an effective energy on every configuration of the FVS, then learns a variational distribution parameterized using neural networks to approximate the original Boltzmann distribution. The method is able to estimate free energy, compute observables, and generate unbiased samples via direct sampling without auto-correlation. Extensive experiments show that our approach is more accurate than existing approaches for sparse spin glasses. On random graphs and real-world networks, our approach significantly outperforms the standard methods for sparse systems such as the belief-propagation algorithm; on structured sparse systems such as two-dimensional lattices our approach is significantly faster and more accurate than recently proposed variational autoregressive networks using convolution neural networks.
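The first step described above, extracting a feedback vertex set whose removal leaves a forest, can be illustrated with a simple greedy heuristic: prune nodes of degree at most one, then remove the highest-degree remaining node and repeat. This is only a sketch under stated assumptions. The abstract does not say which FVS algorithm the authors use, the function name is hypothetical, and the heuristic does not guarantee a minimum-size set.

```python
def greedy_fvs(adj):
    """Return a feedback vertex set of an undirected simple graph.

    `adj` maps each node to the set of its neighbours. The input is not
    modified. Removing the returned nodes leaves an acyclic graph (a forest).
    """
    g = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    fvs = []

    def remove(u):
        # Delete node u and all edges touching it.
        for v in g.pop(u):
            g[v].discard(u)

    while True:
        # Repeatedly prune nodes of degree <= 1; they cannot lie on a cycle.
        leaves = [u for u, vs in g.items() if len(vs) <= 1]
        while leaves:
            u = leaves.pop()
            if u not in g:
                continue
            nbrs = list(g[u])
            remove(u)
            leaves.extend(v for v in nbrs if len(g[v]) <= 1)
        if not g:
            return fvs
        # Every remaining node has degree >= 2: break cycles at the hub.
        hub = max(g, key=lambda u: len(g[u]))
        fvs.append(hub)
        remove(hub)


# A triangle (0, 1, 2) with a pendant node 3 attached to node 2.
example_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
example_fvs = greedy_fvs(example_graph)
```

For the example graph a single node from the triangle suffices. In the method described above, the neural network would then only need to model the variables in this set, since the remaining forest can be handled exactly.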

preprint · 2020 · arXiv

Super-Necking Crystal Growth and Structural and Magnetic Properties of SrTb$_2$O$_4$ Single Crystals

We report on single-crystal growth of the SrTb$_2$O$_4$ compound by a super-necking technique with a laser-floating-zone furnace and study the stoichiometry, growth mode, and structural and magnetic properties by scanning electron microscopy, neutron Laue, X-ray powder diffraction, and the physical property measurement system. We optimized the growth parameters, mainly the growth speed, atmosphere, and the addition of a Tb$_4$O$_7$ raw material. Neutron Laue diffraction displays the characteristic feature of a single crystal. Our study reveals an atomic ratio of Sr:Tb $ = 0.97(2){:}2.00(1)$ and a possible layer-by-layer crystal growth mode. Our X-ray powder diffraction study determines the crystal structure, lattice constants and atomic positions. The paramagnetic (PM) Curie--Weiss (CW) temperature $\theta_{\texttt{CW}} =$ 5.00(4) K, and the effective PM moment $M^{\texttt{eff}}_{\texttt{mea}} =$ 10.97(1) $\mu_\texttt{B}$ per Tb$^{3+}$ ion. The data of magnetization versus temperature can be divided into three regimes, showing a coexistence of antiferromagnetic and ferromagnetic interactions. This probably leads to the magnetic frustration in the SrTb$_2$O$_4$ compound. The magnetization at 2 K and 14 T originates from both the Tb1 and Tb2 sites and is strongly frustrated with an expected saturation field at $\sim$41.5 T, displaying an intricate phase diagram with three ranges.

preprint · 2019 · arXiv

Phase transitions and optimal algorithms for semi-supervised classifications on graphs: from belief propagation to graph convolution network

We perform theoretical and algorithmic studies for the problem of clustering and semi-supervised classification on graphs with both pairwise relational information and single-point feature information, based on a joint stochastic block model for generating synthetic graphs with both edges and node features. An asymptotically exact analysis based on Bayesian inference of the underlying model is conducted, using the cavity method in statistical physics. Theoretically, we identify a phase transition of the generative model, which puts fundamental limits on the ability of all possible algorithms in the clustering task of the underlying model. Algorithmically, we propose a belief propagation algorithm that is asymptotically optimal on the generative model, and can be further extended to a belief propagation graph convolution neural network (BPGCN) for semi-supervised classification on graphs. For the first time, well-controlled benchmark datasets with asymptotically exact properties and optimal solutions could be produced for the evaluation of graph convolution neural networks, and for the theoretical understanding of their strengths and weaknesses. In particular, on these synthetic benchmark networks we observe that existing graph convolution neural networks are subject to a sparsity issue and an overfitting issue in practice, both of which are successfully overcome by our BPGCN. Moreover, when combined with classic neural network methods, BPGCN yields extraordinary classification performances on some real-world datasets that have never been achieved before.