Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
24 works
0 followers
16 topics
4 close collaborators

Published work

24 published items

preprint, 2026, arXiv

MicroWorld: Empowering Multimodal Large Language Models to Bridge the Microscopic Domain Gap with Multimodal Attribute Graph

Multimodal large language models (MLLMs) show remarkable potential for scientific reasoning, yet their performance in specialized domains such as microscopy remains limited by the scarcity of domain-specific training data and the difficulty of encoding fine-grained expert knowledge into model parameters. To bridge the gap, we introduce MicroWorld, a framework that constructs a multimodal attributed property graph (MAPG) from large-scale scientific image-caption corpora and leverages it to augment MLLM reasoning at inference time without any domain-specific fine-tuning. MicroWorld extracts biomedical entities and relations via scispaCy or LLM-based triplet mining, aligns images and entities in a shared embedding space using Qwen3-VL-Embedding, and assembles a knowledge graph comprising approximately 111K nodes and 346K typed edges spanning eight relation categories. At inference time, a graph-augmented retrieval pipeline matches query entities to the MAPG and injects structured knowledge context into the MLLM prompt. On the MicroVQA benchmark, MicroWorld improves the reasoning performance of Qwen3-VL-8B-Instruct by 37.5%, outperforming GPT-5 by 13.0% to achieve a new state-of-the-art. Furthermore, it yields a 6.0% performance gain on the MicroBench benchmark. Extensive experiments demonstrate the enhanced generalization capability introduced by MicroWorld. A qualitative case study further reveals both the mechanisms through which structured knowledge improves reasoning and the failure modes that point to promising future directions. Code and data are available at https://github.com/ieellee/MicroWorld.

preprint, 2025, arXiv

Optical pumping and laser slowing of a heavy molecule

Precision measurements of the electron's electric dipole moment (eEDM) are critical for testing fundamental symmetries in particle physics, and heavy polar molecules such as barium monofluoride (BaF) have emerged as promising candidates for advancing the sensitivity. However, achieving a 3D magneto-optical trap (MOT) requires slowing BaF molecules to near-zero velocity by scattering over 10^4 photons per molecule, which demands a quasi-cycling transition with minimal leakage. We present a detailed study of the leakage channels, including higher vibrational and rotational states. By combining microwave remixing with optical pumping of rotational and vibrational dark states, we reduced the total leakage fraction to 10^-5. Using frequency-chirped laser slowing, we slowed a subset of buffer-gas-cooled BaF molecules from approximately 80 m/s to near-zero velocity, which is critical for efficient MOT loading. This work establishes the technical foundation for precision eEDM measurements using laser-cooled heavy molecules.
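The photon budget quoted in this abstract can be checked with a rough back-of-the-envelope estimate. The sketch below is illustrative only and does not come from the paper: the molecular mass (about 157 u for 138Ba19F) and the cooling wavelength (about 860 nm) are assumed values, and treating the 10^-5 leakage fraction as a per-photon loss probability is likewise an assumption made for the example.

```python
# Rough estimate of how many photons must be scattered to stop a BaF molecule.
# Assumed values (not given in the abstract): mass of about 157 u and a
# cooling wavelength of about 860 nm.
PLANCK_H = 6.62607015e-34          # Planck constant in J s
ATOMIC_MASS_UNIT = 1.66053907e-27  # atomic mass unit in kg


def recoil_velocity(mass_u: float, wavelength_m: float) -> float:
    """Velocity change per absorbed photon, h / (m * lambda), in m/s."""
    return PLANCK_H / (mass_u * ATOMIC_MASS_UNIT * wavelength_m)


def photons_to_stop(initial_speed: float, mass_u: float, wavelength_m: float) -> float:
    """Number of absorbed photons needed to remove initial_speed (in m/s)."""
    return initial_speed / recoil_velocity(mass_u, wavelength_m)


def surviving_fraction(leak_per_photon: float, n_photons: float) -> float:
    """Fraction of molecules still cycling after n_photons scattering events,
    assuming an independent loss probability per scattered photon."""
    return (1.0 - leak_per_photon) ** n_photons


if __name__ == "__main__":
    n = photons_to_stop(80.0, 157.0, 860e-9)
    print(f"photons needed: {n:.2e}")
    print(f"surviving fraction at 1e-5 leakage: {surviving_fraction(1e-5, n):.2f}")
```

Under these assumptions the estimate lands in the low tens of thousands of photons, which is consistent with the "over 10^4 photons" stated in the abstract.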

preprint, 2024, arXiv

Context-Aware Iteration Policy Network for Efficient Optical Flow Estimation

Existing recurrent optical flow estimation networks are computationally expensive because they use a fixed, large number of iterations to update the flow field for each sample. An efficient network should skip iterations when the flow improvement is limited. In this paper, we develop a Context-Aware Iteration Policy Network for efficient optical flow estimation, which determines the optimal number of iterations per sample. The policy network achieves this by learning contextual information to recognize whether flow improvement is bottlenecked or minimal. On the one hand, we use an iteration embedding and a historical hidden cell, which carry information from previous iterations, to convey how the flow has changed. On the other hand, we use the incremental loss to make the policy network implicitly perceive the magnitude of optical flow improvement in the subsequent iteration. Furthermore, the computational complexity of our dynamic network is controllable, allowing us to satisfy various resource preferences with a single trained model. Our policy network can be easily integrated into state-of-the-art optical flow networks. Extensive experiments show that our method maintains performance while reducing FLOPs by about 40%/20% for the Sintel/KITTI datasets.

preprint, 2024, arXiv

Engineering topological chiral transport in a flat-band lattice of ultracold atoms

The manipulation of particle transport in synthetic quantum matter is an active research frontier for its theoretical importance and potential applications. Here we experimentally demonstrate an engineered topological transport in a synthetic flat-band lattice of ultracold $^{87}$Rb atoms. We implement a quasi-one-dimensional rhombic chain with staggered flux in the momentum space of the atomic condensate and observe biased local oscillations that originate from the interplay of the staggered flux and flat-band localization under the mechanism of Aharonov-Bohm caging. Based on these features, we design and experimentally confirm a state-dependent chiral transport under the periodic modulation of the synthetic flux. We show that the phenomenon is topologically protected by the winding of the Floquet Bloch bands of a coarse-grained effective Hamiltonian. The observed chiral transport offers a strategy for efficient quantum device design where topological robustness is ensured by fast Floquet driving and flat-band localization.

preprint, 2022, arXiv

A Review of Location Encoding for GeoAI: Methods and Applications

A common need for artificial intelligence models in the broader geosciences is to represent and encode various types of spatial data, such as points (e.g., points of interest), polylines (e.g., trajectories), polygons (e.g., administrative regions), graphs (e.g., transportation networks), or rasters (e.g., remote sensing images), in a hidden embedding space so that they can be readily incorporated into deep learning models. One fundamental step is to encode a single point location into an embedding space, such that this embedding is learning-friendly for downstream machine learning models such as support vector machines and neural networks. We call this process location encoding. However, a systematic review of the concept of location encoding, its potential applications, and the key challenges that need to be addressed is lacking. This paper aims to fill this gap. We first provide a formal definition of location encoding, and discuss the necessity of location encoding for GeoAI research from a machine learning perspective. Next, we provide a comprehensive survey and discussion of the current landscape of location encoding research. We classify location encoding models into different categories based on their inputs and encoding methods, and compare them based on whether they are parametric, multi-scale, distance preserving, and direction aware. We demonstrate that existing location encoding models can be unified under a shared formulation framework. We also discuss the application of location encoding for different types of spatial data. Finally, we point out several challenges in location encoding research that need to be solved in the future.
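As a concrete illustration of what encoding a single point location into an embedding can look like, here is a minimal sketch of a fixed, multi-scale sinusoidal encoder. It is not one of the models surveyed in the paper: the function name `encode_location`, the choice of geometrically spaced wavelengths, and all default parameter values are assumptions made for this example. The encoder is multi-scale but has no learned parameters.

```python
import math
from typing import List, Tuple


def encode_location(
    point: Tuple[float, float],
    num_scales: int = 4,
    min_wavelength: float = 1.0,
    max_wavelength: float = 1000.0,
) -> List[float]:
    """Map an (x, y) point to a vector of sines and cosines at several scales.

    Wavelengths are spaced geometrically between min_wavelength and
    max_wavelength, so short wavelengths resolve fine position differences
    and long wavelengths capture coarse position.
    """
    x, y = point
    features: List[float] = []
    for s in range(num_scales):
        if num_scales == 1:
            wavelength = min_wavelength
        else:
            ratio = max_wavelength / min_wavelength
            wavelength = min_wavelength * ratio ** (s / (num_scales - 1))
        for coord in (x, y):
            angle = 2.0 * math.pi * coord / wavelength
            features.append(math.sin(angle))
            features.append(math.cos(angle))
    return features


if __name__ == "__main__":
    vec = encode_location((3.0, 4.0))
    print(len(vec))  # 4 scales * 2 coordinates * (sin, cos) = 16 features
```

The resulting vector has bounded entries in [-1, 1], which is one reason such encodings are easier for a downstream neural network to consume than raw coordinates.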

preprint, 2022, arXiv

Bilateral Network with Channel Splitting Network and Transformer for Thermal Image Super-Resolution

In recent years, the Thermal Image Super-Resolution (TISR) problem has become an attractive research topic. TISR can be used in a wide range of fields, including military, medical, agricultural and animal ecology applications. Owing to the success of the PBVS-2020 and PBVS-2021 workshop challenges, TISR results keep improving and attract more researchers to sign up for the PBVS-2022 challenge. In this paper, we introduce the technical details of our submission to the PBVS-2022 challenge, in which we design a Bilateral Network with Channel Splitting Network and Transformer (BN-CSNT) to tackle the TISR problem. First, we design a context branch based on a channel splitting network with transformer to obtain sufficient context information. Second, we design a spatial branch with a shallow transformer to extract low-level features that preserve spatial information. Finally, to fuse the features from the channel splitting network and the transformer within the context branch, we propose an attention refinement module; the features from the context branch and the spatial branch are then fused by the proposed feature fusion module. The proposed method achieves PSNR=33.64, SSIM=0.9263 for x4 and PSNR=21.08, SSIM=0.7803 for x2 on the PBVS-2022 challenge test dataset.
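For readers unfamiliar with the PSNR figures reported above, the following is a minimal sketch of the standard peak signal-to-noise ratio computation, in decibels, over flattened pixel values. It is not the challenge's evaluation code; the function name `psnr` and the default peak value of 255 (8-bit images) are assumptions made for this example.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two pixel sequences.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Two pixels, each off by 10 grey levels: MSE = 100.
    print(f"{psnr([100, 100], [110, 90]):.2f} dB")
```

Higher values mean the super-resolved image is closer to the ground truth in a mean-squared-error sense; SSIM, the second metric quoted, measures structural similarity instead and is not covered by this sketch.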

preprint, 2022, arXiv

Decoupled measurement and modeling of interface reaction kinetics of ion-intercalation battery electrodes

Ultrahigh rate performance of active particles used in lithium-ion battery electrodes has been revealed by single-particle measurements, which indicates a huge potential for developing high-power batteries. However, the charging/discharging behaviors of single particles at ultrahigh C-rates can no longer be described by the traditional electrochemical kinetics in such ion-intercalation active materials. Meanwhile, conventional kinetic measurement methods are challenged by the coupling of the interface reaction and solid-state diffusion processes of active particles. Here, we decouple the reaction and diffusion kinetics via time-resolved potential measurements with an interval of 1 ms, revealing that the classical Butler-Volmer equation deviates from the actual relation between current density, overpotential, and Li+ concentration. An interface ion-intercalation model is developed which considers the excess driving force of Li+ (de)intercalation in the charge transfer reaction for ion-intercalation materials. Simulations demonstrate that the proposed model enables accurate prediction of charging/discharging at both single-particle and electrode scales for various active materials. The kinetic limitation processes from single particles to composite electrodes are systematically revealed, promoting rational designs of high-power batteries.
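For reference, the classical Butler-Volmer relation that this abstract reports as deviating from measurement is usually written in the textbook form below. This is the standard expression, not the authors' proposed interface ion-intercalation model. The symbols are the conventional ones and are introduced here for illustration: $j$ is the current density, $j_0$ the exchange current density (which in battery models depends on the Li+ concentration), $\eta$ the overpotential, $\alpha_a$ and $\alpha_c$ the anodic and cathodic transfer coefficients, $F$ the Faraday constant, $R$ the gas constant, and $T$ the temperature.

```latex
j = j_0 \left[ \exp\!\left( \frac{\alpha_a F \eta}{R T} \right)
             - \exp\!\left( -\frac{\alpha_c F \eta}{R T} \right) \right]
```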

preprint, 2022, arXiv

Doppler cooling of buffer-gas-cooled Barium monofluoride molecules

We demonstrate one-dimensional Doppler cooling of a beam of buffer-gas-cooled barium monofluoride (BaF) molecules. The dependence of the cooling efficiency on the laser detuning, the bias field and the laser intensity is carefully measured. We numerically simulate our experiment with a Monte Carlo method, and find that the theoretical predictions are consistent with our experimental data. This result represents a key step towards further cooling and trapping of BaF molecules.

preprint, 2022, arXiv

Geometry-Aware Reference Synthesis for Multi-View Image Super-Resolution

Recent multi-view multimedia applications struggle to balance a high-resolution (HR) visual experience against storage or bandwidth constraints. Therefore, this paper proposes a Multi-View Image Super-Resolution (MVISR) task, which aims to increase the resolution of multi-view images captured from the same scene. One solution is to apply image or video super-resolution (SR) methods to reconstruct HR results from the low-resolution (LR) input view. However, these methods can neither handle large-angle transformations between views nor leverage information in all multi-view images. To address these problems, we propose MVSRnet, which uses geometry information to extract sharp details from all LR multi-view images to support the SR of the LR input view. Specifically, the proposed Geometry-Aware Reference Synthesis module in MVSRnet uses geometry information and all multi-view LR images to synthesize pixel-aligned HR reference images. Then, the proposed Dynamic High-Frequency Search network fully exploits the high-frequency textural details in reference images for SR. Extensive experiments on several benchmarks show that our method significantly improves over the state-of-the-art approaches.

preprint, 2022, arXiv

Learning Parallax Transformer Network for Stereo Image JPEG Artifacts Removal

Under stereo settings, the performance of image JPEG artifacts removal can be further improved by exploiting the additional information provided by a second view. However, incorporating this information for stereo image JPEG artifacts removal is a huge challenge, since the existing compression artifacts make pixel-level view alignment difficult. In this paper, we propose a novel parallax transformer network (PTNet) to integrate the information from stereo image pairs for stereo image JPEG artifacts removal. Specifically, a well-designed symmetric bi-directional parallax transformer module is proposed to match features with similar textures between different views instead of pixel-level view alignment. To handle occlusions and boundaries, a confidence-based cross-view fusion module is proposed to achieve better feature fusion for both views, where the cross-view features are weighted with confidence maps. In particular, we adopt a coarse-to-fine design for the cross-view interaction, leading to better performance. Comprehensive experimental results demonstrate that our PTNet can effectively remove compression artifacts and achieves better performance than the other state-of-the-art methods tested.

preprint, 2022, arXiv

Observation of Non-Hermitian Skin Effect and Topology in Ultracold Atoms

The non-Hermitian skin effect (NHSE), the accumulation of eigen wavefunctions at boundaries of open systems, underlies a variety of exotic properties that defy conventional wisdom. While NHSE and its intriguing impact on band topology and dynamics have been observed in classical or photonic systems, their demonstration in a quantum many-body setting remains elusive. Here we report the experimental realization of a dissipative Aharonov-Bohm chain, a non-Hermitian topological model with NHSE, in the momentum space of a two-component Bose-Einstein condensate. We identify unique signatures of NHSE in the condensate dynamics, and perform Bragg spectroscopy to resolve topological edge states against a background of localized bulk states. Our work sets the stage for further investigation on the interplay of many-body statistics and interactions with NHSE, and is a significant step forward in the quantum control and simulation of non-Hermitian physics.

preprint, 2022, arXiv

Overpotential decomposition enabled decoupling of complex kinetic processes in battery electrodes

Identifying overpotential components of electrochemical systems enables quantitative analysis of polarization contributions of kinetic processes under practical operating conditions. However, the inherently coupled kinetic processes lead to an enormous challenge in measuring individual overpotentials, particularly in composite electrodes of lithium-ion batteries. Herein, the full decomposition of electrode overpotential is realized by combining single-layer structured particle electrode (SLPE) constructions with time-resolved potential measurements, explicitly revealing the evolution of kinetic processes. Perfect prediction of the discharging profiles is achieved via potential measurements on SLPEs, even in extreme polarization conditions. By decoupling overpotentials in different electrode/cell structures and material systems, the dominant limiting processes of battery rate performance are uncovered, based on which the optimization of electrochemical kinetics can be conducted. Our study not only sheds light on decoupling complex kinetics in electrochemical systems, but also provides significant guidance for the rational design of high-performance batteries.

preprint, 2022, arXiv

Perception-Oriented Stereo Image Super-Resolution

Recent studies of deep learning based stereo image super-resolution (StereoSR) have promoted the development of StereoSR. However, existing StereoSR models mainly concentrate on improving quantitative evaluation metrics and neglect the visual quality of super-resolved stereo images. To improve the perceptual performance, this paper proposes the first perception-oriented stereo image super-resolution approach by exploiting the feedback provided by evaluating the perceptual quality of StereoSR results. To provide accurate guidance for the StereoSR model, we develop the first dedicated stereo image super-resolution quality assessment (StereoSRQA) model, and further construct a StereoSRQA database. Extensive experiments demonstrate that our StereoSR approach significantly improves the perceptual quality and enhances the reliability of stereo images for disparity estimation.

preprint, 2022, arXiv

Rethinking Super-Resolution as Text-Guided Details Generation

Deep neural networks have greatly promoted the performance of single image super-resolution (SISR). Conventional methods still restore a single high-resolution (HR) solution based only on the image modality. However, image-level information is insufficient to predict adequate details and photo-realistic visual quality at large upscaling factors (x8, x16). In this paper, we propose a new perspective that regards SISR as a semantic image detail enhancement problem, in order to generate semantically reasonable HR images that are faithful to the ground truth. To enhance the semantic accuracy and the visual quality of the reconstructed image, we explore multi-modal fusion learning in SISR by proposing a Text-Guided Super-Resolution (TGSR) framework, which can effectively utilize the information from the text and image modalities. Different from existing methods, the proposed TGSR can generate HR image details that match the text descriptions through a coarse-to-fine process. Extensive experiments and ablation studies demonstrate the effect of the TGSR, which exploits the text reference to recover realistic images.

preprint, 2022, arXiv

The Second Place Solution for The 4th Large-scale Video Object Segmentation Challenge--Track 3: Referring Video Object Segmentation

The referring video object segmentation (RVOS) task aims to segment, in all video frames, the object instances in a given video that are referred to by a language expression. Because it requires understanding cross-modal semantics within individual instances, this task is more challenging than traditional semi-supervised video object segmentation, where the ground truth object masks in the first frame are given. With the great achievements of Transformers in object detection and object segmentation, RVOS has made remarkable progress, with ReferFormer achieving state-of-the-art performance. In this work, building on the strong baseline framework ReferFormer, we propose several tricks to boost performance further, including cyclical learning rates, a semi-supervised approach, and test-time augmentation inference. The improved ReferFormer ranks 2nd in the CVPR2022 Referring Youtube-VOS Challenge.

preprint, 2022, arXiv

The Third Place Solution for CVPR2022 AVA Accessibility Vision and Autonomy Challenge

The goal of the AVA challenge is to provide vision-based benchmarks and methods relevant to accessibility. In this paper, we introduce the technical details of our submission to the CVPR2022 AVA Challenge. First, we conducted experiments to select a suitable model and data augmentation strategy for this task. Second, an effective training strategy was applied to improve the performance. Third, we integrated the results from two different segmentation frameworks to improve the performance further. Experimental results demonstrate that our approach can achieve a competitive result on the AVA test set. Finally, our approach achieves 63.008% AP@0.50:0.95 on the test set of the CVPR2022 AVA Challenge.

preprint, 2021, arXiv

Isotope Separation of Potassium with a magneto-optical combined method

Because of their similar physical and chemical properties, isotopes are usually hard to separate. On the other hand, isotope shifts are well resolved in a high-resolution spectrum, making it possible to address the isotopes individually with lasers and thus separate them. Here we report such an isotope separation experiment with potassium atoms. The isotopes are independently optically pumped to the desired spin states, and then separated with a Stern-Gerlach scheme. A micro-capillary oven is used to collimate the atomic beam, and a Halbach-type magnet array is used to deflect the desired atoms. Finally, the $^{40}$K is enriched by two orders of magnitude. This magneto-optical combined method provides an effective way to separate isotopes and can be extended to other elements if the relevant optical pumping scheme is feasible.

preprint, 2020, arXiv

Isometric Graph Neural Networks

Many tasks that rely on representations of nodes in graphs would benefit if those representations were faithful to distances between nodes in the graph. Geometric techniques to extract such representations have poor scaling over large graph size, and recent advances in Graph Neural Network (GNN) algorithms have limited ability to reflect graph distance information beyond the first degree neighborhood. To enable this highly desired capability, we propose a technique to learn Isometric Graph Neural Networks (IGNN), which requires changing the input representation space and loss function to enable any GNN algorithm to generate representations that reflect distances between nodes. We experiment with the isometric technique on several GNN architectures for modeling multiple prediction tasks on multiple datasets. In addition to an improvement in AUC-ROC as high as 43% in these experiments, we observe a consistent and substantial improvement as high as 400% in Kendall's Tau (KT), a measure that directly reflects distance information, demonstrating that the learned embeddings do account for graph distances.
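To make the Kendall's Tau measure mentioned above concrete, here is a minimal sketch that compares two rankings, for example graph distances and embedding distances for the same node pairs. It implements the simple tau-a variant, which does not correct for ties, and is not the paper's evaluation code; the function name `kendall_tau` and the sample numbers are made up for this illustration.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(a: Sequence[float], b: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    A pair (i, j) is concordant when a and b order items i and j the same
    way, and discordant when they order them in opposite ways. Tied pairs
    count as neither.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(a)), 2):
        product = (a[i] - a[j]) * (b[i] - b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


if __name__ == "__main__":
    graph_distances = [1, 2, 3, 4]              # hypothetical hop counts
    embedding_distances = [0.9, 2.1, 2.0, 4.2]  # hypothetical embedding distances
    print(kendall_tau(graph_distances, embedding_distances))
```

A value near 1 means the embedding distances rank node pairs in nearly the same order as the graph distances, which is the property the abstract says the isometric technique improves.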

preprint, 2020, arXiv

Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells

Unsupervised text encoding models have recently fueled substantial progress in NLP. The key idea is to use neural networks to convert words in texts to vector space representations based on word positions in a sentence and their contexts, which are suitable for end-to-end training of downstream tasks. We see a strikingly similar situation in spatial analysis, which focuses on incorporating both absolute positions and spatial contexts of geographic objects such as POIs into models. A general-purpose representation model for space is valuable for a multitude of tasks. However, no such general model exists to date beyond simply applying discretization or feed-forward nets to coordinates, and little effort has been put into jointly modeling distributions with vastly different characteristics, which commonly emerge from GIS data. Meanwhile, Nobel Prize-winning neuroscience research shows that grid cells in mammals provide a multi-scale periodic representation that functions as a metric for location encoding and is critical for recognizing places and for path integration. Therefore, we propose a representation learning model called Space2Vec to encode the absolute positions and spatial relationships of places. We conduct experiments on two real-world geographic datasets for two different tasks: 1) predicting types of POIs given their positions and context, 2) image classification leveraging their geo-locations. Results show that because of its multi-scale representations, Space2Vec outperforms well-established ML approaches such as RBF kernels, multi-layer feed-forward nets, and tile embedding approaches for location modeling and image classification tasks. Detailed analysis shows that each baseline handles distributions well at one scale at most and performs poorly at other scales. In contrast, Space2Vec's multi-scale representation can handle distributions at different scales.

preprint, 2020, arXiv

Periodic driving induced helical Floquet channels with ultracold atoms in momentum space

Employing the external degrees of freedom of atoms as synthetic dimensions provides new and easy access to quantum engineering and quantum simulation. As a recent development, ultracold atoms undergoing two-photon Bragg transitions can be diffracted into a series of discrete momentum states to form a momentum lattice. Here we provide a detailed analysis of such a system and, as a concrete example, report the observation of robust helical Floquet channels obtained by introducing periodic driving sequences. The robustness of these channels against perturbations is confirmed, as a test for their topological origin captured by Floquet winding numbers. The periodic switching demonstrated here serves as a testbed for more complicated Floquet engineering schemes, and offers exciting opportunities to study novel topological physics in a many-body setting with tunable interactions.

preprint, 2020, arXiv

SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting

Learning knowledge graph (KG) embeddings is an emerging technique for a variety of downstream tasks such as summarization, link prediction, information retrieval, and question answering. However, most existing KG embedding models neglect space and, therefore, do not perform well when applied to (geo)spatial data and tasks. Most of the models that do consider space rely primarily on some notion of distance. These models suffer from higher computational complexity during training while still losing information beyond the relative distance between entities. In this work, we propose a location-aware KG embedding model called SE-KGE. It directly encodes spatial information such as point coordinates or bounding boxes of geographic entities into the KG embedding space. The resulting model is capable of handling different types of spatial reasoning. We also construct a geographic knowledge graph as well as a set of geographic query-answer pairs called DBGeo to evaluate the performance of SE-KGE in comparison to multiple baselines. Evaluation results show that SE-KGE outperforms these baselines on the DBGeo dataset for the geographic logic query answering task. This demonstrates the effectiveness of our spatially-explicit model and the importance of considering the scale of different geographic entities. Finally, we introduce a novel downstream task called spatial semantic lifting which links an arbitrary location in the study area to entities in the KG via some relations. Evaluation on DBGeo shows that our model outperforms the baseline by a substantial margin.

preprint, 2020, arXiv

Topological quantum walks in momentum space with a Bose-Einstein condensate

We report the experimental implementation of discrete-time topological quantum walks of a Bose-Einstein condensate in momentum space. Introducing stroboscopic driving sequences to the generation of a momentum lattice, we show that the dynamics of atoms along the lattice is effectively governed by a periodically driven Su-Schrieffer-Heeger model, which is equivalent to a discrete-time topological quantum walk. We directly measure the underlying topological invariants through time-averaged mean chiral displacements, which are consistent with our experimental observation of topological phase transitions. We then observe interaction-induced localization in the quantum-walk dynamics, where atoms tend to populate a single momentum-lattice site under interactions that are non-local in momentum space. Our experiment opens up the avenue of investigating discrete-time topological quantum walks using cold atoms, where the many-body environment and tunable interactions offer exciting new possibilities.

preprint, 2020, arXiv

Tunable non-reciprocal quantum transport through a dissipative Aharonov-Bohm ring in ultracold atoms

We report the experimental observation of tunable, non-reciprocal quantum transport of a Bose-Einstein condensate in a momentum lattice. By implementing a dissipative Aharonov-Bohm (AB) ring in momentum space and sending atoms through it, we demonstrate a directional atom flow by measuring the momentum distribution of the condensate at different times. While the dissipative AB ring is characterized by the synthetic magnetic flux through the ring and the laser-induced loss on it, both the propagation direction and transport rate of the atom flow sensitively depend on these highly tunable parameters. We demonstrate that the non-reciprocity originates from the interplay of the synthetic magnetic flux and the laser-induced loss, which simultaneously breaks the inversion and the time-reversal symmetries. Our results open up the avenue for investigating non-reciprocal dynamics in cold atoms, and highlight the dissipative AB ring as a flexible building element for applications in quantum simulation and quantum information.

preprint, 2019, arXiv

A study on dynamical complexity of noise induced blood flow

In this article, the dynamics and complexity of a noise induced blood flow system have been investigated. Changes in the dynamics have been recognized by measuring the periodicity over significant parameters. Chaotic as well as non-chaotic regimes have also been classified. Further, dynamical complexity has been studied by phase space based weighted entropy. Numerical results show a strong correlation between the dynamics and complexity of the noise induced system. The correlation has been confirmed by a cross-correlation analysis.