Trust snapshot

Quick read

Trust 21 (Emerging)
Verification L1
Unclaimed author
19 works
0 followers
17 topics
4 close collaborators

Published work

19 published items

preprint, 2026, arXiv

Aquatic Neuromorphic Optical Flow

Underwater environments impose severe constraints on conventional imaging systems and demand solutions that balance high-quality sensing with strict resource efficiency. While emerging event cameras offer a promising alternative, their potential in aquatic scenarios remains largely unexplored. Through the lens of neuromorphic vision, this work pioneers the investigation of motion fields that serve as key media for agile underwater perception. Built upon spiking neural networks, we introduce a self-supervised framework to estimate per-pixel optical flow from asynchronous event streams, elegantly bypassing the long-standing bottleneck of underwater data scarcity. Extensive evaluations demonstrate that our method achieves competitive visual and quantitative results against leading techniques while operating with superior computational efficiency. By bridging neuromorphic sensing and aquatic intelligence, this work opens new frontiers for lightweight, real-time, and low-cost perception on resource-constrained underwater edge platforms.

preprint, 2026, arXiv

LISA: Language-guided Interference-aware Spatial-Frequency Attention for Driver Gaze Estimation

Driver gaze estimation serves as a fundamental metric for evaluating driver attentiveness in modern monitoring systems. Beyond being vulnerable to sudden lighting changes and sensor noise, spatial-domain models struggle to disentangle authentic gaze cues from irrelevant visual attributes. In this paper, we propose LISA, a Language-guided Interference-aware Spatial-Frequency Attention framework that combines frequency-domain priors with vision-language knowledge. Observing that the amplitude spectrum remains relatively stable even under spatial perturbations, we design a dual-domain fusion mechanism. It integrates stable low-frequency semantics into high-frequency details, employing spatial attention to precisely target ocular regions. To reduce semantic ambiguity, we also introduce a training-time disentanglement strategy. Using a frozen CLIP encoder and orthogonal regularization, we explicitly separate gaze features from appearance interference. Experiments on two benchmarks show that LISA achieves state-of-the-art performance, with significantly improved robustness against occlusions and lighting variations. The code repository is available at https://github.com/Mason-bupt/LISA.

preprint, 2026, arXiv

See Silhouettes in Motion with Neuromorphic Vision

Quasi-bimodal objects, such as text, road signs, and barcodes, play a basic yet vital role in daily visual communication. By boiling these down to clear silhouettes, binarization uses a minimal language to convey essential vision cues for maximum downstream efficiency. The catch is that frame-based imaging often struggles on mobile platforms like drones, self-driving cars, and underwater vehicles. In these dynamic scenes, rapid motion and harsh lighting can make it blind, causing severe motion blur and erasing crucial details. To overcome these limits, neuromorphic vision via event cameras, featuring microsecond-level temporal resolution and high dynamic range, steps in as a natural solution. Building upon this event-driven sensing paradigm, we introduce a simple yet effective dual-modal approach that harnesses the synergy between frames and events to achieve real-time, high-frame-rate binarization on CPU-only devices. Extensive evaluations show that it achieves competitive performance against leading techniques in reducing motion blur, while delivering impressive improvements under challenging illumination. In addition, our asynchronous workflow bypasses the event scarcity that breaks traditional time-binning reconstruction, maintaining clear target shapes even at extreme kilohertz frame rates. Its binary results further serve as reliable representations that facilitate a range of downstream tasks. This work paves the way towards lightweight perception and interaction in embodied intelligence on resource-constrained edge platforms.

preprint, 2025, arXiv

Autoregressive long-horizon prediction of plasma edge dynamics

Accurate modeling of scrape-off layer (SOL) and divertor-edge dynamics is vital for designing plasma-facing components in fusion devices. High-fidelity edge fluid/neutral codes such as SOLPS-ITER capture SOL physics with high accuracy, but their computational cost limits broad parameter scans and long transient studies. We present transformer-based, autoregressive surrogates for efficient prediction of 2D, time-dependent plasma edge state fields. Trained on SOLPS-ITER spatiotemporal data, the surrogates forecast electron temperature, electron density, and radiated power over extended horizons. We evaluate model variants trained with increasing autoregressive horizons (1-100 steps) on short- and long-horizon prediction tasks. Longer-horizon training systematically improves rollout stability and mitigates error accumulation, enabling stable predictions over hundreds to thousands of steps and reproducing key dynamical features such as the motion of high-radiation regions. Measured end-to-end wall-clock times show the surrogate is orders of magnitude faster than SOLPS-ITER, enabling rapid parameter exploration. Prediction accuracy degrades when the surrogate enters physical regimes not represented in the training dataset, motivating future work on data enrichment and physics-informed constraints. Overall, this approach provides a fast, accurate surrogate for computationally intensive plasma edge simulations, supporting rapid scenario exploration, control-oriented studies, and progress toward real-time applications in fusion devices.

preprint, 2022, arXiv

Direct Light Orbital Angular Momentum Detection in Mid-Infrared based on Type-II Weyl Semimetal TaIrTe4

Direct photocurrent detection of the orbital angular momentum (OAM) of light has recently been realized with topological Weyl semimetals, but only in the near-infrared wavelength range. Extending direct OAM detection to the mid-infrared, a wavelength range that plays an important role in a vast range of applications such as autonomous driving, night vision, and motion detection, is challenging and has not yet been realized. This is because most studies of photocurrent responses are not sensitive to phase information, and the photoresponse is usually very poor in the mid-infrared. In this study, we designed a photodetector based on the Type-II Weyl semimetal tantalum iridium telluride (TaIrTe4) with specially designed electrode geometries for direct detection of the topological charge of OAM through the orbital photogalvanic effect. Our results indicate that the helical phase gradient of light can be distinguished by a current winding around the optical beam axis with a magnitude proportional to its quantized OAM mode number. The topologically enhanced mid-infrared response of TaIrTe4 further helps overcome the low-responsivity issue and ultimately enables direct OAM detection in the mid-infrared. Our study enables on-chip integrated OAM detection, and thus OAM-sensitive focal plane arrays in the mid-infrared. This capability opens a new route to exploring applications of light carrying OAM; in particular, it can substantially improve the performance of many mid-infrared imaging applications, such as intricate target recognition and night vision.

preprint, 2022, arXiv

Friction-dependent rheology of dry granular systems

Understanding the rheology of granular assemblies is important for natural and engineering systems, but the relationship between inter-particle friction (or microscopic friction) and macroscopic friction is still not well understood. In this study, using the discrete element method (DEM) with spherical particles and realistic contact laws, we investigate the mechanics of granular systems with a wide range of inter-particle frictional coefficients and aim to establish a friction-dependent rheology for dry granular flows. The results show that increasing inter-particle friction dramatically increases the effective frictional coefficient, $μ_{\textrm{eff}}$, while decreasing the solid fraction of the system and increasing the transitional inertial number that marks the division between quasi-static regimes and intermediate flow regimes. We further propose a new dimensionless number, $\mathcal{M}$, as a ratio between the inertial effect and the frictional effect, which is similar to the effective aspect ratio in granular column collapses and unifies the influence of inter-particle friction with the inertial number. We then establish a relationship between $\mathcal{M}$ and the dimensionless granular temperature, $Θ$, to further universalize the influence of inter-particle friction. Such a study can broaden the application of the $μ(I)$ rheology in natural and engineering systems and help establish a more general constitutive model for complex granular systems.

preprint, 2022, arXiv

Multi-task graph neural networks for simultaneous prediction of global and atomic properties in ferromagnetic systems

We introduce a multi-tasking graph convolutional neural network, HydraGNN, to simultaneously predict both global and atomic physical properties, and demonstrate it on ferromagnetic materials. We train HydraGNN on an open-source ab initio density functional theory (DFT) dataset for iron-platinum (FePt) with a fixed body-centered tetragonal (BCT) lattice structure and fixed volume to simultaneously predict the mixing enthalpy (a global feature of the system), the atomic charge transfer, and the atomic magnetic moment across configurations that span the entire compositional range. By taking advantage of underlying physical correlations between material properties, multi-task learning (MTL) with HydraGNN provides effective training even with modest amounts of data. Moreover, this is achieved with just one architecture instead of three, as required by single-task learning (STL). The first convolutional layers of the HydraGNN architecture are shared by all learning tasks and extract features common to all material properties. The following layers discriminate the features of the different properties, the results of which are fed to the separate heads of the final layer to produce predictions. Numerical results show that HydraGNN effectively captures the relation between the configurational entropy and the material properties over the entire compositional range. Overall, the accuracy of simultaneous MTL predictions is comparable to the accuracy of the STL predictions. In addition, the computational cost of training HydraGNN for MTL is much lower than the original DFT calculations and also lower than training separate STL models for each property.

preprint, 2022, arXiv

Multiparameter simultaneous optimal estimation with an SU(2) coding unitary evolution

In ubiquitous $SU(2)$ dynamics, achieving the simultaneous optimal estimation of multiple parameters is significant but difficult. Using quantum control to optimize this $SU(2)$ coding unitary evolution is one solution. We propose a method, characterized by the nested cross-products of the coefficient vector $\mathbf{X}$ of $SU(2)$ generators and its partial derivative $\partial_\ell \mathbf{X}$, to investigate control-enhanced quantum multiparameter estimation. Our work reveals that quantum control does not always improve the estimation precision; whether it does depends on the character of the $SU(2)$ dynamics with respect to the objective parameter. This character is quantified by the angle $α_\ell$ between $\mathbf{X}$ and $\partial_\ell \mathbf{X}$. For $SU(2)$ dynamics with $α_\ell=π/2$, the estimation precision benefits most from the controls. As $α_\ell$ approaches $0$ or $π$, the precision gain contributed by quantum control becomes correspondingly less pronounced. For dynamics with $α_\ell=0$ or $π$, quantum control completely loses its advantage. In addition, we find a set of conditions restricting the simultaneous optimal estimation of all the parameters; fortunately, these can be removed by using a maximally entangled two-qubit state as the probe state and adding an ancillary channel to the configuration. Lastly, a spin-$1/2$ system is taken as an example to verify the above conclusions. Our proposal exhibits the hallmark of control enhancement in multiparameter estimation, and it is applicable to an arbitrary $SU(2)$ parametrization process.
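The angle $α_\ell$ described in this abstract can be illustrated with a small numerical sketch. The parametrization below, X = B(cos θ, 0, sin θ), is a hypothetical textbook-style example chosen for illustration and is not taken from the paper; the helper name `angle_between` is likewise an assumption.

```python
import math

def angle_between(u, v):
    """Angle in radians between two 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_angle)

# Hypothetical coefficient vector X(B, theta) = B * (cos theta, 0, sin theta).
B, theta = 1.5, 0.7
X = (B * math.cos(theta), 0.0, B * math.sin(theta))

# Analytic partial derivatives of X with respect to each parameter.
dX_dB = (math.cos(theta), 0.0, math.sin(theta))
dX_dtheta = (-B * math.sin(theta), 0.0, B * math.cos(theta))

alpha_B = angle_between(X, dX_dB)          # derivative parallel to X
alpha_theta = angle_between(X, dX_dtheta)  # derivative perpendicular to X

print(f"alpha_B = {alpha_B:.6f}, alpha_theta = {alpha_theta:.6f}")
```

In this toy case the magnitude B gives an angle near 0 and the direction θ gives an angle near π/2, which, by the abstract's criterion, would correspond to the least and the most benefit from control respectively.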

preprint, 2022, arXiv

PI3NN: Out-of-distribution-aware prediction intervals from three neural networks

We propose a novel prediction interval (PI) method for uncertainty quantification, which addresses three major issues with state-of-the-art PI methods. First, existing PI methods require retraining of neural networks (NNs) for every given confidence level and suffer from the crossing issue in calculating multiple PIs. Second, they usually rely on customized loss functions with extra sensitive hyperparameters for which fine-tuning is required to achieve a well-calibrated PI. Third, they usually underestimate the uncertainties of out-of-distribution (OOD) samples, leading to over-confident PIs. Our PI3NN method calculates PIs from linear combinations of three NNs, each of which is independently trained using the standard mean squared error loss. The coefficients of the linear combinations are computed using root-finding algorithms to ensure tight PIs for a given confidence level. We theoretically prove that PI3NN can calculate PIs for a series of confidence levels without retraining NNs and that it completely avoids the crossing issue. Additionally, PI3NN does not introduce any unusual hyperparameters, resulting in stable performance. Furthermore, we address the OOD identification challenge by introducing an initialization scheme that provides reasonably larger PIs for OOD samples than for in-distribution samples. Benchmark and real-world experiments show that our method outperforms several state-of-the-art approaches with respect to predictive uncertainty quality, robustness, and OOD sample identification.
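The root-finding step mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three networks are replaced by fixed lists (`mean`, `upper_spread`, `lower_spread`), the interval is assumed to take the form [mean - beta * lower_spread, mean + alpha * upper_spread], and plain bisection stands in for whatever root-finder the paper uses. The name `fit_coefficient` is hypothetical.

```python
def fit_coefficient(residuals, spreads, target_count, lo=0.0, hi=1e3, iters=100):
    """Bisection for the smallest c such that at most target_count
    residuals exceed c * spread (assumed form of the PI bound)."""
    def excess(c):
        outside = sum(1 for r, s in zip(residuals, spreads) if r > c * s)
        return outside - target_count

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid   # too many points outside: widen the interval
        else:
            hi = mid   # enough points inside: try a tighter interval
    return hi

# Toy data standing in for network outputs (assumption for illustration).
y = [-5.0, -4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0, 5.0]
mean = [0.0] * len(y)
upper_spread = [1.0] * len(y)
lower_spread = [1.0] * len(y)

gamma = 0.8                                  # desired confidence level
tail = round(len(y) * (1 - gamma) / 2)       # points allowed outside per side

alpha = fit_coefficient([yi - m for yi, m in zip(y, mean)], upper_spread, tail)
beta = fit_coefficient([m - yi for yi, m in zip(y, mean)], lower_spread, tail)

inside = sum(
    1 for yi, m, u, l in zip(y, mean, upper_spread, lower_spread)
    if m - beta * l <= yi <= m + alpha * u
)
coverage = inside / len(y)
print(alpha, beta, coverage)
```

Because the coefficients are fitted after training, a different confidence level only requires rerunning the bisection with a new `gamma`, which is consistent with the abstract's point that no retraining is needed.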

preprint, 2022, arXiv

Retrieving High-Dimensional Quantum Steering From a Noisy Environment with N Measurement Settings

One of the most often implied benefits of high-dimensional (HD) quantum systems is that they lead to stronger forms of correlation, featuring increased robustness to noise. Here, we experimentally demonstrate the $n$-setting linear HD quantum steering criterion. We verify the large violation of the steering inequalities without full-state tomography. The lower bound of the violation is $2.24\pm0.01$ in 11 dimensions, exceeding the bound ($V<2$) of 2-setting criteria. Hence, a higher strength of steering has been revealed. Moreover, we demonstrate a method for enhancing the noise robustness without increasing the dimension, namely by increasing the number of measurement settings. Using entanglement in 11 dimensions, we experimentally retrieve steering nonlocality with a $63.4\pm1.4\%$ isotropic noise fraction, surpassing the $50\%$ limit of 2-setting criteria. Our work offers the potential for practical one-sided device-independent quantum information processing that tolerates noisy environments and lossy detection and transcends the present transmission distance limitation.

preprint, 2022, arXiv

Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules

Graph convolutional neural networks (GCNNs) are a popular class of deep learning (DL) models in materials science for predicting material properties from the graph representation of molecular structures. Training an accurate and comprehensive GCNN surrogate for molecular design requires large-scale graph datasets and is usually a time-consuming process. Recent advances in GPUs and distributed computing open a path to effectively reducing the computational cost of GCNN training. However, efficient utilization of high-performance computing (HPC) resources for training requires simultaneously optimizing large-scale data management and scalable stochastic batched optimization techniques. In this work, we focus on building GCNN models on HPC systems to predict material properties of millions of molecules. We use HydraGNN, our in-house library for large-scale GCNN training, leveraging distributed data parallelism in PyTorch. We use ADIOS, a high-performance data management framework, for efficient storage and reading of large molecular graph data. We perform parallel training on two open-source large-scale graph datasets to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap. We measure the scalability, accuracy, and convergence of our approach on two DOE supercomputers: the Summit supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) and the Perlmutter system at the National Energy Research Scientific Computing Center (NERSC). We present our experimental results with HydraGNN showing (i) a reduction in data loading time of up to 4.2 times compared with a conventional method and (ii) linear scaling of training performance up to 1,024 GPUs on both Summit and Perlmutter.

preprint, 2022, arXiv

Verification of Kochen-Specker-type quantum contextuality with a single photon

Contextuality provides one of the fundamental characterizations of quantum phenomena and can be used as a resource in many quantum information processing tasks. In this paper, we summarize and derive some equivalent noncontextual inequalities from different noncontextual models of the proofs of the Kochen-Specker theorem based on Greenberger-Horne-Zeilinger states. These noncontextual inequalities are equivalent up to some correlation terms that hold for both noncontextual hidden variable theories and quantum mechanics. Therefore, using single-photon hyperentangled Greenberger-Horne-Zeilinger states encoded in spin, path, and orbital angular momentum, we experimentally verify several state-dependent noncontextual models of the proofs of the Kochen-Specker theorem by testing an extremely simple Mermin-like inequality.

preprint, 2021, arXiv

PSA: A novel optimization algorithm based on survival rules of porcellio scaber

Bio-inspired algorithms such as neural network algorithms and genetic algorithms have received significant attention in both the academic and engineering communities. In this paper, based on the observation of two major survival rules of a species of woodlouse, Porcellio scaber, we present an algorithm called the porcellio scaber algorithm (PSA) for solving general unconstrained optimization problems, including differentiable and non-differentiable ones as well as cases with local optima. Numerical results based on benchmark problems are presented to validate the efficacy of PSA.

preprint, 2021, arXiv

Realization of a deterministic quantum Toffoli gate with a single photon

Quantum controlled-logic gates, including the controlled-NOT gate and the Toffoli gate, play critical roles in many quantum information processing schemes. We design and experimentally demonstrate a deterministic Toffoli gate by utilizing the orbital-angular-momentum and polarization degrees of freedom of a single photon. In addition, we generate Bell states by using the controlled-NOT gate. The effective conversion rate of the Toffoli gate in our experiment is $(95.1\pm3.2)\%$. Furthermore, our experimental setup does not require any auxiliary photons or probabilistic post-selection.

preprint, 2020, arXiv

Extending Label Smoothing Regularization with Self-Knowledge Distillation

Inspired by the strong correlation between Label Smoothing Regularization (LSR) and Knowledge Distillation (KD), we propose an algorithm, LsrKD, that boosts training by extending the LSR method to the KD regime and applying a softer temperature. We then improve LsrKD with a Teacher Correction (TC) method, which manually sets a constant, larger proportion for the right class in the uniform-distribution teacher. To further improve the performance of LsrKD, we develop a self-distillation method named Memory-replay Knowledge Distillation (MrKD) that provides a knowledgeable teacher to replace the uniform-distribution one in LsrKD. The MrKD method penalizes the KD loss between the current model's output distributions and those of its copies on the training trajectory. By preventing the model from straying too far from its historical output distribution space, MrKD can stabilize learning and find a more robust minimum. Our experiments show that LsrKD can improve LSR performance consistently at no cost, especially on several deep neural networks where LSR is ineffectual. Also, MrKD can significantly improve single-model training. The experimental results confirm that TC can help LsrKD and MrKD boost training, especially on the networks where they fail. Overall, LsrKD, MrKD, and their TC variants are comparable to or outperform the LSR method, suggesting the broad applicability of these KD methods.
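The link between LSR and KD that motivates this work can be checked numerically: cross-entropy against a smoothed label equals a weighted sum of the usual hard-label loss and a distillation-style loss against a uniform "teacher". The sketch below shows only that standard identity; the logits, the smoothing weight `eps`, and the helper names are illustrative assumptions, and the temperature and Teacher Correction extensions described in the abstract are not implemented here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, probs):
    """Cross-entropy H(target, probs) for two discrete distributions."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

logits = [2.0, 0.5, -1.0]   # hypothetical model outputs for one sample
label = 0                   # index of the correct class
eps = 0.1                   # label smoothing weight
k = len(logits)

probs = softmax(logits)
one_hot = [1.0 if i == label else 0.0 for i in range(k)]
uniform = [1.0 / k] * k     # the uniform-distribution "teacher"

# Label smoothing: mix the one-hot label with the uniform distribution.
smoothed = [(1 - eps) * h + eps * u for h, u in zip(one_hot, uniform)]
lsr_loss = cross_entropy(smoothed, probs)

# Equivalent view: hard-label loss plus a KD-style term with a uniform teacher.
mixture_loss = (1 - eps) * cross_entropy(one_hot, probs) + eps * cross_entropy(uniform, probs)

print(lsr_loss, mixture_loss)
```

The two printed values agree up to floating-point error because cross-entropy is linear in its target distribution.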

preprint, 2020, arXiv

LaNet: Real-time Lane Identification by Learning Road Surface Characteristics from Accelerometer Data

The resolution of GPS measurements, especially in urban areas, is insufficient for identifying a vehicle's lane. In this work, we develop a deep LSTM neural network model, LaNet, that determines which lane a vehicle is in by periodically classifying accelerometer samples collected by vehicles as they drive in real time. Our key finding is that even adjacent patches of road surface contain characteristics that are sufficiently unique to differentiate between lanes, i.e., roads inherently exhibit differing bumps, cracks, potholes, and surface unevenness. Cars can capture this road surface information as they drive using inexpensive, easy-to-install accelerometers that increasingly come fitted in cars and can be accessed via the CAN bus. We collect an aggregate of 60 km of driving data and synthesize more based on it to capture factors such as variable driving speed, vehicle suspensions, and accelerometer noise. Our LSTM-based deep learning model, LaNet, learns lane-specific sequences of road surface events (bumps, cracks, etc.) and yields 100% lane classification accuracy with 200 meters of driving data, achieving over 90% with just 100 m (corresponding to roughly one minute of driving). We design the LaNet model to be practical for use in real-time lane classification and show with extensive experiments that LaNet yields high classification accuracy even on smooth roads, on large multi-lane roads, and on drives with frequent lane changes. Since different road surfaces have different inherent characteristics or entropy, we probe our neural network model and discover a mechanism to easily characterize the achievable classification accuracies on a road over various driving distances by training the model just once. We present LaNet as a low-cost, easily deployable, and highly accurate way to achieve fine-grained lane identification.

preprint, 2020, arXiv

Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation

Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence. In this paper, we propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and the surrounding sentences of a source sentence. By forcing the NMT model to predict source context, we want the model to learn "contextualized" source sentence representations that capture document-level dependencies on the source side. We further propose two different methods to learn and integrate such contextualized sentence embeddings into NMT: a joint training method that jointly trains an NMT model with the source context prediction model, and a pre-training & fine-tuning method that pretrains the source context prediction model on a large-scale monolingual document corpus and then fine-tunes it with the NMT model. Experiments on Chinese-English and English-German translation show that both methods can substantially improve the translation quality over a strong document-level Transformer baseline.

preprint, 2020, arXiv

Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation

Many document-level neural machine translation (NMT) systems have explored the utility of context-aware architectures, usually requiring an increasing number of parameters and greater computational complexity. However, little attention has been paid to the baseline model. In this paper, we extensively examine the pros and cons of the standard transformer in document-level translation, and find that the auto-regressive property brings both the advantage of consistency and the disadvantage of error accumulation. Therefore, we propose a surprisingly simple long-short term masking self-attention on top of the standard transformer to both effectively capture long-range dependencies and reduce the propagation of errors. We evaluate our approach on two publicly available document-level datasets. It achieves strong BLEU results and captures discourse phenomena.

preprint, 2020, arXiv

O-MedAL: Online Active Deep Learning for Medical Image Analysis

Active Learning methods create an optimized labeled training set from unlabeled data. We introduce a novel Online Active Deep Learning method for Medical Image Analysis. We extend our MedAL active learning framework to present new results in this paper. Our novel sampling method queries the unlabeled examples that maximize the average distance to all training set examples. Our online method enhances performance of its underlying baseline deep network. These novelties contribute significant performance improvements, including improving the model's underlying deep network accuracy by 6.30%, using only 25% of the labeled dataset to achieve baseline accuracy, reducing backpropagated images during training by as much as 67%, and demonstrating robustness to class imbalance in binary and multi-class tasks.
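The sampling rule stated in this abstract (query the unlabeled examples with the largest average distance to the training set) can be sketched in a few lines. This is an illustration only: the abstract does not specify the distance metric or the feature space, so plain Euclidean distance on raw feature tuples is assumed here, and the function names are hypothetical.

```python
import math

def average_distance(x, train):
    """Mean Euclidean distance from point x to every training example."""
    return sum(math.dist(x, t) for t in train) / len(train)

def select_queries(unlabeled, train, k):
    """Indices of the k unlabeled points farthest, on average, from the training set."""
    ranked = sorted(
        range(len(unlabeled)),
        key=lambda i: average_distance(unlabeled[i], train),
        reverse=True,
    )
    return ranked[:k]

# Toy 2-D features standing in for image embeddings (assumption for illustration).
train = [(0.0, 0.0), (1.0, 0.0)]
unlabeled = [(0.5, 0.0), (5.0, 5.0), (0.0, 1.0)]

picked = select_queries(unlabeled, train, k=1)
print(picked)
```

Here the point (5.0, 5.0) is selected because it lies far from both labeled examples; after labeling, it would be added to `train` and the ranking recomputed on the remaining pool.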