Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
18 works
0 followers
15 topics
4 close collaborators

Published work

18 published item(s)

preprint, 2026, arXiv

Breaking Spatial Uniformity: Prior-Guided Mamba with Radial Serialization for Lens Flare Removal

Lens flares, caused by complex optical aberrations, severely degrade image quality, especially in nighttime photography. Although recent restoration methods have made remarkable progress, most still rely on spatially uniform processing and therefore fail to handle the region-dependent restoration demands of flare scenes, where saturated light sources should be preserved, flare artifacts removed, and background details recovered. To address this challenge, we propose DeflareMambav2, a prior-guided Mamba framework for lens flare removal. Specifically, we introduce a Flare Prior Network (FPN) to estimate flare priors and guide adaptive restoration. In addition, a novel radial serialization strategy breaks spatially homogeneous processing by performing flare-aware targeted sampling, and better supports long-range modeling in State Space Models (SSMs). Based on these priors, the backbone adopts a dual-level adaptive scheme: it explicitly preserves light-source regions to avoid over-processing, and applies curriculum-based restoration to the remaining contaminated areas while calibrating restoration intensity at the pixel level. Extensive experiments demonstrate that DeflareMambav2 achieves state-of-the-art performance with a reduced parameter burden. Code is available at https://github.com/BNU-ERC-ITEA/DeflareMambav2.

preprint, 2026, arXiv

HiGP: A high-performance Python package for Gaussian Process

Gaussian Processes (GPs) are flexible, nonparametric Bayesian models widely used for regression and classification because of their ability to capture complex data patterns and quantify predictive uncertainty. However, the O(n^3) computational cost of kernel matrix operations poses a major obstacle to applying GPs at scale. HiGP is a high-performance Python package designed to overcome these scalability limitations through advanced numerical linear algebra and hierarchical kernel representations. It integrates H^2 matrices to achieve near-linear complexity in both storage and computation for spatial datasets, supports on-the-fly kernel evaluation to avoid explicit storage in large-scale problems, and incorporates a robust Adaptive Factorized Nyström (AFN) preconditioner that accelerates convergence of iterative solvers across a broad range of kernel spectra. These computational kernels are implemented in C++ for maximum performance and exposed through Python interfaces, enabling seamless integration with modern machine learning workflows. HiGP also includes analytically derived gradient computations for efficient hyperparameter optimization, avoiding the inefficiencies of automatic differentiation in iterative solvers. By serving as a reusable numerical engine, HiGP complements existing GP frameworks such as GPJax, KeOps, and GaussianProcesses.jl, providing a reliable and scalable computational backbone for large-scale Gaussian Process regression and classification.
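
To make the O(n^3) bottleneck mentioned in this abstract concrete, here is a minimal sketch of exact GP regression with a dense Cholesky factorization. It is not HiGP's API or algorithm: the function names (`rbf_kernel`, `cholesky`, `solve_cholesky`, `gp_posterior_mean`), the squared-exponential kernel, and the scalar inputs are all assumptions chosen for illustration. The triple-nested work inside `cholesky` is the cubic cost that hierarchical representations and preconditioned iterative solvers aim to avoid.

```python
import math

def rbf_kernel(x, y, length_scale=1.0):
    """Squared-exponential kernel for scalar inputs (illustrative choice)."""
    return math.exp(-0.5 * ((x - y) / length_scale) ** 2)

def cholesky(a):
    """Dense Cholesky factorization A = L L^T.

    The loops over i, j and the inner sum over k give the O(n^3) cost.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l

def solve_cholesky(l, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x

def gp_posterior_mean(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean of a zero-mean GP evaluated at x_test."""
    n = len(x_train)
    # Kernel matrix with a small noise term on the diagonal.
    k = [[rbf_kernel(x_train[i], x_train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve_cholesky(cholesky(k), y_train)
    return [sum(rbf_kernel(xs, x_train[i]) * alpha[i] for i in range(n))
            for xs in x_test]
```

With a very small noise term, the posterior mean evaluated at the training inputs should nearly reproduce the training targets, which is a quick sanity check for the solver.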

preprint, 2026, arXiv

Learning Physics-Informed Noise Models from Dark Frames for Low-Light Raw Image Denoising

Recently, the mainstream practice for training low-light raw image denoising methods has shifted towards employing synthetic data. Noise modeling, which focuses on characterizing the noise distribution of real-world sensors, profoundly influences the effectiveness and practicality of synthetic data. Currently, physics-based noise modeling struggles to characterize the entire real noise distribution, while learning-based noise modeling impractically depends on paired real data. In this paper, we propose a novel strategy: learning the noise model from dark frames instead of paired real data, to break down the data dependency. Based on this strategy, we introduce an efficient physics-informed noise neural proxy (PNNP) to approximate the real-world sensor noise model. Specifically, we integrate physical priors into neural proxies and introduce three efficient techniques: physics-guided noise decoupling (PND), physics-aware proxy model (PPM), and differentiable distribution loss (DDL). PND decouples the dark frame into different components and handles different levels of noise flexibly, which reduces the complexity of noise modeling. PPM incorporates physical priors to constrain the synthetic noise, which promotes the accuracy of noise modeling. DDL provides explicit and reliable supervision for noise distribution, which promotes the precision of noise modeling. PNNP exhibits powerful potential in characterizing the real noise distribution. Extensive experiments on public datasets demonstrate superior performance in practical low-light raw image denoising. The source code will be publicly available at the project homepage.

preprint, 2026, arXiv

SketchJudge: A Diagnostic Benchmark for Grading Hand-drawn Diagrams with Multimodal Large Language Models

While Multimodal Large Language Models (MLLMs) have achieved remarkable progress in visual understanding, they often struggle when faced with the unstructured and ambiguous nature of human-generated sketches. This limitation is particularly pronounced in the underexplored task of visual grading, where models should not only solve a problem but also diagnose errors in hand-drawn diagrams. Such diagnostic capabilities depend on complex structural, semantic, and metacognitive reasoning. To bridge this gap, we introduce SketchJudge, a novel benchmark tailored for evaluating MLLMs as graders of hand-drawn STEM diagrams. SketchJudge encompasses 1,015 hand-drawn student responses across four domains: geometry, physics, charts, and flowcharts, featuring diverse stylistic variations and distinct error types. Evaluations on SketchJudge demonstrate that even advanced MLLMs lag significantly behind humans, validating the benchmark's effectiveness in exposing the fragility of current vision-language alignment in symbolic and noisy contexts. All data, code, and evaluation scripts are publicly available at https://github.com/yuhangsu82/SketchJudge.

preprint, 2025, arXiv

Problem-Solving Logic Guided Curriculum In-Context Learning for LLMs Complex Reasoning

In-context learning (ICL) can significantly enhance the complex reasoning capabilities of large language models (LLMs), with the key lying in the selection and ordering of demonstration examples. Previous methods typically relied on simple features to measure the relevance between examples. We argue that these features are not sufficient to reflect the intrinsic connections between examples. In this study, we propose a curriculum ICL strategy guided by problem-solving logic. We select demonstration examples by analyzing the problem-solving logic and order them based on curriculum learning. Specifically, we constructed a problem-solving logic instruction set based on the BREAK dataset and fine-tuned a language model to analyze the problem-solving logic of examples. Subsequently, we selected appropriate demonstration examples based on problem-solving logic and assessed their difficulty according to the number of problem-solving steps. In accordance with the principles of curriculum learning, we ordered the examples from easy to hard to serve as contextual prompts. Experimental results on multiple benchmarks indicate that our method outperforms previous ICL approaches in terms of performance and efficiency, effectively enhancing the complex reasoning capabilities of LLMs. Our project will be released at https://github.com/maxuetao/CurriculumICL
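
The ordering step described in this abstract (difficulty measured by the number of problem-solving steps, demonstrations arranged from easy to hard) can be sketched as follows. This is only an illustration, not the authors' implementation: the `Demo` class, its `steps` field, and the prompt layout are hypothetical, and the selection of demonstrations by problem-solving logic is not covered because the abstract does not specify it.

```python
from dataclasses import dataclass, field

@dataclass
class Demo:
    """A demonstration example (hypothetical structure for illustration)."""
    question: str
    answer: str
    steps: list = field(default_factory=list)  # decomposed solving steps

def order_easy_to_hard(demos):
    """Sort demonstrations by step count; sorted() is stable, so ties keep input order."""
    return sorted(demos, key=lambda d: len(d.steps))

def build_prompt(demos, query):
    """Concatenate ordered demonstrations, then append the unanswered query."""
    parts = [f"Q: {d.question}\nA: {d.answer}" for d in order_easy_to_hard(demos)]
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)
```

Placing the hardest demonstration last puts it closest to the query in the prompt, which is the curriculum ordering the abstract describes.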

preprint, 2022, arXiv

Data-driven Construction of Hierarchical Matrices with Nested Bases

Hierarchical matrices provide a powerful representation for significantly reducing the computational complexity associated with dense kernel matrices. For general kernel functions, interpolation-based methods are widely used for the efficient construction of hierarchical matrices. In this paper, we present a fast hierarchical data reduction (HiDR) procedure with $O(n)$ complexity for the memory-efficient construction of hierarchical matrices with nested bases where $n$ is the number of data points. HiDR aims to reduce the given data in a hierarchical way so as to obtain $O(1)$ representations for all nearfield and farfield interactions. Based on HiDR, a linear complexity $\mathcal{H}^2$ matrix construction algorithm is proposed. The use of data-driven methods enables better efficiency than other general-purpose methods and flexible computation without accessing the kernel function. Experiments demonstrate significantly improved memory efficiency of the proposed data-driven method compared to interpolation-based methods over a wide range of kernels. Though the method is not optimized for any special kernel, benchmark experiments for the Coulomb kernel show that the proposed general-purpose algorithm offers competitive performance for hierarchical matrix construction compared to several state-of-the-art algorithms for the Coulomb kernel.

preprint, 2022, arXiv

Learnability Enhancement for Low-light Raw Denoising: Where Paired Real Data Meets Noise Modeling

Low-light raw denoising is an important and valuable task in computational photography where learning-based methods trained with paired real data are mainstream. However, the limited data volume and complicated noise distribution have constituted a learnability bottleneck for paired real data, which limits the denoising performance of learning-based methods. To address this issue, we present a learnability enhancement strategy to reform paired real data according to noise modeling. Our strategy consists of two efficient techniques: shot noise augmentation (SNA) and dark shading correction (DSC). Through noise model decoupling, SNA improves the precision of data mapping by increasing the data volume and DSC reduces the complexity of data mapping by reducing the noise complexity. Extensive results on the public datasets and real imaging scenarios collectively demonstrate the state-of-the-art performance of our method. Our code is available at: https://github.com/megvii-research/PMN.

preprint, 2022, arXiv

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution with a focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of $\times$4 based on pairs of low and corresponding high resolution images. The aim was to design a network for single image super-resolution that improved efficiency, measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption, while at least maintaining a PSNR of 29.00dB on the DIV2K validation set. IMDN was set as the baseline for efficiency measurement. The challenge had 3 tracks: the main track (runtime), sub-track one (model complexity), and sub-track two (overall performance). In the main track, the practical runtime performance of the submissions was evaluated. The ranking of the teams was determined directly by the absolute value of the average runtime on the validation set and test set. In sub-track one, the number of parameters and FLOPs were considered, and the individual rankings on the two metrics were summed to determine a final ranking in this track. In sub-track two, all five metrics mentioned in the description of the challenge (runtime, parameter count, FLOPs, activations, and memory consumption) were considered. As in sub-track one, the rankings on the five metrics were summed to determine a final ranking. The challenge had 303 registered participants, and 43 teams made valid submissions. The submissions gauge the state of the art in efficient single image super-resolution.
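
The summed-ranking rule used in the two sub-tracks can be illustrated with a short sketch. The function names, team labels, and numbers below are made up for illustration, and the tie handling (ties within a metric broken by input order, ties in the total broken by team name) is an assumption, since the abstract does not say how ties were resolved.

```python
def rank_by_metric(values):
    """Map team -> rank for one metric (1 = best; lower value is better)."""
    ordered = sorted(values, key=values.get)
    return {team: position + 1 for position, team in enumerate(ordered)}

def summed_ranking(metrics):
    """Sum per-metric ranks and order teams by the total.

    metrics: {metric_name: {team: value}}
    Returns (teams ordered best to worst, {team: summed rank}).
    """
    totals = {}
    for values in metrics.values():
        for team, rank in rank_by_metric(values).items():
            totals[team] = totals.get(team, 0) + rank
    ordered = sorted(totals, key=lambda team: (totals[team], team))
    return ordered, totals

# Hypothetical sub-track-one style example with two metrics.
example = {
    "parameters": {"A": 100, "B": 200, "C": 150},
    "flops": {"A": 10, "B": 30, "C": 20},
}
order, totals = summed_ranking(example)
```

Because only ranks are summed, the size of the gap between two teams on any single metric does not affect the final ordering.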

preprint, 2022, arXiv

Revisiting Competitive Coding Approach for Palmprint Recognition: A Linear Discriminant Analysis Perspective

The Competitive Coding approach (CompCode) is one of the most promising methods for palmprint recognition. Due to its high performance and simple formulation, it has been continuously studied for many years. However, although numerous variations of CompCode have been proposed, a detailed analysis of the method is still absent. In this paper, we provide a detailed analysis of CompCode from the perspective of linear discriminant analysis (LDA) for the first time. A non-trivial sufficient condition under which CompCode is optimal in the sense of Fisher's criterion is presented. Based on our analysis, we examined the statistics of palmprints and concluded that CompCode deviates from the optimal condition. To mitigate the deviation, we propose a new method called Class-Specific CompCode that improves CompCode by excluding non-palm-line areas from matching. A nonlinear mapping of the competitive code is also applied in this method to further enhance accuracy. Experiments on two public databases demonstrate the effectiveness of the proposed method.

preprint, 2022, arXiv

SIND: A Drone Dataset at Signalized Intersection in China

Intersections are among the most challenging scenarios for autonomous driving tasks. Due to their complexity and stochasticity, essential applications (e.g., behavior modeling, motion prediction, and safety validation) at intersections rely heavily on data-driven techniques. Thus, there is an intense demand for trajectory datasets of traffic participants (TPs) in intersections. Currently, most intersections in urban areas are equipped with traffic lights. However, there is not yet a large-scale, high-quality, publicly available trajectory dataset for signalized intersections. Therefore, in this paper, a typical two-phase signalized intersection is selected in Tianjin, China. A pipeline is then designed to construct a Signalized INtersection Dataset (SIND), which contains 7 hours of recording including over 13,000 TPs of 7 types. The traffic light violations in SIND are recorded, and SIND is compared with other similar works. The features of SIND can be summarized as follows: 1) SIND provides more comprehensive information, including traffic light states, motion parameters, and a High Definition (HD) map. 2) The categories of TPs are diverse and characteristic, with the proportion of vulnerable road users (VRUs) reaching 62.6%. 3) Multiple traffic light violations by non-motor vehicles are shown. We believe that SIND would be an effective supplement to existing datasets and can promote related research on autonomous driving. The dataset is available online via: https://github.com/SOTIF-AVLab/SinD

preprint, 2021, arXiv

Efficient construction of an HSS preconditioner for symmetric positive definite $\mathcal{H}^2$ matrices

In an iterative approach for solving linear systems with ill-conditioned, symmetric positive definite (SPD) kernel matrices, both fast matrix-vector products and fast preconditioning operations are required. Fast (linear-scaling) matrix-vector products are available by expressing the kernel matrix in an $\mathcal{H}^2$ representation or an equivalent fast multipole method representation. Preconditioning such matrices, however, requires a structured matrix approximation that is more regular than the $\mathcal{H}^2$ representation, such as the hierarchically semiseparable (HSS) matrix representation, which provides fast solve operations. Previously, an algorithm was presented to construct an HSS approximation to an SPD kernel matrix that is guaranteed to be SPD. However, this algorithm has quadratic cost and was only designed for recursive binary partitionings of the points defining the kernel matrix. This paper presents a general algorithm for constructing an SPD HSS approximation. Importantly, the algorithm uses the $\mathcal{H}^2$ representation of the SPD matrix to reduce its computational complexity from quadratic to quasilinear. Numerical experiments illustrate how this SPD HSS approximation performs as a preconditioner for solving linear systems arising from a range of kernel functions.

preprint, 2021, arXiv

Snapshot Hyperspectral Imaging Based on Weighted High-order Singular Value Regularization

Snapshot hyperspectral imaging can capture the 3D hyperspectral image (HSI) with a single 2D measurement and has attracted increasing attention recently. Recovering the underlying HSI from the compressive measurement is an ill-posed problem, and exploiting the image prior is essential for solving it. However, existing reconstruction methods typically start from modeling the image prior with a 1D vector or 2D matrix and cannot fully exploit the structural spectral-spatial nature of the 3D HSI, thus leading to poor fidelity. In this paper, we propose an effective high-order tensor optimization based method to boost the reconstruction fidelity for snapshot hyperspectral imaging. We first build high-order tensors by exploiting the spatial-spectral correlation in HSI. Then, we propose a weighted high-order singular value regularization (WHOSVR) based low-rank tensor recovery model to characterize the structure prior of HSI. By integrating the structure prior in WHOSVR with the system imaging process, we develop an optimization framework for HSI reconstruction, which is finally solved via the alternating minimization algorithm. Extensive experiments implemented on two representative systems demonstrate that our method outperforms state-of-the-art methods.

preprint, 2021, arXiv

Transitional Learning: Exploring the Transition States of Degradation for Blind Super-resolution

Because they depend heavily on iterative estimation of the degradation prior or on optimizing the model from scratch, existing blind super-resolution (SR) methods are generally time-consuming and less effective, as the estimation of degradation proceeds from a blind initialization and lacks interpretable degradation priors. To address this, this paper proposes a transitional learning method for blind SR using an end-to-end network without any additional iterations in inference, and explores an effective representation for unknown degradation. To begin with, we analyze and demonstrate the transitionality of degradations as interpretable prior information to indirectly infer the unknown degradation model, including the widely used additive and convolutive degradations. We then propose a novel Transitional Learning method for blind Super-Resolution (TLSR), which adaptively infers a transitional transformation function to solve the unknown degradations without any iterative operations in inference. Specifically, the end-to-end TLSR network consists of a degree of transitionality (DoT) estimation network, a homogeneous feature extraction network, and a transitional learning module. Quantitative and qualitative evaluations on blind SR tasks demonstrate that the proposed TLSR achieves superior performance at lower complexity than state-of-the-art blind SR methods. The code is available at github.com/YuanfeiHuang/TLSR.

preprint, 2020, arXiv

3D Quasi-Recurrent Neural Network for Hyperspectral Image Denoising

In this paper, we propose an alternating directional 3D quasi-recurrent neural network for hyperspectral image (HSI) denoising, which can effectively embed domain knowledge, namely structural spatio-spectral correlation and global correlation along the spectrum. Specifically, 3D convolution is utilized to extract structural spatio-spectral correlation in an HSI, while a quasi-recurrent pooling function is employed to capture the global correlation along the spectrum. Moreover, an alternating directional structure is introduced to eliminate the causal dependency with no additional computation cost. The proposed model is capable of modeling spatio-spectral dependency while preserving flexibility towards HSIs with an arbitrary number of bands. Extensive experiments on HSI denoising demonstrate significant improvement over state-of-the-art methods under various noise settings, in terms of both restoration accuracy and computation time. Our code is available at https://github.com/Vandermode/QRNN3D.

preprint, 2020, arXiv

A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising

Lacking rich and realistic data, learned single image denoising algorithms generalize poorly to real raw images that do not resemble the data used for training. Although the problem can be alleviated by the heteroscedastic Gaussian model for noise synthesis, the noise sources caused by digital camera electronics are still largely overlooked, despite their significant effect on raw measurement, especially under extremely low-light conditions. To address this issue, we present a highly accurate noise formation model based on the characteristics of CMOS photosensors, thereby enabling us to synthesize realistic samples that better match the physics of the image formation process. Given the proposed noise model, we additionally propose a method to calibrate the noise parameters for available modern digital cameras, which is simple and reproducible for any new device. We systematically study the generalizability of a neural network trained with existing schemes, by introducing a new low-light denoising dataset that covers many modern digital cameras from diverse brands. Extensive empirical results collectively show that by utilizing our proposed noise formation model, a network can reach the capability as if it had been trained with rich real data, which demonstrates the effectiveness of our noise formation model.
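
For context, the heteroscedastic Gaussian model that this abstract names as the common baseline can be sketched in a few lines. This shows the baseline only, not the paper's proposed noise formation model. The function name and the parameter names `a` (signal-dependent gain) and `b` (signal-independent variance) are assumptions for illustration: each clean value x receives zero-mean Gaussian noise with variance a*x + b.

```python
import random

def add_heteroscedastic_gaussian_noise(clean, a, b, rng=None):
    """Return clean values with signal-dependent Gaussian noise added.

    Noise variance for a clean value x is a * x + b (clamped at zero),
    so brighter pixels receive noisier samples.
    """
    rng = rng if rng is not None else random.Random()
    noisy = []
    for x in clean:
        variance = max(a * x + b, 0.0)
        noisy.append(x + rng.gauss(0.0, variance ** 0.5))
    return noisy

# Hypothetical usage: a flat patch of intensity 0.5 with a fixed seed.
patch = add_heteroscedastic_gaussian_noise([0.5] * 4, a=0.01, b=0.0001,
                                           rng=random.Random(0))
```

The signal-dependent term stands in for shot noise and the constant term for read noise; the abstract's point is that this two-parameter form overlooks other noise sources introduced by camera electronics.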

preprint, 2020, arXiv

Acoustic Echo Cancellation by Combining Adaptive Digital Filter and Recurrent Neural Network

Acoustic Echo Cancellation (AEC) plays a key role in voice interaction. Owing to their explicit mathematical principles and their ability to adapt to changing conditions, adaptive filters in various implementations are commonly used for AEC and deliver considerable performance. However, some residual echo remains in the results, including linear residue introduced by mismatch between the estimate and reality, and non-linear residue mostly caused by non-linear components in the audio devices. The linear residue can be reduced with elaborate structures and methods, but the non-linear residue remains intractable. Although some non-linear processing methods have already been proposed, they are complicated and inefficient at suppression, and they damage the speech audio. In this paper, a fusion scheme combining an adaptive filter and a neural network is proposed for AEC. Adaptive filtering removes most of the echo, leaving little residual echo. Although this residue is much smaller than the speech audio, it can still be perceived by the human ear and makes communication annoying. The neural network is carefully designed and trained to suppress such residual echo. Experiments comparing the scheme with prevailing methods validate the effectiveness and superiority of the proposed combination scheme.

preprint, 2020, arXiv

Graph Computing based Distributed State Estimation with PMUs

Power system state estimation plays a fundamental and critical role in the energy management system (EMS). To achieve high-performance and accurate system state estimation, a graph computing based distributed state estimation approach is proposed in this paper. First, a power system network is divided into multiple areas. Reference buses are selected for each area, with PMUs installed at these buses. Then, the system network is converted into multiple independent areas. In this way, power system state estimation can be conducted in parallel for each area, and the estimated system states are obtained without compromising accuracy. The IEEE 118-bus system and the MP 10790-bus system are employed to verify the accuracy of the results and to demonstrate the promising computational performance.

preprint, 2020, arXiv

SPARC: Simulation Package for Ab-initio Real-space Calculations

We present SPARC: Simulation Package for Ab-initio Real-space Calculations. SPARC can perform Kohn-Sham density functional theory calculations for isolated systems such as molecules as well as extended systems such as crystals and surfaces, in both static and dynamic settings. It is straightforward to install/use and highly competitive with state-of-the-art planewave codes, demonstrating comparable performance on a small number of processors and increasing advantages as the number of processors grows. Notably, SPARC brings solution times down to a few seconds for systems with $\mathcal{O}(100-500)$ atoms on large-scale parallel computers, outperforming planewave counterparts by an order of magnitude and more.