Source author record

Hongbin Wang

Hongbin Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Computer Vision; Artificial Intelligence; eess.IV; Machine Learning; Multimedia; Computation and Language; Computational Engineering, Finance, and Science; eess.AS; eess.SP; Hardware Architecture; hep-ph; hep-th; math.CA; math.DS; physics.ao-ph; Sound

Catalog footprint

What is connected

14 works

16 topics

4 close collaborators


preprint, 2026, arXiv

MambaRain: Multi-Scale Mamba-Attention Framework for 0-3 Hour Precipitation Nowcasting

Accurate precipitation nowcasting over extended horizons (0-3 hours) is essential for disaster mitigation and operational decision-making, yet remains a critical challenge in the field. Existing deterministic approaches are predominantly constrained to shorter prediction windows (0-2 hours), exhibiting severe performance degradation beyond 90 minutes owing to their inherent difficulty in capturing long-range spatiotemporal dependencies from radar-derived observations. To address these fundamental limitations, we propose MambaRain, a novel multi-scale encoder-decoder architecture that synergistically integrates Mamba's linear-complexity long-range temporal modeling with self-attention mechanisms for explicit spatial correlation capture. The core innovation lies in a hybrid design paradigm wherein Mamba blocks leverage selective state space mechanisms to model global temporal dynamics across extended sequences with computational efficiency, while self-attention modules explicitly characterize spatial correlations within precipitation fields - a capability inherently absent in Mamba's sequential processing paradigm. This complementary synergy enables comprehensive spatiotemporal representation learning, effectively extending the viable forecasting horizon to 2-3 hours with substantial accuracy improvements. Furthermore, we introduce a spectral loss formulation to mitigate blurring artifacts characteristic of chaotic precipitation systems, thereby preserving fine-scale motion details critical for nowcasting accuracy. Experimental validation demonstrates that MambaRain substantially outperforms existing deterministic methodologies in 0-3 hour nowcasting tasks, with particularly pronounced performance gains in the challenging 2-3 hour prediction range.

preprint, 2026, arXiv

PADE: A Predictor-Free Sparse Attention Accelerator via Unified Execution and Stage Fusion

Attention-based models have revolutionized AI, but the quadratic cost of self-attention incurs severe computational and memory overhead. Sparse attention methods alleviate this by skipping low-relevance token pairs. However, current approaches lack practicality due to the heavy expense of an added sparsity predictor, which severely reduces their hardware efficiency. This paper advances the state-of-the-art (SOTA) by proposing a bit-serial enable stage-fusion (BSF) mechanism, which eliminates the need for a separate predictor. However, it faces key challenges: 1) inaccurate bit-sliced sparsity speculation leads to incorrect pruning; 2) hardware under-utilization due to fine-grained and imbalanced bit-level workloads; 3) tiling difficulty caused by the row-wise dependency in sparsity pruning criteria. We propose PADE, a predictor-free algorithm-hardware co-design for dynamic sparse attention acceleration. PADE features three key innovations: 1) a bit-wise uncertainty interval-enabled guard filtering (BUI-GF) strategy to accurately identify trivial tokens during each bit round; 2) bidirectional sparsity-based out-of-order execution (BS-OOE) to improve hardware utilization; 3) interleaving-based sparsity-tiled attention (ISTA) to reduce both I/O and computational complexity. These techniques, combined with custom accelerator designs, enable practical sparsity acceleration without relying on an added sparsity predictor. Extensive experiments on 22 benchmarks show that PADE achieves a 7.43x speedup and 31.1x higher energy efficiency than an Nvidia H100 GPU. Compared to SOTA accelerators, PADE achieves 5.1x, 4.3x and 3.4x energy savings over Sanger, DOTA and SOFA, respectively.

preprint, 2026, arXiv

VMU-Diff: A Coarse-to-fine Multi-source Data Fusion Framework for Precipitation Nowcasting

Precipitation nowcasting is a vital spatio-temporal prediction task for meteorological applications but faces challenges due to the chaotic nature of precipitation systems. Existing methods predominantly rely on single-source radar data to build either deterministic or probabilistic models for extrapolation. However, a single deterministic model suffers from blurring due to MSE convergence. A single probabilistic model, typically represented by diffusion models, can generate fine details but suffers from spurious artifacts that compromise accuracy, and from computational inefficiency. To address these challenges, this paper proposes a novel coarse-to-fine precipitation nowcasting framework based on a Vision Mamba Unet and residual Diffusion (VMU-Diff). It realizes precipitation nowcasting through a two-stage process, i.e., a deterministic model-based coarse stage to predict global motion trends and a probabilistic model-based fine stage to generate fine prediction details. In the coarse prediction stage, rather than single-source radar data, both radar and multi-band satellite data are taken as input. A spatial-temporal attention block and several Vision Mamba state-space blocks realize multi-source data fusion and predict the global dynamics of future echoes. The fine-grained stage is realized by a spatio-temporal refinement generator based on residual conditional diffusion models. It first obtains spatio-temporal residual features based on the coarse prediction and the ground truth, and then reconstructs the residual via a conditional Mamba state-space module. Experiments on Jiangsu SWAN datasets demonstrate the improvements of our method over state-of-the-art methods, particularly in short-term forecasts.

preprint, 2022, arXiv

Bilateral Network with Channel Splitting Network and Transformer for Thermal Image Super-Resolution

In recent years, the Thermal Image Super-Resolution (TISR) problem has become an attractive research topic. TISR can be used in a wide range of fields, including military, medical, agricultural and animal ecology. Following the success of the PBVS-2020 and PBVS-2021 workshop challenges, TISR results keep improving and attract more researchers to sign up for the PBVS-2022 challenge. In this paper, we introduce the technical details of our submission to the PBVS-2022 challenge, for which we designed a Bilateral Network with Channel Splitting Network and Transformer (BN-CSNT) to tackle the TISR problem. First, we designed a context branch based on a channel splitting network with a transformer to obtain sufficient context information. Second, we designed a spatial branch with a shallow transformer to extract low-level features that preserve the spatial information. Finally, to fuse the features from the channel splitting network and the transformer in the context branch, we propose an attention refinement module; the features from the context branch and the spatial branch are then fused by the proposed feature fusion module. The proposed method achieves PSNR=33.64, SSIM=0.9263 for x4 and PSNR=21.08, SSIM=0.7803 for x2 on the PBVS-2022 challenge test dataset.

preprint, 2022, arXiv

Efficient Progressive High Dynamic Range Image Restoration via Attention and Alignment Network

HDR is an important part of computational photography technology. In this paper, we propose a lightweight neural network called Efficient Attention-and-alignment-guided Progressive Network (EAPNet) for the NTIRE 2022 HDR challenge, Track 1 and Track 2. We introduce a multi-dimensional lightweight encoding module to extract features. In addition, we propose a Progressive Dilated U-shape Block (PDUB), a progressive plug-and-play module for dynamically tuning MAccs and PSNR. Finally, we use a fast and low-power feature-align module to deal with the misalignment problem in place of the time-consuming Deformable Convolutional Network (DCN). The experiments show that our method achieves about 20 times compression on MAccs with better mu-PSNR and PSNR compared to the state-of-the-art method. We took second place in both tracks during the testing phase. Figure 1 shows the visualized result of the NTIRE 2022 HDR challenge.

preprint, 2022, arXiv

KECP: Knowledge Enhanced Contrastive Prompting for Few-shot Extractive Question Answering

Extractive Question Answering (EQA) is one of the most important tasks in Machine Reading Comprehension (MRC), which can be solved by fine-tuning the span-selecting heads of Pre-trained Language Models (PLMs). However, most existing approaches for MRC may perform poorly in the few-shot learning scenario. To solve this issue, we propose a novel framework named Knowledge Enhanced Contrastive Prompt-tuning (KECP). Instead of adding pointer heads to PLMs, we introduce a seminal paradigm for EQA that transforms the task into a non-autoregressive Masked Language Modeling (MLM) generation problem. At the same time, rich semantics from the external knowledge base (KB) and the passage context are used to enhance the representations of the query. In addition, to boost the performance of PLMs, we jointly train the model with the MLM and contrastive learning objectives. Experiments on multiple benchmarks demonstrate that our method consistently outperforms state-of-the-art approaches in few-shot settings by a large margin.

preprint, 2022, arXiv

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

This paper reviews the challenge on constrained high dynamic range (HDR) imaging that was part of the New Trends in Image Restoration and Enhancement (NTIRE) workshop, held in conjunction with CVPR 2022. This manuscript focuses on the competition set-up, datasets, the proposed methods and their results. The challenge aims at estimating an HDR image from multiple respective low dynamic range (LDR) observations, which might suffer from under- or over-exposed regions and different sources of noise. The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e. solutions can not exceed a given number of operations). In Track 2, participants are asked to minimize the complexity of their solutions while imposing a constraint on fidelity scores (i.e. solutions are required to obtain a higher fidelity score than the prescribed baseline). Both tracks use the same data and metrics: Fidelity is measured by means of PSNR with respect to a ground-truth HDR image (computed both directly and with a canonical tonemapping operation), while complexity metrics include the number of Multiply-Accumulate (MAC) operations and runtime (in seconds).
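As a rough illustration of the two fidelity scores described above (PSNR computed directly and after tonemapping), here is a minimal sketch. The abstract does not specify the "canonical tonemapping operation"; the mu-law curve with `mu=5000` used here is an assumption, and the function names are hypothetical. Pixel values are assumed to be normalized to [0, 1].

```python
import math

def mu_law(x, mu=5000.0):
    # Assumed tonemapping curve: compresses highlights and expands shadows.
    return math.log(1.0 + mu * x) / math.log(1.0 + mu)

def psnr(pred, target, peak=1.0):
    # Peak signal-to-noise ratio in dB over two equal-length pixel lists.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

def psnr_mu(pred, target, mu=5000.0):
    # PSNR measured after tonemapping both images.
    return psnr([mu_law(p, mu) for p in pred],
                [mu_law(t, mu) for t in target])
```

Under this assumed curve, an error in a dark region costs far more in the tonemapped score than in the linear one, which is why the two numbers are reported side by side.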

preprint, 2022, arXiv

Reduction of General One-loop Integrals Using Auxiliary Vector

As a key method for dealing with loop integrals, the Integration-By-Parts (IBP) method can be used to perform reduction as well as to establish the differential equations for master integrals. However, for tensor reduction, the Passarino-Veltman (PV) reduction method is also widely used for one-loop integrals. Recently, we proposed an improved PV reduction method, i.e., the PV reduction method with auxiliary vector $R$, which can easily give analytical reduction results for any tensor rank. However, our results covered only integrals whose propagators have power one. In this paper, we generalize our method to one-loop integrals with general tensor structures and propagators with general powers. Our ideas are simple. We solve the generalized reduction problem by combining differentiation over masses with a proper limit of the reduction with power-one propagators. Finally, we demonstrate our method with several examples. With the result in this paper, we have shown that our improved PV-reduction method with auxiliary vector is a self-completed reduction method for one-loop integrals.
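The "differentiation over masses" step rests on an elementary identity, sketched here for a single propagator. The symbols $\ell$ (loop momentum), $K$ (external momentum), $M$ (mass) and the powers $a$, $n$ are notation assumed for this illustration, not taken from the paper.

```latex
\frac{1}{\left[(\ell-K)^2-M^2\right]^{a+1}}
  = \frac{1}{a}\,\frac{\partial}{\partial M^2}\,
    \frac{1}{\left[(\ell-K)^2-M^2\right]^{a}},
\qquad a \ge 1,
\qquad\Longrightarrow\qquad
\frac{1}{\left[(\ell-K)^2-M^2\right]^{n}}
  = \frac{1}{(n-1)!}\left(\frac{\partial}{\partial M^2}\right)^{n-1}
    \frac{1}{(\ell-K)^2-M^2}.
```

A propagator raised to the power $n$ can therefore be generated from the power-one case by repeated mass derivatives, which is presumably why a reduction for power-one propagators can serve as the starting point.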

preprint, 2022, arXiv

The MSXF TTS System for ICASSP 2022 ADD Challenge

This paper presents our MSXF TTS system for Task 3.1 of the Audio Deep Synthesis Detection (ADD) Challenge 2022. We use an end-to-end text-to-speech system and add a constraint loss to the system during the training stage. The end-to-end TTS system is VITS, and the pre-trained self-supervised model is wav2vec 2.0. We also explore the influence of speech speed and volume on spoofing. Faster speech leaves less silence in the audio, which makes it easier to fool the detector. We also find that the smaller the volume, the better the spoofing ability, though we normalize the volume for submission. Our team is identified as C2, and we took fourth place in the challenge.

preprint, 2022, arXiv

The Second Place Solution for The 4th Large-scale Video Object Segmentation Challenge--Track 3: Referring Video Object Segmentation

The referring video object segmentation (RVOS) task aims to segment, in all video frames, the object instances in a given video that are referred to by a language expression. Because it requires understanding cross-modal semantics within individual instances, this task is more challenging than traditional semi-supervised video object segmentation, where the ground-truth object masks in the first frame are given. With the great success of Transformers in object detection and object segmentation, RVOS has made remarkable progress, with ReferFormer achieving state-of-the-art performance. In this work, building on the strong baseline framework ReferFormer, we propose several tricks to boost performance further, including cyclical learning rates, a semi-supervised approach, and test-time augmentation inference. The improved ReferFormer ranks 2nd in the CVPR2022 Referring Youtube-VOS Challenge.
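One of the tricks named above, cyclical learning rates, can be sketched as follows. The abstract does not say which schedule was used; this is the common triangular policy, and the function name, parameter names and values are assumptions for illustration only.

```python
import math

def triangular_lr(iteration, base_lr, max_lr, step_size):
    # Triangular cyclical schedule: the rate climbs linearly from base_lr to
    # max_lr over step_size iterations, then falls back over the next step_size.
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

# Hypothetical settings: one full cycle lasts 8 iterations.
schedule = [triangular_lr(i, base_lr=0.1, max_lr=0.5, step_size=4)
            for i in range(9)]
```

With these settings the rate starts at 0.1, peaks at 0.5 on iteration 4, and returns to 0.1 on iteration 8.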

preprint, 2022, arXiv

The Third Place Solution for CVPR2022 AVA Accessibility Vision and Autonomy Challenge

The goal of the AVA challenge is to provide vision-based benchmarks and methods relevant to accessibility. In this paper, we introduce the technical details of our submission to the CVPR2022 AVA Challenge. First, we conducted experiments to select a suitable model and data augmentation strategy for this task. Second, an effective training strategy was applied to improve the performance. Third, we integrated the results from two different segmentation frameworks to improve the performance further. Experimental results demonstrate that our approach achieves a competitive result on the AVA test set. Finally, our approach achieves 63.008% AP@0.50:0.95 on the test set of the CVPR2022 AVA Challenge.

preprint, 2020, arXiv

Boundedness of Singular Integral Operators on Weak Herz Type Spaces with Variable Exponent

In this paper, the authors define the weak Herz spaces and the weak Herz-type Hardy spaces with variable exponent. As applications, the authors establish the boundedness for a large class of singular integral operators including some critical cases.

preprint, 2019, arXiv

Spatio-temporal variation of temperature for the recent 40 years in Lhasa

It is well known that Lhasa experienced a record-high temperature of 30.8$^{\circ}$C in late June 2019. To better understand the reasons, based on observations recorded at automatic weather stations in Lhasa, we studied the characteristics of temperature variation at multiple time scales using the linear trend method, the Mann-Kendall mutation test, Morlet wavelet analysis, R/S analysis and so on. The results showed that: (a) The annual mean temperature (AMT) is rising at a rate of 0.5$^{\circ}$C/10yr, and the average temperature for each season also increased significantly, especially in winter. (b) Although there was an intersection in 1995, AMT did not pass the significance test at level $\alpha = 0.05$, which means there is no abrupt change in AMT; the values are 7.97$^{\circ}$C and 9.15$^{\circ}$C before and after the intersection point, respectively. (c) AMT shows periodic oscillations of 18~25yr and 25~32yr, based on a large amount of data and the wavelet variance diagrams for Lhasa. AMT has a main cycle of 28yr; the cyclic patterns of temperature change in spring, summer and autumn are similar to that of AMT, but the pattern in winter is relatively complex. (d) The Hurst index of AMT and of the different seasons indicates that the temperature is likely to continue to rise in Lhasa in the future.
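The linear trend method behind a figure such as "0.5$^{\circ}$C/10yr" amounts to an ordinary least-squares slope scaled to a decade. The sketch below uses a made-up, perfectly linear series chosen only so that the result is easy to check; it is not the Lhasa station data, and the function name is hypothetical.

```python
def linear_trend_per_decade(years, temps):
    # Ordinary least-squares slope of temperature against year,
    # multiplied by 10 to express the trend in degrees C per decade.
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    return 10.0 * sxy / sxx

# Synthetic 40-year series warming by 0.05 degrees C per year (illustrative only).
years = list(range(1980, 2020))
temps = [7.0 + 0.05 * (y - 1980) for y in years]
trend = linear_trend_per_decade(years, temps)
```

On real observations the slope would normally be reported together with a significance test, which this sketch leaves out.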

preprint, 2018, arXiv

Turing instability and Turing-Hopf bifurcation in diffusive Schnakenberg systems with gene expression time delay

For delayed reaction-diffusion Schnakenberg systems with Neumann boundary conditions, necessary and sufficient critical conditions for Turing instability are derived. Existence conditions for Turing, Hopf and Turing-Hopf bifurcations are established. Normal forms truncated to order 3 at the Turing-Hopf singularity of codimension 2 are derived. By investigating the Turing-Hopf bifurcation, the parameter regions for the stability of a periodic solution, a pair of spatially inhomogeneous steady states and a pair of spatially inhomogeneous periodic solutions are derived in the $(\tau,\varepsilon)$ parameter plane ($\tau$ for time delay, $\varepsilon$ for diffusion rate). It is revealed that the joint effects of diffusion and delay can lead to the occurrence of mixed spatial and temporal patterns. Moreover, it is demonstrated that various spatially inhomogeneous patterns with different spatial frequencies can be achieved by changing the diffusion rate. The phenomenon, observed by Gaffney and Monk (2006), that time delay may induce a failure of Turing instability is also theoretically explained.
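For orientation, the classical Turing conditions can be checked numerically for the standard Schnakenberg kinetics without delay, $f = a - u + u^2 v$ and $g = b - u^2 v$. This is a simplified sketch, not the paper's delayed model: it ignores the time delay $\tau$ and the discrete wavenumbers imposed by Neumann boundary conditions, and the function name, the parameters `a`, `b` and the diffusion coefficients `d1`, `d2` are assumptions.

```python
def schnakenberg_turing_unstable(a, b, d1, d2):
    # Homogeneous steady state of the non-delayed Schnakenberg kinetics.
    u = a + b
    v = b / (a + b) ** 2
    # Jacobian entries of (f, g) at the steady state.
    fu, fv = -1.0 + 2.0 * u * v, u * u
    gu, gv = -2.0 * u * v, -u * u
    trace = fu + gv
    det = fu * gv - fv * gu
    # The steady state must be stable when diffusion is absent...
    stable_without_diffusion = trace < 0 and det > 0
    # ...and some spatial mode must grow once diffusion is switched on.
    h = d2 * fu + d1 * gv
    diffusion_driven = h > 0 and h * h > 4.0 * d1 * d2 * det
    return stable_without_diffusion and diffusion_driven
```

For example, with `a=0.1` and `b=0.9` the check passes for `d1=1, d2=10` and fails for equal diffusion coefficients, reflecting the usual requirement that the two species diffuse at sufficiently different rates.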
