Source author record

Debin Zhao

Debin Zhao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision, Multimedia, Cryptography and Security, eess.IV

Catalog footprint

What is connected

15 works

4 topics

4 close collaborators

preprint, 2026, arXiv

PairDropGS: Paired Dropout-Induced Consistency Regularization for Sparse-View Gaussian Splatting

Dropout-based sparse-view 3D Gaussian Splatting (3DGS) methods alleviate overfitting by randomly suppressing Gaussian primitives during training. Existing methods mainly focus on designing increasingly sophisticated dropout strategies while overlooking the resulting inconsistencies among different dropped Gaussian subsets. This oversight often leads to unstable reconstruction and suboptimal Gaussian representation learning. In this paper, we revisit dropout-based sparse-view 3DGS from a consistency regularization perspective and propose PairDropGS, a Paired Dropout-induced Consistency Regularization framework for sparse-view Gaussian splatting. Specifically, PairDropGS first constructs a pair of dropped Gaussian subsets from a shared Gaussian field and designs a low-frequency consistency regularization to constrain their low-frequency rendered structures. This design encourages the shared Gaussian field to preserve a stable scene layout and coarse geometry under different random dropouts, while avoiding excessive constraints on ambiguous high-frequency details. Moreover, we introduce a progressive consistency scheduling strategy that gradually strengthens the consistency regularization during training to improve the stability and robustness of reconstruction. Extensive experiments on widely used sparse-view benchmarks demonstrate that PairDropGS achieves superior training stability and significantly outperforms existing dropout-based 3DGS methods in reconstruction quality, while remaining simple and plug-and-play for improving dropout-based optimization.
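
As a rough illustration of the paired-dropout idea in this abstract, the Python sketch below applies two independent dropout masks to one shared list of values, low-pass filters both results, and penalizes their difference with a weight that ramps up over training. The 1-D stand-in for a rendered image, the box filter, the linear ramp and every function name are assumptions made for illustration; the abstract does not specify any of them.

```python
import random

def dropout(values, keep_prob, rng):
    """Randomly zero entries and rescale the survivors (inverted dropout)."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

def low_pass(signal, radius=2):
    """Box filter used here as a stand-in for a low-frequency extraction step."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def consistency_weight(step, total_steps, max_weight=1.0):
    """Hypothetical schedule: linear ramp from 0 to max_weight."""
    return max_weight * min(1.0, step / total_steps)

def paired_consistency_loss(shared, keep_prob, step, total_steps, rng):
    """Weighted mean squared difference between two low-passed dropped copies."""
    a = low_pass(dropout(shared, keep_prob, rng))
    b = low_pass(dropout(shared, keep_prob, rng))
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return consistency_weight(step, total_steps) * mse
```

Both dropped copies come from the same `shared` list, so the penalty only measures disagreement introduced by the random masks, which is the quantity the abstract proposes to regularize.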

preprint, 2022, arXiv

Deep Attentional Guided Image Filtering

The guided filter is a fundamental tool in computer vision and computer graphics that aims to transfer structure information from a guidance image to a target image. Most existing methods construct filter kernels from the guidance itself without considering the mutual dependency between the guidance and the target. However, since the two images typically contain significantly different edges, simply transferring all structural information of the guidance to the target would result in various artifacts. To cope with this problem, we propose an effective framework named deep attentional guided image filtering, whose filtering process can fully integrate the complementary information contained in both images. Specifically, we propose an attentional kernel learning module that generates dual sets of filter kernels from the guidance and the target, respectively, and then adaptively combines them by modeling the pixel-wise dependency between the two images. Meanwhile, we propose a multi-scale guided image filtering module to progressively generate the filtering result with the constructed kernels in a coarse-to-fine manner. Correspondingly, a multi-scale fusion strategy is introduced to reuse the intermediate results of the coarse-to-fine process. Extensive experiments show that the proposed framework compares favorably with state-of-the-art methods in a wide range of guided image filtering applications, such as guided super-resolution, cross-modality restoration, texture removal, and semantic segmentation.

preprint, 2022, arXiv

Fast Hierarchical Deep Unfolding Network for Image Compressed Sensing

By integrating certain optimization solvers with deep neural networks, the deep unfolding network (DUN) has attracted much attention in recent years for image compressed sensing (CS). However, several issues remain in existing DUNs: 1) For each iteration, a simple stacked convolutional network is usually adopted, which limits the expressiveness of these models. 2) Once training is completed, most hyperparameters of existing DUNs are fixed for any input content, which significantly weakens their adaptability. In this paper, by unfolding the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), a novel fast hierarchical DUN, dubbed FHDUN, is proposed for image compressed sensing, in which a well-designed hierarchical unfolding architecture is developed to cooperatively explore richer contextual prior information in multi-scale spaces. To further enhance adaptability, a series of hyperparameter generation networks is developed in our framework to dynamically produce the corresponding optimal hyperparameters according to the input content. Furthermore, owing to the acceleration policy in FISTA, the newly embedded acceleration module lets the proposed FHDUN save more than 50% of the iterative loops compared with recent DUNs. Extensive CS experiments show that the proposed FHDUN outperforms existing state-of-the-art CS methods while requiring fewer iterations.
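
For readers unfamiliar with the solver being unfolded, here is a minimal sketch of textbook FISTA for the problem min 0.5*||Ax - y||^2 + lam*||x||_1, written in plain Python. It is the classical fixed-parameter algorithm, not the learned FHDUN network; the helper names and the list-of-rows matrix layout are assumptions chosen for a self-contained example.

```python
import math

def soft_threshold(v, t):
    """Proximal operator of t * |v| (shrinkage toward zero)."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def matvec(A, x):
    """Multiply a matrix stored as a list of rows by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def fista(A, y, lam, step, iters=200):
    """Minimize 0.5*||Ax - y||^2 + lam*||x||_1.

    `step` should not exceed 1 / L, where L is the largest eigenvalue of A^T A.
    """
    At = transpose(A)
    x = [0.0] * len(A[0])
    z = x[:]   # extrapolated point where the gradient is evaluated
    t = 1.0    # momentum parameter
    for _ in range(iters):
        residual = [p - q for p, q in zip(matvec(A, z), y)]
        grad = matvec(At, residual)
        x_new = [soft_threshold(zi - step * gi, step * lam)
                 for zi, gi in zip(z, grad)]
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Momentum step: this extrapolation is what distinguishes FISTA from ISTA.
        z = [xn + ((t - 1.0) / t_new) * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x
```

A deep unfolding network would typically replace the fixed `step`, `lam` and shrinkage operator of each loop iteration with learned modules, one per unrolled stage.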

preprint, 2022, arXiv

Tree-Structured Data Clustering-Driven Neural Network for Intra Prediction in Video Coding

As a crucial part of video compression, intra prediction utilizes local information of images to eliminate redundancy in the spatial domain. In both High Efficiency Video Coding (H.265/HEVC) and Versatile Video Coding (H.266/VVC), multiple directional prediction modes are employed to find the texture trend of each small block, and the prediction is then made based on reference samples in the selected direction. Recently, intra prediction schemes based on neural networks have achieved great success. In these methods, the networks are trained and applied to intra prediction to assist the directional prediction modes. In this paper, we propose a novel tree-structured data clustering-driven neural network (dubbed TreeNet) for intra prediction, which builds the networks and clusters the training data in a tree-structured manner. Specifically, in each network split and training process of TreeNet, every parent network on a leaf node is split into two child networks by adding or subtracting Gaussian random noise. Then a data clustering-driven training is applied to train the two derived child networks using the clustered training data of their parent. To test the performance, TreeNet is integrated into VVC and HEVC to combine with or replace the directional prediction modes. In addition, a fast termination strategy is proposed to accelerate the search of TreeNet. The experimental results demonstrate that TreeNet with fast termination reaches an average Bjøntegaard delta rate (BD-rate) improvement of 2.8% (up to 8.1%) over VVC (VTM-4.0) and 4.9% (up to 8.2%) over HEVC (HM-16.9) under the all-intra configuration.
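
The split-and-cluster step described in this abstract can be sketched as follows. This is a toy version under stated assumptions: the "network" is a plain weight list used as a linear predictor, the same noise vector is added to one child and subtracted from the other, and samples are assigned to whichever child predicts them with smaller absolute error. None of these specifics are given in the abstract, and all names are hypothetical.

```python
import random

def split_network(parent_weights, sigma, rng):
    """Derive two children by adding and subtracting one Gaussian noise vector."""
    noise = [rng.gauss(0.0, sigma) for _ in parent_weights]
    child_plus = [w + n for w, n in zip(parent_weights, noise)]
    child_minus = [w - n for w, n in zip(parent_weights, noise)]
    return child_plus, child_minus

def predict(weights, refs):
    """Toy linear predictor standing in for an intra-prediction network."""
    return sum(w * r for w, r in zip(weights, refs))

def cluster_by_child(samples, child_plus, child_minus):
    """Assign each (refs, target) sample to the child with the smaller error."""
    plus, minus = [], []
    for refs, target in samples:
        err_p = abs(predict(child_plus, refs) - target)
        err_m = abs(predict(child_minus, refs) - target)
        (plus if err_p <= err_m else minus).append((refs, target))
    return plus, minus
```

In a full training loop, each child would then be trained on its own cluster and the assignment repeated, which is how the data partition and the networks could specialize together.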

preprint, 2021, arXiv

Multi-Stage Residual Hiding for Image-into-Audio Steganography

The widespread application of audio communication technologies has sped up the flow of audio data across the Internet, making audio a popular carrier for covert communication. In this paper, we present a cross-modal steganography method for hiding image content in audio carriers while preserving the perceptual fidelity of the cover audio. In our framework, two multi-stage networks are designed: the first network encodes the decreasing multilevel residual errors inside different audio subsequences with the corresponding stage sub-networks, while the second network decodes the residual errors from the modified carrier with the corresponding stage sub-networks to produce the final revealed results. The multi-stage design of the proposed framework not only makes control of the payload capacity more flexible, but also makes hiding easier because the residual errors become gradually sparser. Qualitative experiments suggest that modifications to the carrier are unnoticeable to human listeners and that the decoded images are highly intelligible.
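
The notion of "decreasing multilevel residual errors" can be illustrated with a simple analogy: multi-stage residual coding with fixed scalar quantizers, where each stage encodes what the previous stages missed. This sketch is only an analogy under that assumption; the paper's stages are learned sub-networks, and the function names and step sizes here are hypothetical.

```python
def encode_stages(values, steps):
    """Return one list of quantized residuals per stage, coarse to fine."""
    residual = list(values)
    stages = []
    for step in steps:
        quantized = [round(r / step) * step for r in residual]
        stages.append(quantized)
        # The next stage only has to represent the remaining error.
        residual = [r - q for r, q in zip(residual, quantized)]
    return stages

def decode_stages(stages, num_stages=None):
    """Sum the first num_stages stages (all stages when None)."""
    used = stages[:num_stages]
    return [sum(col) for col in zip(*used)]
```

Decoding with fewer stages gives a coarser result at a lower payload, which mirrors the flexible capacity control the abstract attributes to the multi-stage design.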

preprint, 2016, arXiv

Random Walk Graph Laplacian based Smoothness Prior for Soft Decoding of JPEG Images

Given the prevalence of JPEG compressed images, optimizing image reconstruction from the compressed format remains an important problem. Instead of simply reconstructing a pixel block from the centers of indexed DCT coefficient quantization bins (hard decoding), soft decoding reconstructs a block by selecting appropriate coefficient values within the indexed bins with the help of signal priors. The challenge thus lies in how to define suitable priors and apply them effectively. In this paper, we combine three image priors (a Laplacian prior for DCT coefficients, a sparsity prior, and a graph-signal smoothness prior for image patches) to construct an efficient JPEG soft decoding algorithm. Specifically, we first use the Laplacian prior to compute a minimum mean square error (MMSE) initial solution for each code block. Next, we show that while the sparsity prior can reduce block artifacts, limiting the size of the over-complete dictionary (to lower computation) would lead to poor recovery of high DCT frequencies. To alleviate this problem, we design a new graph-signal smoothness prior (the desired signal has mainly low graph frequencies) based on the left eigenvectors of the random walk graph Laplacian matrix (LERaG). Compared to previous graph-signal smoothness priors, LERaG has desirable image filtering properties with low computation overhead. We demonstrate how LERaG can facilitate recovery of high DCT frequencies of a piecewise smooth (PWS) signal via an interpretation of low graph frequency components as relaxed solutions to normalized cut in spectral clustering. Finally, we construct a soft decoding algorithm using the three signal priors with appropriate prior weights. Experimental results show that our proposal noticeably outperforms state-of-the-art soft decoding algorithms in both objective and subjective evaluations.
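
To make the hard versus soft decoding distinction concrete, the sketch below reconstructs coefficients at bin centers (hard decoding) and clips a candidate estimate so that it stays inside each indexed bin (the constraint a soft decoder has to respect). It also builds the random walk graph Laplacian I - D^{-1}W for a small weighted graph. The bin convention assumes quantization by rounding to the nearest integer; the function names are hypothetical, and the priors that would actually produce the candidate estimate are not modeled.

```python
def hard_decode(indices, q_steps):
    """Reconstruct each DCT coefficient at the center of its quantization bin."""
    return [k * q for k, q in zip(indices, q_steps)]

def project_to_bins(coeffs, indices, q_steps):
    """Clip candidate coefficients so each stays inside its indexed bin.

    Assumes index k with step q covers the interval [(k - 0.5) q, (k + 0.5) q].
    """
    out = []
    for c, k, q in zip(coeffs, indices, q_steps):
        lo, hi = (k - 0.5) * q, (k + 0.5) * q
        out.append(min(max(c, lo), hi))
    return out

def random_walk_laplacian(W):
    """L_rw = I - D^{-1} W for a weighted adjacency matrix W (list of rows).

    Assumes every node has positive degree.
    """
    n = len(W)
    L = []
    for i in range(n):
        degree = sum(W[i])
        L.append([(1.0 if i == j else 0.0) - W[i][j] / degree
                  for j in range(n)])
    return L
```

Each row of the resulting matrix sums to zero, so a constant signal has zero response, which is one reason Laplacian-based priors favor smooth signals.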

preprint, 2014, arXiv

Group-based Sparse Representation for Image Restoration

Traditional patch-based sparse representation modeling of natural images usually suffers from two problems. First, it has to solve a large-scale optimization problem with high computational complexity in dictionary learning. Second, each patch is considered independently in dictionary learning and sparse coding, which ignores the relationship among patches and results in inaccurate sparse coding coefficients. In this paper, instead of using the patch as the basic unit of sparse representation, we use the group, which is composed of nonlocal patches with similar structures, and establish a novel sparse representation modeling of natural images, called group-based sparse representation (GSR). The proposed GSR is able to sparsely represent natural images in the domain of groups, which enforces the intrinsic local sparsity and nonlocal self-similarity of images simultaneously in a unified framework. Moreover, an effective low-complexity self-adaptive dictionary learning method is designed for each group, rather than learning a dictionary from natural images. To make GSR tractable and robust, a split Bregman based technique is developed to efficiently solve the proposed GSR-driven minimization problem for image restoration. Extensive experiments on image inpainting, image deblurring and image compressive sensing recovery show that the proposed GSR modeling outperforms many current state-of-the-art schemes in both PSNR and visual perception.
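
The "group" used as the basic unit here is a set of similar nonlocal patches. A minimal sketch of forming one such group is given below, assuming a 1-D signal, overlapping patches and squared-error matching; the patch size, search strategy and function names are illustrative assumptions rather than the paper's settings.

```python
def extract_patches(signal, size):
    """All overlapping patches of the given length."""
    return [signal[i:i + size] for i in range(len(signal) - size + 1)]

def build_group(patches, ref_index, k):
    """Indices of the k patches closest (squared error) to the reference patch."""
    ref = patches[ref_index]

    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(patches[i], ref))

    # sorted() is stable, so ties keep their original order.
    return sorted(range(len(patches)), key=dist)[:k]
```

For example, with `signal = [0, 1, 0, 5, 5, 0, 1, 0]` and patches of length 3, the group of size 2 around patch 0 contains patches 0 and 5, both equal to `[0, 1, 0]` even though they sit at opposite ends of the signal.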

preprint, 2014, arXiv

Image Compressive Sensing Recovery Using Adaptively Learned Sparsifying Basis via L0 Minimization

Compressive sensing (CS) theory demonstrates that a signal can be reconstructed with high probability from many fewer acquired measurements than suggested by the Nyquist sampling theory when it exhibits sparsity in some domain. Most conventional CS recovery approaches, however, exploit a set of fixed bases (e.g. DCT, wavelet and gradient domain) for the entirety of a signal, which ignores the non-stationarity of natural signals and cannot achieve a high enough degree of sparsity, thus resulting in poor CS recovery performance. In this paper, we propose a new framework for image compressive sensing recovery using an adaptively learned sparsifying basis via L0 minimization. The intrinsic sparsity of natural images is enforced substantially by sparsely representing overlapped image patches using the adaptively learned sparsifying basis in the form of the L0 norm, greatly reducing blocking artifacts and confining the CS solution space. To make our proposed scheme tractable and robust, a split Bregman iteration based technique is developed to solve the non-convex L0 minimization problem efficiently. Experimental results on a wide range of natural images for CS recovery show that our proposed algorithm achieves significant performance improvements over many current state-of-the-art schemes and exhibits good convergence.
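
Although L0 minimization is non-convex, its scalar subproblem has a closed-form solution, which is what makes splitting schemes practical. The sketch below shows that hard-thresholding operator for the scalar objective 0.5*(x - v)^2 + lam*[x != 0]; it is a generic building block shown for illustration, not the paper's full algorithm, and the function names are hypothetical.

```python
import math

def hard_threshold(v, lam):
    """Minimizer of 0.5*(x - v)^2 + lam*[x != 0].

    Keeping v costs lam; zeroing it costs 0.5*v^2, so v survives only
    when |v| > sqrt(2*lam).
    """
    return v if abs(v) > math.sqrt(2.0 * lam) else 0.0

def hard_threshold_all(values, lam):
    """Apply the scalar operator to every coefficient."""
    return [hard_threshold(v, lam) for v in values]
```

Unlike soft thresholding, surviving coefficients are not shrunk, so large coefficients pass through unchanged.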

preprint, 2014, arXiv

Image Restoration Using Joint Statistical Modeling in Space-Transform Domain

This paper presents a novel strategy for high-fidelity image restoration by characterizing both local smoothness and nonlocal self-similarity of natural images in a unified statistical manner. The main contributions are threefold. First, from the perspective of image statistics, a joint statistical modeling (JSM) in an adaptive hybrid space-transform domain is established, which offers a powerful mechanism for combining local smoothness and nonlocal self-similarity simultaneously to ensure a more reliable and robust estimation. Second, a new form of minimization functional for solving image inverse problems is formulated using JSM under a regularization-based framework. Finally, in order to make JSM tractable and robust, a new Split-Bregman based algorithm is developed to efficiently solve the above severely underdetermined inverse problem, together with a theoretical proof of convergence. Extensive experiments on image inpainting, image deblurring and mixed Gaussian plus salt-and-pepper noise removal verify the effectiveness of the proposed algorithm.

preprint, 2014, arXiv

Spatially Directional Predictive Coding for Block-based Compressive Sensing of Natural Images

A novel coding strategy for block-based compressive sensing named spatially directional predictive coding (SDPC) is proposed, which efficiently utilizes the intrinsic spatial correlation of natural images. At the encoder, for each block of compressive sensing (CS) measurements, the optimal prediction is selected from a set of prediction candidates that are generated by four designed directional predictive modes. Then, the resulting residual is processed by scalar quantization (SQ). At the decoder, the same prediction is added to the de-quantized residuals to produce the quantized CS measurements, which are then used for CS reconstruction. Experimental results show that SDPC-plus-SQ achieves significant improvements in rate-distortion performance compared with SQ alone and DPCM-plus-SQ.
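
The encoder and decoder steps in this abstract can be sketched in a few lines. The version below assumes the prediction candidates are already available as measurement vectors (in the paper they come from four directional modes, which are not modeled here), selects the candidate with the smallest squared error, and uses a uniform scalar quantizer. All function names are hypothetical.

```python
def quantize(residual, step):
    """Uniform scalar quantization to integer indices."""
    return [round(r / step) for r in residual]

def dequantize(indices, step):
    return [i * step for i in indices]

def encode_block(measurements, candidates, step):
    """Pick the best candidate prediction, then quantize the residual.

    Returns (mode index, quantization indices).
    """
    def sse(c):
        return sum((m - p) ** 2 for m, p in zip(measurements, c))

    mode = min(range(len(candidates)), key=lambda i: sse(candidates[i]))
    residual = [m - p for m, p in zip(measurements, candidates[mode])]
    return mode, quantize(residual, step)

def decode_block(mode, indices, candidates, step):
    """Add the same prediction back to the de-quantized residual."""
    pred = candidates[mode]
    return [p + r for p, r in zip(pred, dequantize(indices, step))]
```

Because the decoder rebuilds the prediction from the same candidates, only the mode index and the quantized residual need to be transmitted, and a good prediction keeps that residual small.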

preprint, 2014, arXiv

Structural Group Sparse Representation for Image Compressive Sensing Recovery

Compressive Sensing (CS) theory shows that a signal can be decoded from many fewer measurements than suggested by the Nyquist sampling theory when the signal is sparse in some domain. Most conventional CS recovery approaches, however, exploit a set of fixed bases (e.g. DCT, wavelet, contourlet and gradient domain) for the entirety of a signal, which ignores the nonstationarity of natural signals and cannot achieve a high enough degree of sparsity, thus resulting in poor rate-distortion performance. In this paper, we propose a new framework for image compressive sensing recovery via structural group sparse representation (SGSR) modeling, which enforces image sparsity and self-similarity simultaneously under a unified framework in an adaptive group domain, thus greatly confining the CS solution space. In addition, an efficient technique based on the iterative shrinkage/thresholding algorithm is developed to solve the above optimization problem. Experimental results demonstrate that the novel CS recovery strategy achieves significant performance improvements over current state-of-the-art schemes and exhibits good convergence.

preprint, 2012, arXiv

Exploiting Image Local And Nonlocal Consistency For Mixed Gaussian-Impulse Noise Removal

Most existing image denoising algorithms can only deal with a single type of noise, whereas in practice observed images often suffer from more than one type of noise during acquisition and transmission. In this paper, we propose a new variational algorithm for mixed Gaussian-impulse noise removal by exploiting image local consistency and nonlocal consistency simultaneously. Specifically, the local consistency is measured by a hyper-Laplace prior, enforcing the local smoothness of images, while the nonlocal consistency is measured by three-dimensional sparsity of similar blocks, enforcing the nonlocal self-similarity of natural images. Moreover, a Split-Bregman based technique is developed to solve the above optimization problem efficiently. Extensive experiments on mixed Gaussian plus impulse noise show significant performance improvements over current state-of-the-art schemes, which substantiates the effectiveness of the proposed algorithm.

preprint, 2012, arXiv

High Quality Image Interpolation via Local Autoregressive and Nonlocal 3-D Sparse Regularization

In this paper, we propose a novel image interpolation algorithm that combines the local autoregressive (AR) model and the nonlocal adaptive 3-D sparse model as regularization constraints within a regularization framework. Estimating the high-resolution image with the local AR regularization differs from conventional AR models in that it calculates the interpolation coefficients in a weighted manner without considering the rough structural similarity between the low-resolution (LR) and high-resolution (HR) images. The nonlocal adaptive 3-D sparse model is then formulated to regularize the interpolated HR image, which provides a way to correct pixels affected by the numerical instability of the AR model. In addition, a new Split-Bregman based iterative algorithm is developed to solve the above optimization problem. Experimental results demonstrate that the proposed algorithm achieves significant performance improvements over traditional algorithms in terms of both objective quality and visual perception.

preprint, 2012, arXiv

Image Super-Resolution via Dual-Dictionary Learning And Sparse Representation

Learning-based image super-resolution aims to reconstruct high-frequency (HF) details from a prior model trained on a set of high- and low-resolution image patches. In this paper, the HF to be estimated is considered as a combination of two components: main high-frequency (MHF) and residual high-frequency (RHF). We propose a novel image super-resolution method via dual-dictionary learning and sparse representation, which consists of main dictionary learning and residual dictionary learning, to recover MHF and RHF respectively. Extensive experimental results on test images validate that, by employing the proposed two-layer progressive scheme, more image details can be recovered and much better results can be achieved than with state-of-the-art algorithms in terms of both PSNR and visual perception.

preprint, 2012, arXiv

Improved Total Variation based Image Compressive Sensing Recovery by Nonlocal Regularization

Recently, total variation (TV) based minimization algorithms have achieved great success in compressive sensing (CS) recovery for natural images owing to their ability to preserve edges. However, TV cannot recover fine details and textures, and often suffers from undesirable staircase artifacts. To reduce these effects, this letter presents an improved TV based image CS recovery algorithm by introducing a new nonlocal regularization constraint into the CS optimization problem. The nonlocal regularization is built on the well-known nonlocal means (NLM) filtering and takes advantage of self-similarity in images, which helps to suppress the staircase effect and restore fine details. Furthermore, an efficient augmented Lagrangian based algorithm is developed to solve the above combined TV and nonlocal regularization constrained problem. Experimental results demonstrate that the proposed algorithm achieves significant performance improvements over the state-of-the-art TV based algorithm in both PSNR and visual perception.
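
As a small illustration of the TV term discussed in this abstract, the sketch below computes the anisotropic total variation of an image stored as a list of rows. The anisotropic (L1) form is an assumption made for simplicity; the abstract does not say which TV variant the letter uses, and the function name is hypothetical.

```python
def total_variation(img):
    """Anisotropic TV: sum of absolute horizontal and vertical differences."""
    rows, cols = len(img), len(img[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                tv += abs(img[r][c + 1] - img[r][c])
            if r + 1 < rows:
                tv += abs(img[r + 1][c] - img[r][c])
    return tv
```

A smooth ramp `[[0, 1, 2, 3]]` and a single jump `[[0, 0, 3, 3]]` both have a TV of 3, so the penalty alone does not prefer the smooth version, which is consistent with the staircase artifacts the abstract mentions.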
