Source author record

Ruiqin Xiong

Ruiqin Xiong appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Multimedia eess.IV

Catalog footprint

What is connected

11works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Contrastive and Selective Hidden Embeddings for Medical Image Segmentation

Medical image segmentation has been widely recognized as a pivot procedure for clinical diagnosis, analysis, and treatment planning. However, the laborious and expensive annotation process lags down the speed of further advances. Contrastive learning-based weight pre-training provides an alternative by leveraging unlabeled data to learn a good representation. In this paper, we investigate how contrastive learning benefits the general supervised medical segmentation tasks. To this end, patch-dragsaw contrastive regularization (PDCR) is proposed to perform patch-level tugging and repulsing with the extent controlled by a continuous affinity score. And a new structure dubbed uncertainty-aware feature selection block (UAFS) is designed to perform the feature selection process, which can handle the learning target shift caused by minority features with high uncertainty. By plugging the proposed 2 modules into the existing segmentation architecture, we achieve state-of-the-art results across 8 public datasets from 6 domains. Newly designed modules further decrease the amount of training data to a quarter while achieving comparable, if not better, performances. From this perspective, we take the opposite direction of the original self/un-supervised contrastive learning by further excavating information contained within the label.

preprint2022arXiv

Cross-Block Difference Guided Fast CU Partition for VVC Intra Coding

In this paper, we propose a new fast CU partition algorithm for VVC intra coding based on cross-block difference. This difference is measured by the gradient and the content of sub-blocks obtained from partition and is employed to guide the skipping of unnecessary horizontal and vertical partition modes. With this guidance, a fast determination of block partitions is accordingly achieved. Compared with VVC, our proposed method can save 41.64% (on average) encoding time with only 0.97% (on average) increase of BD-rate.

preprint2022arXiv

HerosNet: Hyperspectral Explicable Reconstruction and Optimal Sampling Deep Network for Snapshot Compressive Imaging

Hyperspectral imaging is an essential imaging modality for a wide range of applications, especially in remote sensing, agriculture, and medicine. Inspired by existing hyperspectral cameras that are either slow, expensive, or bulky, reconstructing hyperspectral images (HSIs) from a low-budget snapshot measurement has drawn wide attention. By mapping a truncated numerical optimization algorithm into a network with a fixed number of phases, recent deep unfolding networks (DUNs) for spectral snapshot compressive sensing (SCI) have achieved remarkable success. However, DUNs are far from reaching the scope of industrial applications limited by the lack of cross-phase feature interaction and adaptive parameter adjustment. In this paper, we propose a novel Hyperspectral Explicable Reconstruction and Optimal Sampling deep Network for SCI, dubbed HerosNet, which includes several phases under the ISTA-unfolding framework. Each phase can flexibly simulate the sensing matrix and contextually adjust the step size in the gradient descent step, and hierarchically fuse and interact the hidden states of previous phases to effectively recover current HSI frames in the proximal mapping step. Simultaneously, a hardware-friendly optimal binary mask is learned end-to-end to further improve the reconstruction performance. Finally, our HerosNet is validated to outperform the state-of-the-art methods on both simulation and real datasets by large margins. The source code is available at https://github.com/jianzhangcs/HerosNet.

preprint2022arXiv

Optical Flow Estimation for Spiking Camera

As a bio-inspired sensor with high temporal resolution, the spiking camera has an enormous potential in real applications, especially for motion estimation in high-speed scenes. However, frame-based and event-based methods are not well suited to spike streams from the spiking camera due to the different data modalities. To this end, we present, SCFlow, a tailored deep learning pipeline to estimate optical flow in high-speed scenes from spike streams. Importantly, a novel input representation is introduced which can adaptively remove the motion blur in spike streams according to the prior motion. Further, for training SCFlow, we synthesize two sets of optical flow data for the spiking camera, SPIkingly Flying Things and Photo-realistic High-speed Motion, denoted as SPIFT and PHM respectively, corresponding to random high-speed and well-designed scenes. Experimental results show that the SCFlow can predict optical flow from spike streams in different high-speed scenes. Moreover, SCFlow shows promising generalization on \textbf{real spike streams}. Codes and datasets refer to https://github.com/Acnext/Optical-Flow-For-Spiking-Camera.

preprint2022arXiv

Spatio-Temporal Recurrent Networks for Event-Based Optical Flow Estimation

Event camera has offered promising alternative for visual perception, especially in high speed and high dynamic range scenes. Recently, many deep learning methods have shown great success in providing promising solutions to many event-based problems, such as optical flow estimation. However, existing deep learning methods did not address the importance of temporal information well from the perspective of architecture design and cannot effectively extract spatio-temporal features. Another line of research that utilizes Spiking Neural Network suffers from training issues for deeper architecture.To address these points, a novel input representation is proposed that captures the events' temporal distribution for signal enhancement. Moreover, we introduce a spatio-temporal recurrent encoding-decoding neural network architecture for event-based optical flow estimation, which utilizes Convolutional Gated Recurrent Units to extract feature maps from a series of event images. Besides, our architecture allows some traditional frame-based core modules, such as correlation layer and iterative residual refine scheme, to be incorporated. The network is end-to-end trained with self-supervised learning on the Multi-Vehicle Stereo Event Camera dataset. We have shown that it outperforms all the existing state-of-the-art methods by a large margin. The code link is https://github.com/ruizhao26/STE-FlowNet.

preprint2022arXiv

Tree-Structured Data Clustering-Driven Neural Network for Intra Prediction in Video Coding

As a crucial part of video compression, intra prediction utilizes local information of images to eliminate the redundancy in spatial domain. In both the High Efficiency Video Coding (H.265/HEVC) and Versatile Video Coding (H.266/VVC), multiple directional prediction modes are employed to find the texture trend of each small block and then the prediction is made based on reference samples in the selected direction. Recently, the intra prediction schemes based on neural networks have achieved great success. In these methods, the networks are trained and applied to intra prediction to assist the directional prediction modes. In this paper, we propose a novel tree-structured data clustering-driven neural network (dubbed TreeNet) for intra prediction, which builds the networks and clusters the training data in a tree-structured manner. Specifically, in each network split and training process of TreeNet, every parent network on a leaf node is split into two child networks by adding or subtracting Gaussian random noise. Then a data clustering-driven training is applied to train the two derived child networks using the clustered training data of their parent. To test the performance, TreeNet is integrated into VVC and HEVC to combine with or replace the directional prediction modes. In addition, a fast termination strategy is proposed to accelerate the search of TreeNet. The experimental results demonstrate that TreeNet with the fast termination can reach an average of 2.8% Bjontegaard distortion rate (BD-rate) improvement (up to 8.1%) and 4.9% BD-rate improvement (up to 8.2%) over VVC (VTM-4.0) and HEVC (HM-16.9) with all intra configuration, respectively.

preprint2016arXiv

Efficient Multiple Line-Based Intra Prediction for HEVC

Traditional intra prediction usually utilizes the nearest reference line to generate the predicted block when considering strong spatial correlation. However, this kind of single line-based method does not always work well due to at least two issues. One is the incoherence caused by the signal noise or the texture of other object, where this texture deviates from the inherent texture of the current block. The other reason is that the nearest reference line usually has worse reconstruction quality in block-based video coding. Due to these two issues, this paper proposes an efficient multiple line-based intra prediction scheme to improve coding efficiency. Besides the nearest reference line, further reference lines are also utilized. The further reference lines with relatively higher quality can provide potential better prediction. At the same time, the residue compensation is introduced to calibrate the prediction of boundary regions in a block when we utilize further reference lines. To speed up the encoding process, this paper designs several fast algorithms. Experimental results show that, compared with HM-16.9, the proposed fast search method achieves 2.0% bit saving on average and up to 3.7%, with increasing the encoding time by 112%.

preprint2014arXiv

Image Restoration Using Joint Statistical Modeling in Space-Transform Domain

This paper presents a novel strategy for high-fidelity image restoration by characterizing both local smoothness and nonlocal self-similarity of natural images in a unified statistical manner. The main contributions are three-folds. First, from the perspective of image statistics, a joint statistical modeling (JSM) in an adaptive hybrid space-transform domain is established, which offers a powerful mechanism of combining local smoothness and nonlocal self-similarity simultaneously to ensure a more reliable and robust estimation. Second, a new form of minimization functional for solving image inverse problem is formulated using JSM under regularization-based framework. Finally, in order to make JSM tractable and robust, a new Split-Bregman based algorithm is developed to efficiently solve the above severely underdetermined inverse problem associated with theoretical proof of convergence. Extensive experiments on image inpainting, image deblurring and mixed Gaussian plus salt-and-pepper noise removal applications verify the effectiveness of the proposed algorithm.

preprint2012arXiv

Exploiting Image Local And Nonlocal Consistency For Mixed Gaussian-Impulse Noise Removal

Most existing image denoising algorithms can only deal with a single type of noise, which violates the fact that the noisy observed images in practice are often suffered from more than one type of noise during the process of acquisition and transmission. In this paper, we propose a new variational algorithm for mixed Gaussian-impulse noise removal by exploiting image local consistency and nonlocal consistency simultaneously. Specifically, the local consistency is measured by a hyper-Laplace prior, enforcing the local smoothness of images, while the nonlocal consistency is measured by three-dimensional sparsity of similar blocks, enforcing the nonlocal self-similarity of natural images. Moreover, a Split-Bregman based technique is developed to solve the above optimization problem efficiently. Extensive experiments for mixed Gaussian plus impulse noise show that significant performance improvements over the current state-of-the-art schemes have been achieved, which substantiates the effectiveness of the proposed algorithm.

preprint2012arXiv

Image Super-Resolution via Dual-Dictionary Learning And Sparse Representation

Learning-based image super-resolution aims to reconstruct high-frequency (HF) details from the prior model trained by a set of high- and low-resolution image patches. In this paper, HF to be estimated is considered as a combination of two components: main high-frequency (MHF) and residual high-frequency (RHF), and we propose a novel image super-resolution method via dual-dictionary learning and sparse representation, which consists of the main dictionary learning and the residual dictionary learning, to recover MHF and RHF respectively. Extensive experimental results on test images validate that by employing the proposed two-layer progressive scheme, more image details can be recovered and much better results can be achieved than the state-of-the-art algorithms in terms of both PSNR and visual perception.

preprint2012arXiv

Improved Total Variation based Image Compressive Sensing Recovery by Nonlocal Regularization

Recently, total variation (TV) based minimization algorithms have achieved great success in compressive sensing (CS) recovery for natural images due to its virtue of preserving edges. However, the use of TV is not able to recover the fine details and textures, and often suffers from undesirable staircase artifact. To reduce these effects, this letter presents an improved TV based image CS recovery algorithm by introducing a new nonlocal regularization constraint into CS optimization problem. The nonlocal regularization is built on the well known nonlocal means (NLM) filtering and takes advantage of self-similarity in images, which helps to suppress the staircase effect and restore the fine details. Furthermore, an efficient augmented Lagrangian based algorithm is developed to solve the above combined TV and nonlocal regularization constrained problem. Experimental results demonstrate that the proposed algorithm achieves significant performance improvements over the state-of-the-art TV based algorithm in both PSNR and visual perception.

Ruiqin Xiong

What is connected

Connect this record

See the researcher in context

Building this map preview

11 published item(s)

Contrastive and Selective Hidden Embeddings for Medical Image Segmentation

Cross-Block Difference Guided Fast CU Partition for VVC Intra Coding

HerosNet: Hyperspectral Explicable Reconstruction and Optimal Sampling Deep Network for Snapshot Compressive Imaging

Optical Flow Estimation for Spiking Camera

Spatio-Temporal Recurrent Networks for Event-Based Optical Flow Estimation

Tree-Structured Data Clustering-Driven Neural Network for Intra Prediction in Video Coding

Efficient Multiple Line-Based Intra Prediction for HEVC

Image Restoration Using Joint Statistical Modeling in Space-Transform Domain

Exploiting Image Local And Nonlocal Consistency For Mixed Gaussian-Impulse Noise Removal

Image Super-Resolution via Dual-Dictionary Learning And Sparse Representation

Improved Total Variation based Image Compressive Sensing Recovery by Nonlocal Regularization