Source author record

Rui Zeng

Rui Zeng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Computer Vision; Systems and Control; Machine Learning; Artificial Intelligence; eess.IV; eess.SP

Catalog footprint: 14 works, 6 topics, 4 close collaborators


Preprint, 2024, arXiv

Enhancing Automatic Modulation Recognition through Robust Global Feature Extraction

Automatic Modulation Recognition (AMR) plays a crucial role in wireless communication systems. Deep learning AMR strategies have achieved tremendous success in recent years. Modulated signals exhibit long temporal dependencies, and extracting global features is crucial in identifying modulation schemes. Traditionally, human experts analyze patterns in constellation diagrams to classify modulation schemes. Classical convolution-based networks, due to their limited receptive fields, excel at extracting local features but struggle to capture global relationships. To address this limitation, we introduce a novel hybrid deep framework named TLDNN, which incorporates the architectures of the transformer and long short-term memory (LSTM). We utilize the self-attention mechanism of the transformer to model the global correlations in signal sequences while employing LSTM to enhance the capture of temporal dependencies. To mitigate the impact of factors such as RF fingerprint features and channel characteristics on model generalization, we propose a data augmentation strategy known as segment substitution (SS) to enhance the model's robustness to modulation-related features. Experimental results on widely used datasets demonstrate that our method achieves state-of-the-art performance and exhibits significant advantages in terms of complexity. Our proposed framework serves as a foundational backbone that can be extended to different datasets. We have verified the effectiveness of our augmentation approach in enhancing the generalization of the models, particularly in few-shot scenarios. Code is available at https://github.com/AMR-Master/TLDNN.

Preprint, 2020, arXiv

Adversarial Pulmonary Pathology Translation for Pairwise Chest X-ray Data Augmentation

Recent works show that Generative Adversarial Networks (GANs) can be successfully applied to chest X-ray data augmentation for lung disease recognition. However, the implausible and distorted pathology features generated by a less-than-perfect generator may lead to wrong clinical decisions. Why not keep the original pathology region? We propose a novel approach that allows our generative model to generate high-quality, plausible images that contain undistorted pathology areas. The main idea is to design a training scheme based on an image-to-image translation network to introduce variations of new lung features around the pathology ground-truth area. Moreover, our model is able to leverage both annotated disease images and unannotated healthy lung images for the purpose of generation. We demonstrate the effectiveness of our model on two tasks: (i) we invite certified radiologists to assess the quality of the generated synthetic images against real images and images from other state-of-the-art generative models, and (ii) we use the generated images as data augmentation to improve the performance of disease localisation.

Preprint, 2020, arXiv

Deep Auto-Encoders with Sequential Learning for Multimodal Dimensional Emotion Recognition

Multimodal dimensional emotion recognition has drawn great attention from the affective computing community, and numerous schemes have been extensively investigated, making significant progress in this area. However, several questions remain unanswered for most existing approaches, including: (i) how to simultaneously learn compact yet representative features from multimodal data, (ii) how to effectively capture complementary features from multimodal streams, and (iii) how to perform all the tasks in an end-to-end manner. To address these challenges, in this paper, we propose a novel deep neural network architecture consisting of a two-stream auto-encoder and a long short-term memory (LSTM) network for effectively integrating visual and audio signal streams for emotion recognition. To validate the robustness of our proposed architecture, we carry out extensive experiments on the multimodal emotion-in-the-wild dataset RECOLA. Experimental results show that the proposed method achieves state-of-the-art recognition performance and surpasses existing schemes by a significant margin.

Preprint, 2020, arXiv

Joint Deep Cross-Domain Transfer Learning for Emotion Recognition

Deep learning has been applied to achieve significant progress in emotion recognition. Despite such substantial progress, existing approaches are still hindered by insufficient training data, and the resulting models do not generalize well under mismatched conditions. To address this challenge, we propose a learning strategy which jointly transfers the knowledge learned from rich datasets to source-poor datasets. Our method is also able to learn cross-domain features which lead to improved recognition performance. To demonstrate the robustness of our proposed framework, we conducted experiments on three benchmark emotion datasets including eNTERFACE, SAVEE, and EMODB. Experimental results show that the proposed method surpassed state-of-the-art transfer learning schemes by a significant margin.

Preprint, 2020, arXiv

MTRNet++: One-stage Mask-based Scene Text Eraser

A precise, controllable, interpretable and easily trainable text removal approach is necessary for both user-specific and large-scale text removal applications. To achieve this, we propose a one-stage mask-based text inpainting network, MTRNet++. It has a novel architecture that includes mask-refine, coarse-inpainting and fine-inpainting branches, and attention blocks. With this architecture, MTRNet++ can remove text either with or without an external mask. It achieves state-of-the-art results on both the Oxford and SCUT datasets without using external ground-truth masks. The results of ablation studies demonstrate that the proposed multi-branch architecture with attention blocks is effective and essential. It also demonstrates controllability and interpretability.

Preprint, 2019, arXiv

MTRNet: A Generic Scene Text Eraser

Text removal algorithms have been proposed for uni-lingual scripts with regular shapes and layouts. However, to the best of our knowledge, a generic text removal method which is able to remove all or user-specified text regions regardless of font, script, language or shape is not available. Developing such a generic text eraser for real scenes is a challenging task, since it inherits all the challenges of multi-lingual and curved text detection and inpainting. To fill this gap, we propose a mask-based text removal network (MTRNet). MTRNet is a conditional generative adversarial network (cGAN) with an auxiliary mask. The introduced auxiliary mask not only makes the cGAN a generic text eraser, but also enables stable training and early convergence on a challenging large-scale synthetic dataset, initially proposed for text detection in real scenes. Moreover, MTRNet achieves state-of-the-art results on several real-world datasets including ICDAR 2013, ICDAR 2017 MLT, and CTW1500, without being explicitly trained on this data, outperforming previous state-of-the-art methods trained directly on these datasets.

Preprint, 2015, arXiv

Color Image Classification via Quaternion Principal Component Analysis Network

The Principal Component Analysis Network (PCANet), which is one of the recently proposed deep learning architectures, achieves state-of-the-art classification accuracy on various databases. However, the performance of PCANet may be degraded when dealing with color images. In this paper, a Quaternion Principal Component Analysis Network (QPCANet), which is an extension of PCANet, is proposed for color image classification. Compared to PCANet, the proposed QPCANet takes into account the spatial distribution information of color images and ensures a larger amount of intra-class invariance of color images. Experiments conducted on different color image datasets such as Caltech-101, UC Merced Land Use, Georgia Tech face and CUReT have revealed that the proposed QPCANet achieves higher classification accuracy than PCANet.

Preprint, 2015, arXiv

Error Gradient-based Variable-Lp Norm Constraint LMS Algorithm for Sparse System Identification

Sparse adaptive filtering has gained much attention due to its wide applicability in the field of signal processing. Among the main algorithm families, sparse norm constraint adaptive filters have developed rapidly in recent years. However, when applied to system identification, most prior work in sparse norm constraint adaptive filtering has difficulty adapting to the sparsity of the systems to be identified. To address this problem, we propose a novel variable p-norm constraint least mean square (LMS) algorithm, which serves as a variant of the conventional Lp-LMS algorithm established for sparse system identification. The parameter p is iteratively adjusted by the gradient descent method applied to the instantaneous square error. Numerical simulations show that this new approach achieves better performance than the traditional Lp-LMS and LMS algorithms in terms of steady-state error and convergence rate.

Preprint, 2015, arXiv

Gradient Compared Lp-LMS Algorithms for Sparse System Identification

In this paper, we propose two novel p-norm penalty least mean square (Lp-LMS) algorithms as supplements to the conventional Lp-LMS algorithm recently established for sparse adaptive filtering. A gradient comparator is employed to selectively apply the zero attractor of the p-norm constraint to only those taps that have the same polarity as the gradient of the squared instantaneous error, which leads to the newly proposed gradient compared p-norm constraint LMS algorithm (LpGC-LMS). We show, theoretically and experimentally, that the LpGC-LMS can achieve lower mean square error than the standard Lp-LMS algorithm. To further improve the performance of the filter, the LpNGC-LMS algorithm is derived using a new gradient comparator which takes the sign-smoothed version of the previous one. The performance of the LpNGC-LMS is superior to that of the LpGC-LMS in theory and in simulations. Moreover, these two comparators can be easily applied to other norm constraint LMS algorithms to derive new approaches for sparse adaptive filtering. The numerical simulation results show that the two proposed algorithms achieve better performance than the standard LMS algorithm and Lp-LMS algorithm in terms of convergence rate and steady-state behavior in sparse system identification settings.
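The gradient comparator described in this abstract can be illustrated with a single filter update. The following is a minimal sketch, not the authors' implementation: it assumes the gradient is taken with respect to the tap weights (so it equals -2 e(n) x(n-i) for tap i) and borrows a commonly used regularized form of the p-norm zero attractor. The function name `gradient_compared_step` and the parameters `kappa` and `eps` are hypothetical.

```python
import numpy as np

def gradient_compared_step(w, u, d_n, mu=0.01, kappa=5e-5, p=0.5, eps=0.05):
    """One LMS update where the p-norm zero attractor is gated per tap.

    w   : current tap weights
    u   : current input vector (newest sample first)
    d_n : desired output sample
    """
    e = d_n - w @ u                       # instantaneous error
    grad = -2.0 * e * u                   # gradient of e^2 w.r.t. each tap (assumed convention)

    # Assumed zero-attractor form: derivative of ||w||_p, regularized by eps.
    norm_p = np.sum(np.abs(w) ** p) ** (1.0 / p)
    attractor = norm_p ** (1.0 - p) * np.sign(w) / (eps + np.abs(w) ** (1.0 - p))

    # Comparator: keep the attractor only where tap and gradient share polarity.
    same_polarity = (np.sign(w) == np.sign(grad)) & (w != 0)

    w_next = w + mu * e * u - kappa * attractor * same_polarity
    return w_next, e
```

Under this reading, the attractor acts only on taps that the plain LMS step is already moving toward zero, so taps that the error gradient is pushing away from zero are left unpenalized.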

Preprint, 2015, arXiv

Kernel principal component analysis network for image classification

In order to classify nonlinear features with a linear classifier and improve classification accuracy, a deep learning network named kernel principal component analysis network (KPCANet) is proposed. First, the data are mapped into a higher-dimensional space with kernel principal component analysis to make them linearly separable. Then, a two-layer KPCANet is built to obtain the principal components of the image. Finally, the principal components are classified with a linear classifier. Experimental results show that the proposed KPCANet is effective in face recognition, object recognition and handwritten digit recognition, and that it generally outperforms the principal component analysis network (PCANet). In addition, KPCANet is invariant to illumination and stable to occlusion and slight deformation.

Preprint, 2015, arXiv

p Norm Constraint Leaky LMS Algorithm for Sparse System Identification

This paper proposes a new leaky least mean square (leaky LMS, LLMS) algorithm in which a norm penalty is introduced to force the solution to be sparse in the application of system identification. The leaky LMS algorithm was derived because the performance of the standard LMS algorithm deteriorates when the input is highly correlated. However, neither of them takes sparsity information into account to yield better behavior. As a modification of the LLMS algorithm, the proposed algorithm, named Lp-LLMS, incorporates a p-norm penalty into the cost function of the LLMS to obtain a shrinkage in the weight update equation, which then enhances the performance of the filter in system identification settings, especially when the impulse response is sparse. The simulation results verify that the proposed algorithm improves the performance of the filter in sparse system settings in the presence of noisy input signals.
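To make the idea of adding a p-norm shrinkage term to the leaky LMS update concrete, here is a minimal sketch under stated assumptions. The leaky LMS part, w(n+1) = (1 - mu*leak) w(n) + mu e(n) x(n), is the standard form. The shrinkage term uses a commonly cited regularized gradient of the p-norm and may differ from the paper's exact update. The function name `lp_leaky_lms` and the parameters `leak`, `kappa` and `eps` are hypothetical.

```python
import numpy as np

def lp_leaky_lms(x, d, num_taps, mu=0.01, leak=1e-4, kappa=5e-5, p=0.5, eps=0.05):
    """Identify an FIR system from input x and desired output d.

    Returns the final tap estimate and the error sequence.
    """
    w = np.zeros(num_taps)
    errors = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        u = x[n - num_taps + 1:n + 1][::-1]   # input vector, newest sample first
        e = d[n] - w @ u                      # instantaneous error

        # Assumed shrinkage (zero attractor): regularized gradient of ||w||_p.
        norm_p = np.sum(np.abs(w) ** p) ** (1.0 / p)
        attractor = norm_p ** (1.0 - p) * np.sign(w) / (eps + np.abs(w) ** (1.0 - p))

        # Leaky LMS update plus the sparsity-promoting shrinkage term.
        w = (1.0 - mu * leak) * w + mu * e * u - kappa * attractor
        errors[n] = e
    return w, errors
```

A typical use is to drive an unknown sparse impulse response with white noise, pass the input and the noisy output to `lp_leaky_lms`, and compare the returned taps with the true response. Setting `kappa=0` recovers plain leaky LMS, and additionally setting `leak=0` recovers standard LMS.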

Preprint, 2015, arXiv

p-norm-like Constraint Leaky LMS Algorithm for Sparse System Identification

In this paper, we propose a novel leaky least mean square (leaky LMS, LLMS) algorithm which employs a p-norm-like constraint to force the solution to be sparse in the application of system identification. As an extension of the LMS algorithm, which is the most widely used adaptive filtering technique, the LLMS algorithm was proposed decades ago to address the deteriorated performance of the standard LMS algorithm with highly correlated input. However, neither of them considers sparsity information to achieve better behavior. As a sparse-aware modification of the LLMS, our proposed Lplike-LLMS algorithm incorporates a p-norm-like penalty into the cost function of the LLMS to obtain a shrinkage in the weight update, which then enhances the performance in sparse system identification settings. The simulation results show that the proposed algorithm improves the performance of the filter in sparse system settings in the presence of noisy input signals.

Preprint, 2014, arXiv

Multilinear Principal Component Analysis Network for Tensor Object Classification

The recently proposed principal component analysis network (PCANet) has been shown to achieve high performance for visual content classification. In this letter, we develop a tensorial extension of PCANet, namely, the multilinear principal component analysis network (MPCANet), for tensor object classification. Compared to PCANet, the proposed MPCANet uses the spatial structure and the relationship between each dimension of tensor objects much more efficiently. Experiments were conducted on different visual content datasets including the UCF sports action video sequences database and the UCF11 database. The experimental results have revealed that the proposed MPCANet achieves higher classification accuracy than PCANet for tensor object classification.

Preprint, 2014, arXiv

Tensor object classification via multilinear discriminant analysis network

This paper proposes a multilinear discriminant analysis network (MLDANet) for the recognition of multidimensional objects, known as tensor objects. The MLDANet is a variation of the linear discriminant analysis network (LDANet) and the principal component analysis network (PCANet), both of which are recently proposed deep learning algorithms. The MLDANet consists of three parts: 1) the encoder learned by MLDA from tensor data; 2) feature maps obtained from the decoder; 3) the use of binary hashing and histograms for feature pooling. A learning algorithm for MLDANet is described. Evaluations on the UCF11 database indicate that the proposed MLDANet outperforms the PCANet, LDANet, MPCA + LDA, and MLDA in terms of classification for tensor objects.
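The third stage, binary hashing followed by histogram pooling, can be sketched as follows. This is a minimal illustration that follows the convention of the original PCANet (binarize each feature map, pack the bits into one integer code per pixel, then count codes per block). The abstract does not give the details for MLDANet, so the non-overlapping blocks, the function name `hash_and_histogram` and the `block` parameter are assumptions.

```python
import numpy as np

def hash_and_histogram(feature_maps, block=4):
    """Pool L real-valued feature maps of shape (L, H, W) into one feature vector.

    Returns the per-pixel integer codes and the concatenated block histograms.
    """
    L, H, W = feature_maps.shape

    # Binary hashing: 1 where the response is positive, 0 otherwise.
    bits = (feature_maps > 0).astype(np.int64)

    # Pack the L bits at each pixel into one integer in [0, 2**L - 1].
    weights = (2 ** np.arange(L)).reshape(L, 1, 1)
    codes = np.sum(bits * weights, axis=0)

    # Histogram pooling over non-overlapping blocks (assumed layout).
    hists = []
    for r in range(0, H - block + 1, block):
        for c in range(0, W - block + 1, block):
            patch = codes[r:r + block, c:c + block].ravel()
            hists.append(np.bincount(patch, minlength=2 ** L))
    return codes, np.concatenate(hists)
```

For three 8x8 feature maps and `block=4`, this yields four blocks with eight bins each, so a 32-dimensional feature vector whose entries sum to the 64 pixels counted.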
