Source author record

Qichao Ying

Qichao Ying appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Multimedia Artificial Intelligence

Catalog footprint

What is connected

9works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2024arXiv

Fact-checking based fake news detection: a review

This paper reviews and summarizes the research results on fact-based fake news from the perspectives of tasks and problems, algorithm strategies, and datasets. First, the paper systematically explains the task definition and core problems of fact-based fake news detection. Second, the paper summarizes the existing detection methods based on the algorithm principles. Third, the paper analyzes the classic and newly proposed datasets in the field, and summarizes the experimental results on each dataset. Finally, the paper summarizes the advantages and disadvantages of existing methods, proposes several challenges that methods in this field may face, and looks forward to the next stage of research. It is hoped that this paper will provide reference for subsequent work in the field.

preprint2024arXiv

From Covert Hiding to Visual Editing: Robust Generative Video Steganography

Traditional video steganography methods are based on modifying the covert space for embedding, whereas we propose an innovative approach that embeds secret message within semantic feature for steganography during the video editing process. Although existing traditional video steganography methods display a certain level of security and embedding capacity, they lack adequate robustness against common distortions in online social networks (OSNs). In this paper, we introduce an end-to-end robust generative video steganography network (RoGVS), which achieves visual editing by modifying semantic feature of videos to embed secret message. We employ face-swapping scenario to showcase the visual editing effects. We first design a secret message embedding module to adaptively hide secret message into the semantic feature of videos. Extensive experiments display that the proposed RoGVS method applied to facial video datasets demonstrate its superiority over existing video and image steganography techniques in terms of both robustness and capacity.

preprint2024arXiv

PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning

Deceptive images can be shared in seconds with social networking services, posing substantial risks. Tampering traces, such as boundary artifacts and high-frequency information, have been significantly emphasized by massive networks in the Image Manipulation Localization (IML) field. However, they are prone to image post-processing operations, which limit the generalization and robustness of existing methods. We present a novel Prompt-IML framework. We observe that humans tend to discern the authenticity of an image based on both semantic and high-frequency information, inspired by which, the proposed framework leverages rich semantic knowledge from pre-trained visual foundation models to assist IML. We are the first to design a framework that utilizes visual foundation models specially for the IML task. Moreover, we design a Feature Alignment and Fusion module to align and fuse features of semantic features with high-frequency features, which aims at locating tampered regions from multiple perspectives. Experimental results demonstrate that our model can achieve better performance on eight typical fake image datasets and outstanding robustness.

preprint2022arXiv

A DTCWT-SVD Based Video Watermarking resistant to frame rate conversion

Videos can be easily tampered, copied and redistributed by attackers for illegal and monetary usage. Such behaviors severely jeopardize the interest of content owners. Despite huge efforts made in digital video watermarking for copyright protection, typical distortions in video transmission including signal attacks, geometric attacks and temporal synchronization attacks can still easily erase the embedded signal. Among them, temporal synchronization attacks which include frame dropping, frame insertion and frame rate conversion is one of the most prevalent attacks. To address this issue, we present a new video watermarking based on joint Dual-Tree Cosine Wavelet Transformation (DTCWT) and Singular Value Decomposition (SVD), which is resistant to frame rate conversion. We first extract a set of candidate coefficient by applying SVD decomposition after DTCWT transform. Then, we simulate the watermark embedding by adjusting the shape of candidate coefficient. Finally, we perform group-level watermarking that includes moderate temporal redundancy to resist temporal desynchronization attacks. Extensive experimental results show that the proposed scheme is more resilient to temporal desynchronization attacks and performs better than the existing blind video watermarking schemes.

preprint2022arXiv

High-Capacity Framework for Reversible Data Hiding in Encrypted Image Using Pixel Predictions and Entropy Encoding

While the existing vacating room before encryption (VRBE) based schemes can achieve decent embedding rate, the payloads of the existing vacating room after encryption (VRAE) based schemes are relatively low. To address this issue, this paper proposes a generalized framework for high-capacity RDHEI for both VRBE and VRAE cases. First, an efficient embedding room generation algorithm (ERGA) is designed to produce large embedding room by using pixel prediction and entropy encoding. Then, we propose two RDHEI schemes, one for VRBE, another for VRAE. In the VRBE scenario, the image owner generates the embedding room with ERGA and encrypts the preprocessed image by using the stream cipher with two encryption keys. Then, the data hider locates the embedding room and embeds the encrypted additional data. In the VRAE scenario, the cover image is encrypted by an improved block modulation and permutation encryption algorithm, where the spatial redundancy in the plain-text image is largely preserved. Then, the data hider applies ERGA on the encrypted image to generate the embedding room and conducts data embedding. For both schemes, the receivers with different authentication keys can respectively conduct error-free data extraction and/or error-free image recovery. The experimental results show that the two proposed schemes outperform many state-of-the-art RDHEI arts. Besides, the schemes can ensure high security level, where the original image can be hardly discovered from the encrypted version before and after data hiding by the unauthorized user.

preprint2022arXiv

Image Generation Network for Covert Transmission in Online Social Network

Online social networks have stimulated communications over the Internet more than ever, making it possible for secret message transmission over such noisy channels. In this paper, we propose a Coverless Image Steganography Network, called CIS-Net, that synthesizes a high-quality image directly conditioned on the secret message to transfer. CIS-Net is composed of four modules, namely, the Generation, Adversarial, Extraction, and Noise Module. The receiver can extract the hidden message without any loss even the images have been distorted by JPEG compression attacks. To disguise the behaviour of steganography, we collected images in the context of profile photos and stickers and train our network accordingly. As such, the generated images are more inclined to escape from malicious detection and attack. The distinctions from previous image steganography methods are majorly the robustness and losslessness against diverse attacks. Experiments over diverse public datasets have manifested the superior ability of anti-steganalysis.

preprint2022arXiv

Multimodal Fake News Detection via CLIP-Guided Learning

Multimodal fake news detection has attracted many research interests in social forensics. Many existing approaches introduce tailored attention mechanisms to guide the fusion of unimodal features. However, how the similarity of these features is calculated and how it will affect the decision-making process in FND are still open questions. Besides, the potential of pretrained multi-modal feature learning models in fake news detection has not been well exploited. This paper proposes a FND-CLIP framework, i.e., a multimodal Fake News Detection network based on Contrastive Language-Image Pretraining (CLIP). Given a targeted multimodal news, we extract the deep representations from the image and text using a ResNet-based encoder, a BERT-based encoder and two pair-wise CLIP encoders. The multimodal feature is a concatenation of the CLIP-generated features weighted by the standardized cross-modal similarity of the two modalities. The extracted features are further processed for redundancy reduction before feeding them into the final classifier. We introduce a modality-wise attention module to adaptively reweight and aggregate the features. We have conducted extensive experiments on typical fake news datasets. The results indicate that the proposed framework has a better capability in mining crucial features for fake news detection. The proposed FND-CLIP can achieve better performances than previous works, i.e., 0.7\%, 6.8\% and 1.3\% improvements in overall accuracy on Weibo, Politifact and Gossipcop, respectively. Besides, we justify that CLIP-based learning can allow better flexibility on multimodal feature selection.

preprint2022arXiv

Robust Watermarking for Video Forgery Detection with Improved Imperceptibility and Robustness

Videos are prone to tampering attacks that alter the meaning and deceive the audience. Previous video forgery detection schemes find tiny clues to locate the tampered areas. However, attackers can successfully evade supervision by destroying such clues using video compression or blurring. This paper proposes a video watermarking network for tampering localization. We jointly train a 3D-UNet-based watermark embedding network and a decoder that predicts the tampering mask. The perturbation made by watermark embedding is close to imperceptible. Considering that there is no off-the-shelf differentiable video codec simulator, we propose to mimic video compression by ensembling simulation results of other typical attacks, e.g., JPEG compression and blurring, as an approximation. Experimental results demonstrate that our method generates watermarked videos with good imperceptibility and robustly and accurately locates tampered areas within the attacked version.

preprint2022arXiv

RWN: Robust Watermarking Network for Image Cropping Localization

Image cropping can be maliciously used to manipulate the layout of an image and alter the underlying meaning. Previous image crop detection schemes only predicts whether an image has been cropped, ignoring which part of the image is cropped. This paper presents a novel robust watermarking network (RWN) for image crop localization. We train an anti-crop processor (ACP) that embeds a watermark into a target image. The visually indistinguishable protected image is then posted on the social network instead of the original image. At the recipient's side, ACP extracts the watermark from the attacked image, and we conduct feature matching on the original and extracted watermark to locate the position of the crop in the original image plane. We further extend our scheme to detect tampering attack on the attacked image. Besides, we explore a simple yet efficient method (JPEG-Mixup) to improve the generalization of JPEG robustness. According to our comprehensive experiments, RWN is the first to provide high-accuracy and robust image crop localization. Besides, the accuracy of tamper detection is comparable with many state-of-the-art passive-based methods.

Qichao Ying

What is connected

Connect this record

See the researcher in context

Building this map preview

9 published item(s)

Fact-checking based fake news detection: a review

From Covert Hiding to Visual Editing: Robust Generative Video Steganography

PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning

A DTCWT-SVD Based Video Watermarking resistant to frame rate conversion

High-Capacity Framework for Reversible Data Hiding in Encrypted Image Using Pixel Predictions and Entropy Encoding

Image Generation Network for Covert Transmission in Online Social Network

Multimodal Fake News Detection via CLIP-Guided Learning

Robust Watermarking for Video Forgery Detection with Improved Imperceptibility and Robustness

RWN: Robust Watermarking Network for Image Cropping Localization