Researcher profile

Yangming Zhou

Yangming Zhou contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2024arXiv

Fact-checking based fake news detection: a review

This paper reviews and summarizes the research results on fact-based fake news from the perspectives of tasks and problems, algorithm strategies, and datasets. First, the paper systematically explains the task definition and core problems of fact-based fake news detection. Second, the paper summarizes the existing detection methods based on the algorithm principles. Third, the paper analyzes the classic and newly proposed datasets in the field, and summarizes the experimental results on each dataset. Finally, the paper summarizes the advantages and disadvantages of existing methods, proposes several challenges that methods in this field may face, and looks forward to the next stage of research. It is hoped that this paper will provide reference for subsequent work in the field.

preprint2022arXiv

Frequent Itemset-driven Search for Finding Minimum Node Separators in Complex Networks

Finding an optimal set of critical nodes in a complex network has been a long-standing problem in the fields of both artificial intelligence and operations research. Potential applications include epidemic control, network security, carbon emission monitoring, emergence response, drug design, and vulnerability assessment. In this work, we consider the problem of finding a minimal node separator whose removal separates a graph into multiple different connected components with fewer than a limited number of vertices in each component. To solve it, we propose a frequent itemset-driven search approach, which integrates the concept of frequent itemset mining in data mining into the well-known memetic search framework. Starting from a high-quality population built by the solution construction and population repair procedures, it iteratively employs the frequent itemset recombination operator (to generate promising offspring solution based on itemsets that frequently occur in high-quality solutions), tabu search-based simulated annealing (to find high-quality local optima), population repair procedure (to modify the population), and rank-based population management strategy (to guarantee a healthy population). Extensive evaluations on 50 widely used benchmark instances show that it significantly outperforms state-of-the-art algorithms. In particular, it discovers 29 new upper bounds and matches 18 previous best-known bounds. Finally, experimental analyses are performed to confirm the effectiveness of key algorithmic modules of the proposed method.

preprint2022arXiv

Heuristic Search for Rank Aggregation with Application to Label Ranking

Rank aggregation aims to combine the preference rankings of a number of alternatives from different voters into a single consensus ranking. As a useful model for a variety of practical applications, however, it is a computationally challenging problem. In this paper, we propose an effective hybrid evolutionary ranking algorithm to solve the rank aggregation problem with both complete and partial rankings. The algorithm features a semantic crossover based on concordant pairs and a late acceptance local search reinforced by an efficient incremental evaluation technique. Experiments are conducted to assess the algorithm, indicating a highly competitive performance on benchmark instances compared with state-of-the-art algorithms. To demonstrate its practical usefulness, the algorithm is applied to label ranking, which is an important machine learning task.

preprint2022arXiv

Multimodal Fake News Detection via CLIP-Guided Learning

Multimodal fake news detection has attracted many research interests in social forensics. Many existing approaches introduce tailored attention mechanisms to guide the fusion of unimodal features. However, how the similarity of these features is calculated and how it will affect the decision-making process in FND are still open questions. Besides, the potential of pretrained multi-modal feature learning models in fake news detection has not been well exploited. This paper proposes a FND-CLIP framework, i.e., a multimodal Fake News Detection network based on Contrastive Language-Image Pretraining (CLIP). Given a targeted multimodal news, we extract the deep representations from the image and text using a ResNet-based encoder, a BERT-based encoder and two pair-wise CLIP encoders. The multimodal feature is a concatenation of the CLIP-generated features weighted by the standardized cross-modal similarity of the two modalities. The extracted features are further processed for redundancy reduction before feeding them into the final classifier. We introduce a modality-wise attention module to adaptively reweight and aggregate the features. We have conducted extensive experiments on typical fake news datasets. The results indicate that the proposed framework has a better capability in mining crucial features for fake news detection. The proposed FND-CLIP can achieve better performances than previous works, i.e., 0.7\%, 6.8\% and 1.3\% improvements in overall accuracy on Weibo, Politifact and Gossipcop, respectively. Besides, we justify that CLIP-based learning can allow better flexibility on multimodal feature selection.

preprint2022arXiv

Robust Watermarking for Video Forgery Detection with Improved Imperceptibility and Robustness

Videos are prone to tampering attacks that alter the meaning and deceive the audience. Previous video forgery detection schemes find tiny clues to locate the tampered areas. However, attackers can successfully evade supervision by destroying such clues using video compression or blurring. This paper proposes a video watermarking network for tampering localization. We jointly train a 3D-UNet-based watermark embedding network and a decoder that predicts the tampering mask. The perturbation made by watermark embedding is close to imperceptible. Considering that there is no off-the-shelf differentiable video codec simulator, we propose to mimic video compression by ensembling simulation results of other typical attacks, e.g., JPEG compression and blurring, as an approximation. Experimental results demonstrate that our method generates watermarked videos with good imperceptibility and robustly and accurately locates tampered areas within the attacked version.

preprint2020arXiv

Cost-Quality Adaptive Active Learning for Chinese Clinical Named Entity Recognition

Clinical Named Entity Recognition (CNER) aims to automatically identity clinical terminologies in Electronic Health Records (EHRs), which is a fundamental and crucial step for clinical research. To train a high-performance model for CNER, it usually requires a large number of EHRs with high-quality labels. However, labeling EHRs, especially Chinese EHRs, is time-consuming and expensive. One effective solution to this is active learning, where a model asks labelers to annotate data which the model is uncertain of. Conventional active learning assumes a single labeler that always replies noiseless answers to queried labels. However, in real settings, multiple labelers provide diverse quality of annotation with varied costs and labelers with low overall annotation quality can still assign correct labels for some specific instances. In this paper, we propose a Cost-Quality Adaptive Active Learning (CQAAL) approach for CNER in Chinese EHRs, which maintains a balance between the annotation quality, labeling costs, and the informativeness of selected instances. Specifically, CQAAL selects cost-effective instance-labeler pairs to achieve better annotation quality with lower costs in an adaptive manner. Computational results on the CCKS-2017 Task 2 benchmark dataset demonstrate the superiority and effectiveness of the proposed CQAAL.

preprint2020arXiv

NE-LP: Normalized Entropy and Loss Prediction based Sampling for Active Learning in Chinese Word Segmentation on EHRs

Electronic Health Records (EHRs) in hospital information systems contain patients' diagnosis and treatments, so EHRs are essential to clinical data mining. Of all the tasks in the mining process, Chinese Word Segmentation (CWS) is a fundamental and important one, and most state-of-the-art methods greatly rely on large-scale of manually-annotated data. Since annotation is time-consuming and expensive, efforts have been devoted to techniques, such as active learning, to locate the most informative samples for modeling. In this paper, we follow the trend and present an active learning method for CWS in EHRs. Specically, a new sampling strategy combining Normalized Entropy with Loss Prediction (NE-LP) is proposed to select the most representative data. Meanwhile, to minimize the computational cost of learning, we propose a joint model including a word segmenter and a loss prediction model. Furthermore, to capture interactions between adjacent characters, bigram features are also applied in the joint model. To illustrate the effectiveness of NE-LP, we conducted experiments on EHRs collected from the Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine. The results demonstrate that NE-LP consistently outperforms conventional uncertainty-based sampling strategies for active learning in CWS.