Source author record

Yilin Zhang

Yilin Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence, Computation and Language, astro-ph.GA, astro-ph.HE, astro-ph.IM, Cryptography and Security, gr-qc, Machine Learning, Methodology, Social and Information Networks, stat.OT

Catalog footprint

What is connected

5 works

11 topics

4 close collaborators

preprint · 2026 · arXiv

Don't Click That: Teaching Web Agents to Resist Deceptive Interfaces

Vision-language model (VLM) based web agents demonstrate impressive autonomous GUI interaction but remain vulnerable to deceptive interface elements. Existing approaches either detect deception without task integration or document attacks without proposing defenses. We formalize deception-aware web agent defense and propose DUDE (Deceptive UI Detector & Evaluator), a two-stage framework combining hybrid-reward learning with asymmetric penalties and experience summarization to distill failure patterns into transferable guidance. We introduce RUC (Real UI Clickboxes), a benchmark of 1,407 scenarios spanning four domains and deception categories. Experiments show DUDE reduces deception susceptibility by 53.8% while maintaining task performance, establishing an effective foundation for robust web agent deployment.
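The abstract mentions asymmetric penalties but does not give the reward formula. The following is only a minimal sketch of the general idea, assuming that a click on a deceptive element is penalized more heavily than an ordinary unhelpful action. The names `StepOutcome` and `asymmetric_reward` and all numeric weights are hypothetical and are not taken from DUDE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepOutcome:
    """Hypothetical summary of one agent action (not from the paper)."""
    task_success: bool       # the action advanced the task correctly
    clicked_deceptive: bool  # the action landed on a deceptive element


def asymmetric_reward(
    outcome: StepOutcome,
    success_reward: float = 1.0,     # assumed value
    miss_penalty: float = 0.2,       # assumed value
    deception_penalty: float = 1.0,  # assumed value, larger than miss_penalty
) -> float:
    """Return a scalar reward that punishes deception clicks harder than misses."""
    if outcome.clicked_deceptive:
        # Falling for a deceptive element is the costliest outcome.
        return -deception_penalty
    if outcome.task_success:
        return success_reward
    # A harmless but unhelpful action gets only a mild penalty.
    return -miss_penalty


if __name__ == "__main__":
    print(asymmetric_reward(StepOutcome(task_success=True, clicked_deceptive=False)))   # 1.0
    print(asymmetric_reward(StepOutcome(task_success=False, clicked_deceptive=False)))  # -0.2
    print(asymmetric_reward(StepOutcome(task_success=False, clicked_deceptive=True)))   # -1.0
```

The asymmetry lies in `deception_penalty` being larger than `miss_penalty`, so a policy trained on this signal is pushed to avoid deceptive clicks even at the cost of an occasional wasted step.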

preprint · 2022 · arXiv

Detecting fake news by enhanced text representation with multi-EDU-structure awareness

Since fake news poses a serious threat to society and individuals, numerous studies have addressed its detection by considering text, propagation and user profiles. Because of the data collection problem, methods based on propagation and user profiles are less applicable in the early stages. A good alternative is to detect fake news from its text as soon as it is released, and many text-based methods have been proposed, which usually use words, sentences or paragraphs as basic units. However, a word is too fine-grained a unit to express coherent information well, while a sentence or paragraph is too coarse to show specific information. Which granularity is better, and how to use it to enhance text representation for fake news detection, are two key problems. In this paper, we introduce the Elementary Discourse Unit (EDU), whose granularity lies between word and sentence, and propose a multi-EDU-structure awareness model, namely EDU4FD, to improve text representation for fake news detection. For the multi-EDU-structure awareness, we build sequence-based EDU representations and graph-based EDU representations. The former are obtained by modeling the coherence between consecutive EDUs with a TextCNN, which reflects semantic coherence. For the latter, we first extract rhetorical relations to build the EDU dependency graph, which can show the global narrative logic and help deliver the main idea truthfully. Then a Relation Graph Attention Network (RGAT) is used to obtain the graph-based EDU representation. Finally, the two EDU representations are combined into the enhanced text representation for fake news detection, using a gated recursive unit combined with a global attention mechanism. Experiments on four cross-source fake news datasets show that our model outperforms the state-of-the-art text-based methods.

preprint · 2022 · arXiv

Model-Free, Monotone Invariant and Computationally Efficient Feature Screening with Data-adaptive Threshold

Feature screening for ultrahigh-dimensional data generally proceeds in two essential steps. The first step is measuring and ranking the marginal dependence between response and covariates, and the second is determining the threshold. We develop a new screening procedure, called the SIT-BY procedure, that possesses appealing statistical properties in both steps. By employing sliced independence estimates in the measuring and ranking stage, our proposed procedure requires no model assumptions, remains invariant to monotone transformation, and achieves almost linear computational complexity. Inspired by false discovery rate (FDR) control procedures, we offer a data-adaptive threshold that benefits from the asymptotic normality of the test statistics. Under moderate conditions, we demonstrate that our procedure can asymptotically control the FDR while maintaining the sure screening property. We investigate the finite sample performance of our proposed procedure via extensive simulations and an application to a genome-wide dataset.
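The abstract does not state which FDR rule sets the threshold. As an illustration only, the sketch below applies the Benjamini-Yekutieli step-up rule to one-sided p-values computed from test statistics that are assumed to be standard normal under the null. The function name `by_select`, the one-sided convention and the default `alpha` are assumptions, and the sliced independence statistic itself is not reproduced here.

```python
import math


def by_select(z_scores: list[float], alpha: float = 0.05) -> list[int]:
    """Indices kept by a Benjamini-Yekutieli step-up rule (illustrative sketch)."""
    m = len(z_scores)
    if m == 0:
        return []
    # One-sided upper-tail p-values under an assumed N(0, 1) null.
    p_values = [0.5 * math.erfc(z / math.sqrt(2.0)) for z in z_scores]
    # Harmonic correction c(m) = 1 + 1/2 + ... + 1/m used by the BY rule.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda j: p_values[j])
    # Step-up: find the largest rank k with p_(k) <= k * alpha / (m * c(m)).
    k = 0
    for rank, j in enumerate(order, start=1):
        if p_values[j] <= rank * alpha / (m * c_m):
            k = rank
    # Keep the k features with the smallest p-values.
    return sorted(order[:k])


if __name__ == "__main__":
    print(by_select([6.0, 5.5, 0.1, -0.3, 0.2]))  # [0, 1]
```

The cutoff here depends on the observed statistics rather than on a fixed number of retained features, which is the sense in which such a threshold is data-adaptive.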

preprint · 2020 · arXiv

A Multitask Deep Learning Approach for User Depression Detection on Sina Weibo

In recent years, due to the mental burden of depression, the number of people who endanger their lives has been increasing rapidly. The online social network (OSN) provides researchers with another perspective for detecting individuals suffering from depression. However, existing machine-learning studies of depression detection still show relatively low classification performance, suggesting significant room for improvement in their feature engineering. In this paper, we manually build a large dataset on Sina Weibo (a leading OSN with the largest number of active users in the Chinese community), namely the Weibo User Depression Detection Dataset (WU3D). It includes more than 20,000 normal users and more than 10,000 depressed users, both of which are manually labeled and rechecked by professionals. By analyzing users' text, social behavior, and posted pictures, ten statistical features are derived and proposed. In the meantime, text-based word features are extracted using the popular pretrained model XLNet. Moreover, a novel deep neural network classification model, i.e., FusionNet (FN), is proposed and simultaneously trained with the above-extracted features, which are treated as multiple classification tasks. The experimental results show that FusionNet achieves the highest F1-Score of 0.9772 on the test dataset. Compared to existing studies, our proposed method has better classification performance and robustness to unbalanced training samples. Our work also provides a new way to detect depression on other OSN platforms.
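For readers unfamiliar with the metric reported above, the sketch below computes the F1-score from confusion-matrix counts and contrasts it with accuracy on an unbalanced sample. The counts are made up for illustration and are unrelated to WU3D or FusionNet. The helper names are hypothetical.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    if denom == 0:
        return 0.0
    return 2 * tp / denom


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all samples classified correctly."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("no samples")
    return (tp + tn) / total


if __name__ == "__main__":
    # Illustrative counts: 50 positive and 950 negative samples,
    # with a classifier that predicts "negative" for everyone.
    print(accuracy(tp=0, fp=0, fn=50, tn=950))  # 0.95
    print(f1_score(tp=0, fp=0, fn=50))          # 0.0
```

The all-negative classifier scores high accuracy yet zero F1, which is why F1 is the more informative figure when classes are unbalanced.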

preprint · 2015 · arXiv

Detection and localization of single-source gravitational waves with pulsar timing arrays

Pulsar timing arrays (PTAs) can be used to search for very low frequency ($10^{-9}$--$10^{-7}$ Hz) gravitational waves (GWs). In this paper we present a general method for the detection and localization of single-source GWs using PTAs. We demonstrate the effectiveness of this new method for three types of signals: monochromatic waves as expected from individual supermassive binary black holes in circular orbits, GWs from eccentric binaries and GW bursts. We also test its implementation in realistic data sets that include effects such as uneven sampling and heterogeneous data spans and measurement precision. It is shown that our method, which works in the frequency domain, performs as well as published time-domain methods. In particular, we find it equivalent to the $\mathcal{F}_{e}$-statistic for monochromatic waves. We also discuss the construction of null streams -- data streams that have null response to GWs, and the prospect of using null streams as a consistency check in the case of detected GW signals. Finally, we present sensitivities to individual supermassive binary black holes in eccentric orbits. We find that a monochromatic search that is designed for circular binaries can efficiently detect eccentric binaries with both high and low eccentricities, while a harmonic summing technique provides greater sensitivities only for binaries with moderate eccentricities.
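To put the quoted frequency band in more familiar terms, the short sketch below converts a gravitational-wave frequency to its period. It is plain unit arithmetic (period = 1 / frequency), using the Julian year of 365.25 days. The function name `period_days` is illustrative.

```python
SECONDS_PER_DAY = 86_400.0
DAYS_PER_JULIAN_YEAR = 365.25


def period_days(frequency_hz: float) -> float:
    """Period in days of a signal with the given frequency in Hz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz / SECONDS_PER_DAY


if __name__ == "__main__":
    # Upper edge of the band: 1e-7 Hz.
    print(round(period_days(1e-7), 1))                         # 115.7 (days)
    # Lower edge of the band: 1e-9 Hz.
    print(round(period_days(1e-9) / DAYS_PER_JULIAN_YEAR, 1))  # 31.7 (years)
```

The band therefore spans periods from roughly four months to roughly three decades.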