Source author record

Eyup Halit Yilmaz

Eyup Halit Yilmaz appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher; unclaimed source record

Social and Information Networks; Computation and Language

Catalog footprint

What is connected

2 works

2 topics

2 close collaborators

Preprint, 2022, arXiv

BlackLivesMatter 2020: An Analysis of Deleted and Suspended Users in Twitter

After George Floyd's death in May 2020, the volume of discussion on social media increased dramatically. A series of protests, known as the 2020 BlackLivesMatter movement, followed this tragic event. Eventually, many user accounts were deleted by their owners or suspended for violating the rules of social media platforms. In this study, we analyze what happened on Twitter before and after the trigger event with respect to deleted and suspended users. We create a novel dataset that includes approximately 500k users sharing 20m tweets, half of whom actively participated in the 2020 BlackLivesMatter discussion, and some of whom were later deleted or suspended. We particularly examine the factors behind undesirable behavior in terms of spamming, negative language, hate speech, and misinformation spread. We find that users who participated in the 2020 BlackLivesMatter discussion have more negative and undesirable tweets than users who did not. Furthermore, the number of new accounts on Twitter increased significantly after the trigger event, yet new users are more inclined to post undesirable tweets than older ones.

Preprint, 2022, arXiv

Large-Scale Hate Speech Detection with Cross-Domain Transfer

The performance of hate speech detection models relies on the datasets on which the models are trained. Existing datasets are mostly prepared with a limited number of instances or of hate domains, which define hate topics. This hinders large-scale analysis and transfer learning with respect to hate domains. In this study, we construct large-scale tweet datasets for hate speech detection in English and a low-resource language, Turkish, each consisting of 100k human-labeled tweets. Our datasets are designed to have an equal number of tweets distributed over five domains. The experimental results, supported by statistical tests, show that Transformer-based language models outperform conventional bag-of-words and neural models by at least 5% in English and 10% in Turkish for large-scale hate speech detection. The performance also scales across training sizes: 98% of the performance in English, and 97% in Turkish, is recovered when 20% of the training instances are used. We further examine the generalization ability of cross-domain transfer among hate domains. We show that, on average, 96% of the performance of a target domain is recovered by other domains for English, and 92% for Turkish. Gender and religion generalize more successfully to other domains, while sports fails most.
