Source author record

Cagri Toraman

Cagri Toraman appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computation and Language · Social and Information Networks

Catalog footprint

What is connected

3 works

2 topics

4 close collaborators

Preprint · 2026 · arXiv

OpenEthics: A Comprehensive Ethical Evaluation of Open-Source Generative Large Language Models

Generative large language models present significant potential but also raise critical ethical concerns, including issues of safety, fairness, robustness, and reliability. Most existing ethical studies, however, are limited by their narrow focus, a lack of language diversity, and an evaluation of a restricted set of models. To address these gaps, we present a broad ethical evaluation of 29 recent open-source LLMs using a novel dataset that assesses four key ethical dimensions: robustness, reliability, safety, and fairness. Our analysis includes both a high-resource language, English, and a low-resource language, Turkish, providing a comprehensive assessment and a guide for safer model development. Using an LLM-as-a-Judge methodology, our experimental results indicate that many open-source models demonstrate strong performance in safety, fairness, and robustness, while reliability remains a key concern. Ethical evaluation shows cross-linguistic consistency, and larger models generally exhibit better ethical performance. We also show that jailbreak templates are ineffective for most of the open-source models examined in this study. We share all materials including data and scripts at https://github.com/metunlp/openethics

Preprint · 2022 · arXiv

BlackLivesMatter 2020: An Analysis of Deleted and Suspended Users in Twitter

After George Floyd's death in May 2020, the volume of discussion on social media increased dramatically. A series of protests, known as the 2020 BlackLivesMatter movement, followed this tragic event. Eventually, many user accounts were deleted by their owners or suspended for violating the rules of social media platforms. In this study, we analyze what happened on Twitter before and after the trigger event with respect to deleted and suspended users. We create a novel dataset that includes approximately 500k users sharing 20m tweets, half of whom actively participated in the 2020 BlackLivesMatter discussion, and some of whom were later deleted or suspended. We particularly examine the factors behind undesirable behavior in terms of spamming, negative language, hate speech, and misinformation spread. We find that users who participated in the 2020 BlackLivesMatter discussion have more negative and undesirable tweets than users who did not. Furthermore, the number of new accounts on Twitter increased significantly after the trigger event occurred, yet new users are more inclined to post undesirable tweets than older ones.

Preprint · 2022 · arXiv

Large-Scale Hate Speech Detection with Cross-Domain Transfer

The performance of hate speech detection models relies on the datasets on which the models are trained. Existing datasets are mostly prepared with a limited number of instances or hate domains that define hate topics. This hinders large-scale analysis and transfer learning with respect to hate domains. In this study, we construct large-scale tweet datasets for hate speech detection in English and a low-resource language, Turkish, consisting of 100k human-labeled tweets each. Our datasets are designed to have an equal number of tweets distributed over five domains. The experimental results, supported by statistical tests, show that Transformer-based language models outperform conventional bag-of-words and neural models by at least 5% in English and 10% in Turkish for large-scale hate speech detection. The performance also scales across training sizes: 98% of the performance in English, and 97% in Turkish, is recovered when 20% of training instances are used. We further examine the generalization ability of cross-domain transfer among hate domains. We show that, on average, 96% of the performance of a target domain is recovered by other domains for English, and 92% for Turkish. Gender and religion generalize more successfully to other domains, while sports generalizes worst.
