Source author record

Jeremy Blackburn

Jeremy Blackburn appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

cs.CY Social and Information Networks physics.soc-ph Human-Computer Interaction Networking and Internet Architecture Artificial Intelligence Computer Vision Cryptography and Security Distributed, Parallel, and Cluster Computing Multimedia Performance

Catalog footprint

What is connected

22works

11topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Feels Bad Man: Dissecting Automated Hateful Meme Detection Through the Lens of Facebook's Challenge

Internet memes have become a dominant method of communication; at the same time, however, they are also increasingly being used to advocate extremism and foster derogatory beliefs. Nonetheless, we do not have a firm understanding as to which perceptual aspects of memes cause this phenomenon. In this work, we assess the efficacy of current state-of-the-art multimodal machine learning models toward hateful meme detection, and in particular with respect to their generalizability across platforms. We use two benchmark datasets comprising 12,140 and 10,567 images from 4chan's "Politically Incorrect" board (/pol/) and Facebook's Hateful Memes Challenge dataset to train the competition's top-ranking machine learning models for the discovery of the most prominent features that distinguish viral hateful memes from benign ones. We conduct three experiments to determine the importance of multimodality on classification performance, the influential capacity of fringe Web communities on mainstream social platforms and vice versa, and the models' learning transferability on 4chan memes. Our experiments show that memes' image characteristics provide a greater wealth of information than its textual content. We also find that current systems developed for online detection of hate speech in memes necessitate further concentration on its visual elements to improve their interpretation of underlying cultural connotations, implying that multimodal models fail to adequately grasp the intricacies of hate speech in memes and generalize across social media platforms.

preprint2022arXiv

On Xing Tian and the Perseverance of Anti-China Sentiment Online

Sinophobia, anti-Chinese sentiment, has existed on the Web for a long time. The outbreak of COVID-19 and the extended quarantine has further amplified it. However, we lack a quantitative understanding of the cause of Sinophobia as well as how it evolves over time. In this paper, we conduct a large-scale longitudinal measurement of Sinophobia, between 2016 and 2021, on two mainstream and fringe Web communities. By analyzing 8B posts from Reddit and 206M posts from 4chan's /pol/, we investigate the origins, evolution, and content of Sinophobia. We find that, anti-Chinese content may be evoked by political events not directly related to China, e.g., the U.S. withdrawal from the Paris Agreement. And during the COVID-19 pandemic, daily usage of Sinophobic slurs has significantly increased even with the hate-speech ban policy. We also show that the semantic meaning of the words "China" and "Chinese" are shifting towards Sinophobic slurs with the rise of COVID-19 and remain the same in the pandemic period. We further use topic modeling to show the topics of Sinophobic discussion are pretty diverse and broad. We find that both Web communities share some common Sinophobic topics like ethnics, economics and commerce, weapons and military, foreign relations, etc. However, compared to 4chan's /pol/, more daily life-related topics including food, game, and stock are found in Reddit. Our finding also reveals that the topics related to COVID-19 and blaming the Chinese government are more prevalent in the pandemic period. To the best of our knowledge, this paper is the longest quantitative measurement of Sinophobia.

preprint2022arXiv

The Gospel According to Q: Understanding the QAnon Conspiracy from the Perspective of Canonical Information

The QAnon conspiracy theory claims that a cabal of (literally) blood-thirsty politicians and media personalities are engaged in a war to destroy society. By interpreting cryptic "drops" of information from an anonymous insider calling themself Q, adherents of the conspiracy theory believe that Donald Trump is leading them in an active fight against this cabal. QAnon has been covered extensively by the media, as its adherents have been involved in multiple violent acts, including the January 6th, 2021 seditious storming of the US Capitol building. Nevertheless, we still have relatively little understanding of how the theory evolved and spread on the Web, and the role played in that by multiple platforms. To address this gap, we study QAnon from the perspective of "Q" themself. We build a dataset of 4,949 canonical Q drops collected from six "aggregation sites," which curate and archive them from their original posting to anonymous and ephemeral image boards. We expose that these sites have a relatively low (overall) agreement, and thus at least some Q drops should probably be considered apocryphal. We then analyze the Q drops' contents to identify topics of discussion and find statistically significant indications that drops were not authored by a single individual. Finally, we look at how posts on Reddit are used to disseminate Q drops to wider audiences. We find that dissemination was (initially) limited to a few sub-communities and that, while heavy-handed moderation decisions have reduced the overall issue, the "gospel" of Q persists on the Web.

preprint2022arXiv

Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots

Chatbots are used in many applications, e.g., automated agents, smart home assistants, interactive characters in online games, etc. Therefore, it is crucial to ensure they do not behave in undesired manners, providing offensive or toxic responses to users. This is not a trivial task as state-of-the-art chatbot models are trained on large, public datasets openly collected from the Internet. This paper presents a first-of-its-kind, large-scale measurement of toxicity in chatbots. We show that publicly available chatbots are prone to providing toxic responses when fed toxic queries. Even more worryingly, some non-toxic queries can trigger toxic responses too. We then set out to design and experiment with an attack, ToxicBuddy, which relies on fine-tuning GPT-2 to generate non-toxic queries that make chatbots respond in a toxic manner. Our extensive experimental evaluation demonstrates that our attack is effective against public chatbot models and outperforms manually-crafted malicious queries proposed by previous work. We also evaluate three defense mechanisms against ToxicBuddy, showing that they either reduce the attack performance at the cost of affecting the chatbot's utility or are only effective at mitigating a portion of the attack. This highlights the need for more research from the computer security and online safety communities to ensure that chatbot models do not hurt their users. Overall, we are confident that ToxicBuddy can be used as an auditing tool and that our work will pave the way toward designing more effective defenses for chatbot safety.

preprint2021arXiv

"Go eat a bat, Chang!": On the Emergence of Sinophobic Behavior on Web Communities in the Face of COVID-19

The outbreak of the COVID-19 pandemic has changed our lives in unprecedented ways. In the face of the projected catastrophic consequences, many countries have enacted social distancing measures in an attempt to limit the spread of the virus. Under these conditions, the Web has become an indispensable medium for information acquisition, communication, and entertainment. At the same time, unfortunately, the Web is being exploited for the dissemination of potentially harmful and disturbing content, such as the spread of conspiracy theories and hateful speech towards specific ethnic groups, in particular towards Chinese people since COVID-19 is believed to have originated from China. In this paper, we make a first attempt to study the emergence of Sinophobic behavior on the Web during the outbreak of the COVID-19 pandemic. We collect two large-scale datasets from Twitter and 4chan's Politically Incorrect board (/pol/) over a time period of approximately five months and analyze them to investigate whether there is a rise or important differences with regard to the dissemination of Sinophobic content. We find that COVID-19 indeed drives the rise of Sinophobia on the Web and that the dissemination of Sinophobic content is a cross-platform phenomenon: it exists on fringe Web communities like \dspol, and to a lesser extent on mainstream ones like Twitter. Also, using word embeddings over time, we characterize the evolution and emergence of new Sinophobic slurs on both Twitter and /pol/. Finally, we find interesting differences in the context in which words related to Chinese people are used on the Web before and after the COVID-19 outbreak: on Twitter we observe a shift towards blaming China for the situation, while on /pol/ we find a shift towards using more (and new) Sinophobic slurs.

preprint2021arXiv

"Is it a Qoincidence?": An Exploratory Study of QAnon on Voat

Online fringe communities offer fertile grounds for users seeking and sharing ideas fueling suspicion of mainstream news and conspiracy theories. Among these, the QAnon conspiracy theory emerged in 2017 on 4chan, broadly supporting the idea that powerful politicians, aristocrats, and celebrities are closely engaged in a global pedophile ring. Simultaneously, governments are thought to be controlled by "puppet masters," as democratically elected officials serve as a fake showroom of democracy. This paper provides an empirical exploratory analysis of the QAnon community on Voat.co, a Reddit-esque news aggregator, which has captured the interest of the press for its toxicity and for providing a platform to QAnon followers. More precisely, we analyze a large dataset from /v/GreatAwakening, the most popular QAnon-related subverse (the Voat equivalent of a subreddit), to characterize activity and user engagement. To further understand the discourse around QAnon, we study the most popular named entities mentioned in the posts, along with the most prominent topics of discussion, which focus on US politics, Donald Trump, and world events. We also use word embeddings to identify narratives around QAnon-specific keywords. Our graph visualization shows that some of the QAnon-related ones are closely related to those from the Pizzagate conspiracy theory and so-called drops by "Q." Finally, we analyze content toxicity, finding that discussions on /v/GreatAwakening are less toxic than in the broad Voat community.

preprint2021arXiv

A Multi-Platform Analysis of Political News Discussion and Sharing on Web Communities

The news ecosystem has become increasingly complex, encompassing a wide range of sources with varying levels of trustworthiness, and with public commentary giving different spins to the same stories. In this paper, we present a multi-platform measurement of this ecosystem. We compile a list of 1,073 news websites and extract posts from four Web communities (Twitter, Reddit, 4chan, and Gab) that contain URLs from these sources. This yields a dataset of 38M posts containing 15M news URLs, spanning almost three years. We study the data along several axes, assessing the trustworthiness of shared news, designing a method to group news articles into stories, analyzing these stories are discussed and measuring the influence various Web communities have in that. Our analysis shows that different communities discuss different types of news, with polarized communities like Gab and /r/The_Donald subreddit disproportionately referencing untrustworthy sources. We also find that fringe communities often have a disproportionate influence on other platforms w.r.t. pushing narratives around certain news, for example about political elections, immigration, or foreign policy.

preprint2021arXiv

Dissecting the Meme Magic: Understanding Indicators of Virality in Image Memes

Despite the increasingly important role played by image memes, we do not yet have a solid understanding of the elements that might make a meme go viral on social media. In this paper, we investigate what visual elements distinguish image memes that are highly viral on social media from those that do not get re-shared, across three dimensions: composition, subjects, and target audience. Drawing from research in art theory, psychology, marketing, and neuroscience, we develop a codebook to characterize image memes, and use it to annotate a set of 100 image memes collected from 4chan's Politically Incorrect Board (/pol/). On the one hand, we find that highly viral memes are more likely to use a close-up scale, contain characters, and include positive or negative emotions. On the other hand, image memes that do not present a clear subject the viewer can focus attention on, or that include long text are not likely to be re-shared by users. We train machine learning models to distinguish between image memes that are likely to go viral and those that are unlikely to be re-shared, obtaining an AUC of 0.866 on our dataset. We also show that the indicators of virality identified by our model can help characterize the most viral memes posted on mainstream online social networks too, as our classifiers are able to predict 19 out of the 20 most popular image memes posted on Twitter and Reddit between 2016 and 2018. Overall, our analysis sheds light on what indicators characterize viral and non-viral visual content online, and set the basis for developing better techniques to create or moderate content that is more likely to catch the viewer's attention.

preprint2020arXiv

A First Look at Zoombombing

Online meeting tools like Zoom and Google Meet have become central to our professional, educational, and personal lives. This has opened up new opportunities for large scale harassment. In particular, a phenomenon known as zoombombing has emerged, in which aggressors join online meetings with the goal of disrupting them and harassing their participants. In this paper, we conduct the first data-driven analysis of calls for zoombombing attacks on social media. We identify ten popular online meeting tools and extract posts containing meeting invitations to these platforms on a mainstream social network, Twitter, and on a fringe community known for organizing coordinated attacks against online users, 4chan. We then perform manual annotation to identify posts that are calling for zoombombing attacks, and apply thematic analysis to develop a codebook to better characterize the discussion surrounding calls for zoombombing. During the first seven months of 2020, we identify over 200 calls for zoombombing between Twitter and 4chan, and analyze these calls both quantitatively and qualitatively. Our findings indicate that the vast majority of calls for zoombombing are not made by attackers stumbling upon meeting invitations or bruteforcing their meeting ID, but rather by insiders who have legitimate access to these meetings, particularly students in high school and college classes. This has important security implications, because it makes common protections against zoombombing, such as password protection, ineffective. We also find instances of insiders instructing attackers to adopt the names of legitimate participants in the class to avoid detection, making countermeasures like setting up a waiting room and vetting participants less effective. Based on these observations, we argue that the only effective defense against zoombombing is creating unique join links for each participant.

preprint2020arXiv

Raiders of the Lost Kek: 3.5 Years of Augmented 4chan Posts from the Politically Incorrect Board

This paper presents a dataset with over 3.3M threads and 134.5M posts from the Politically Incorrect board (/pol/) of the imageboard forum 4chan, posted over a period of almost 3.5 years (June 2016-November 2019). To the best of our knowledge, this represents the largest publicly available 4chan dataset, providing the community with an archive of posts that have been permanently deleted from 4chan and are otherwise inaccessible. We augment the data with a set of additional labels, including toxicity scores and the named entities mentioned in each post. We also present a statistical analysis of the dataset, providing an overview of what researchers interested in using it can expect, as well as a simple content analysis, shedding light on the most prominent discussion topics, the most popular entities mentioned, and the toxicity level of each post. Overall, we are confident that our work will motivate and assist researchers in studying and understanding 4chan, as well as its role on the greater Web. For instance, we hope this dataset may be used for cross-platform studies of social media, as well as being useful for other types of research like natural language processing. Finally, our dataset can assist qualitative work focusing on in-depth case studies of specific narratives, events, or social theories.

preprint2020arXiv

The Pushshift Reddit Dataset

Social media data has become crucial to the advancement of scientific understanding. However, even though it has become ubiquitous, just collecting large-scale social media data involves a high degree of engineering skill set and computational resources. In fact, research is often times gated by data engineering problems that must be overcome before analysis can proceed. This has resulted recognition of datasets as meaningful research contributions in and of themselves. Reddit, the so called "front page of the Internet," in particular has been the subject of numerous scientific studies. Although Reddit is relatively open to data acquisition compared to social media platforms like Facebook and Twitter, the technical barriers to acquisition still remain. Thus, Reddit's millions of subreddits, hundreds of millions of users, and hundreds of billions of comments are at the same time relatively accessible, but time consuming to collect and analyze systematically. In this paper, we present the Pushshift Reddit dataset. Pushshift is a social media data collection, analysis, and archiving platform that since 2015 has collected Reddit data and made it available to researchers. Pushshift's Reddit dataset is updated in real-time, and includes historical data back to Reddit's inception. In addition to monthly dumps, Pushshift provides computational tools to aid in searching, aggregating, and performing exploratory analysis on the entirety of the dataset. The Pushshift Reddit dataset makes it possible for social media researchers to reduce time spent in the data collection, cleaning, and storage phases of their projects.

preprint2020arXiv

The Pushshift Telegram Dataset

Messaging platforms, especially those with a mobile focus, have become increasingly ubiquitous in society. These mobile messaging platforms can have deceivingly large user bases, and in addition to being a way for people to stay in touch, are often used to organize social movements, as well as a place for extremists and other ne'er-do-well to congregate. In this paper, we present a dataset from one such mobile messaging platform: Telegram. Our dataset is made up of over 27.8K channels and 317M messages from 2.2M unique users. To the best of our knowledge, our dataset comprises the largest and most complete of its kind. In addition to the raw data, we also provide the source code used to collect it, allowing researchers to run their own data collection instance. We believe the Pushshift Telegram dataset can help researchers from a variety of disciplines interested in studying online social movements, protests, political extremism, and disinformation.

preprint2019arXiv

A Quantitative Approach to Understanding Online Antisemitism

A new wave of growing antisemitism, driven by fringe Web communities, is an increasingly worrying presence in the socio-political realm. The ubiquitous and global nature of the Web has provided tools used by these groups to spread their ideology to the rest of the Internet. Although the study of antisemitism and hate is not new, the scale and rate of change of online data has impacted the efficacy of traditional approaches to measure and understand these troubling trends. In this paper, we present a large-scale, quantitative study of online antisemitism. We collect hundreds of million posts and images from alt-right Web communities like 4chan's Politically Incorrect board (/pol/) and Gab. Using scientifically grounded methods, we quantify the escalation and spread of antisemitic memes and rhetoric across the Web. We find the frequency of antisemitic content greatly increases (in some cases more than doubling) after major political events such as the 2016 US Presidential Election and the "Unite the Right" rally in Charlottesville. We extract semantic embeddings from our corpus of posts and demonstrate how automated techniques can discover and categorize the use of antisemitic terminology. We additionally examine the prevalence and spread of the antisemitic "Happy Merchant" meme, and in particular how these fringe communities influence its propagation to more mainstream communities like Twitter and Reddit. Taken together, our results provide a data-driven, quantitative framework for understanding online antisemitism. Our methods serve as a framework to augment current qualitative efforts by anti-hate groups, providing new insights into the growth and spread of hate online.

preprint2019arXiv

Characterizing the Use of Images in State-Sponsored Information Warfare Operations by Russian Trolls on Twitter

State-sponsored organizations are increasingly linked to efforts aimed to exploit social media for information warfare and manipulating public opinion. Typically, their activities rely on a number of social network accounts they control, aka trolls, that post and interact with other users disguised as "regular" users. These accounts often use images and memes, along with textual content, in order to increase the engagement and the credibility of their posts. In this paper, we present the first study of images shared by state-sponsored accounts by analyzing a ground truth dataset of 1.8M images posted to Twitter by accounts controlled by the Russian Internet Research Agency. First, we analyze the content of the images as well as their posting activity. Then, using Hawkes Processes, we quantify their influence on popular Web communities like Twitter, Reddit, 4chan's Politically Incorrect board (/pol/), and Gab, with respect to the dissemination of images. We find that the extensive image posting activity of Russian trolls coincides with real-world events (e.g., the Unite the Right rally in Charlottesville), and shed light on their targets as well as the content disseminated via images. Finally, we show that the trolls were more effective in disseminating politics-related imagery than other images.

preprint2016arXiv

C3PO: Computation Congestion Control (PrOactive) - an algorithm for dynamic diffusion of ephemeral in-network services

There is an obvious trend that more and more data and computation are migrating into networks nowadays. Combining mature virtualization technologies with service-centric net- working, we are entering into an era where countless services reside in an ISP network to provide low-latency access. Such services are often computation intensive and are dynamically created and destroyed on demands everywhere in the network to perform various tasks. Consequently, these ephemeral in-network services introduce a new type of congestion in the network which we refer to as "computation congestion". The service load need to be effectively distributed on different nodes in order to maintain the funtionality and responsiveness of the network, which calls for a new design rather than reusing the centralised scheduler designed for cloud-based services. In this paper, we study both passive and proactive control strategies, based on the proactive control we further propose a fully distributed solution which is low complexity, adaptive, and responsive to network dynamics.

preprint2015arXiv

Enabling Social Applications via Decentralized Social Data Management

An unprecedented information wealth produced by online social networks, further augmented by location/collocation data, is currently fragmented across different proprietary services. Combined, it can accurately represent the social world and enable novel socially-aware applications. We present Prometheus, a socially-aware peer-to-peer service that collects social information from multiple sources into a multigraph managed in a decentralized fashion on user-contributed nodes, and exposes it through an interface implementing non-trivial social inferences while complying with user-defined access policies. Simulations and experiments on PlanetLab with emulated application workloads show the system exhibits good end-to-end response time, low communication overhead and resilience to malicious attacks.

preprint2015arXiv

Exploring Cyberbullying and Other Toxic Behavior in Team Competition Online Games

In this work we explore cyberbullying and other toxic behavior in team competition online games. Using a dataset of over 10 million player reports on 1.46 million toxic players along with corresponding crowdsourced decisions, we test several hypotheses drawn from theories explaining toxic behavior. Besides providing large-scale, empirical based understanding of toxic behavior, our work can be used as a basis for building systems to detect, prevent, and counter-act toxic behavior.

preprint2015arXiv

To HTTP/2, or Not To HTTP/2, That Is The Question

As of February, 2015, HTTP/2, the update to the 16-year-old HTTP 1.1, is officially complete. HTTP/2 aims to improve the Web experience by solving well-known problems (e.g., head of line blocking and redundant headers), while introducing new features (e.g., server push and content priority). On paper HTTP/2 represents the future of the Web. Yet, it is unclear whether the Web itself will, and should, hop on board. To shed some light on these questions, we built a measurement platform that monitors HTTP/2 adoption and performance across the Alexa top 1 million websites on a daily basis. Our system is live and up-to-date results can be viewed at http://isthewebhttp2yet.com/. In this paper, we report our initial findings from a 6 month measurement campaign (November 2014 - May 2015). We find 13,000 websites reporting HTTP/2 support, but only 600, mostly hosted by Google and Twitter, actually serving content. In terms of speed, we find no significant benefits from HTTP/2 under stable network conditions. More benefits appear in a 3G network where current Web development practices make HTTP/2 more resilient to losses and delay variation than previously believed.

preprint2014arXiv

Linguistic Analysis of Toxic Behavior in an Online Video Game

In this paper we explore the linguistic components of toxic behavior by using crowdsourced data from over 590 thousand cases of accused toxic players in a popular match-based competition game, League of Legends. We perform a series of linguistic analyses to gain a deeper understanding of the role communication plays in the expression of toxic behavior. We characterize linguistic behavior of toxic players and compare it with that of typical players in an online competition game. We also find empirical support describing how a player transitions from typical to toxic behavior. Our findings can be helpful to automatically detect and warn players who may become toxic and thus insulate potential victims from toxic playing in advance.

preprint2014arXiv

STFU NOOB! Predicting Crowdsourced Decisions on Toxic Behavior in Online Games

One problem facing players of competitive games is negative, or toxic, behavior. League of Legends, the largest eSport game, uses a crowdsourcing platform called the Tribunal to judge whether a reported toxic player should be punished or not. The Tribunal is a two stage system requiring reports from those players that directly observe toxic behavior, and human experts that review aggregated reports. While this system has successfully dealt with the vague nature of toxic behavior by majority rules based on many votes, it naturally requires tremendous cost, time, and human efforts. In this paper, we propose a supervised learning approach for predicting crowdsourced decisions on toxic behavior with large-scale labeled data collections; over 10 million user reports involved in 1.46 million toxic players and corresponding crowdsourced decisions. Our result shows good performance in detecting overwhelmingly majority cases and predicting crowdsourced decisions on them. We demonstrate good portability of our classifier across regions. Finally, we estimate the practical implications of our approach, potential cost savings and victim protection.

preprint2014arXiv

The power of indirect social ties

While direct social ties have been intensely studied in the context of computer-mediated social networks, indirect ties (e.g., friends of friends) have seen little attention. Yet in real life, we often rely on friends of our friends for recommendations (of good doctors, good schools, or good babysitters), for introduction to a new job opportunity, and for many other occasional needs. In this work we attempt to 1) quantify the strength of indirect social ties, 2) validate it, and 3) empirically demonstrate its usefulness for distributed applications on two examples. We quantify social strength of indirect ties using a(ny) measure of the strength of the direct ties that connect two people and the intuition provided by the sociology literature. We validate the proposed metric experimentally by comparing correlations with other direct social tie evaluators. We show via data-driven experiments that the proposed metric for social strength can be used successfully for social applications. Specifically, we show that it alleviates known problems in friend-to-friend storage systems by addressing two previously documented shortcomings: reduced set of storage candidates and data availability correlations. We also show that it can be used for predicting the effects of a social diffusion with an accuracy of up to 93.5%.

preprint2011arXiv

Cheaters in the Steam Community Gaming Social Network

Online gaming is a multi-billion dollar industry that entertains a large, global population. One unfortunate phenomenon, however, poisons the competition and the fun: cheating. The costs of cheating span from industry-supported expenditures to detect and limit cheating, to victims' monetary losses due to cyber crime. This paper studies cheaters in the Steam Community, an online social network built on top of the world's dominant digital game delivery platform. We collected information about more than 12 million gamers connected in a global social network, of which more than 700 thousand have their profiles flagged as cheaters. We also collected in-game interaction data of over 10 thousand players from a popular multiplayer gaming server. We show that cheaters are well embedded in the social and interaction networks: their network position is largely undistinguishable from that of fair players. We observe that the cheating behavior appears to spread through a social mechanism: the presence and the number of cheater friends of a fair player is correlated with the likelihood of her becoming a cheater in the future. Also, we observe that there is a social penalty involved with being labeled as a cheater: cheaters are likely to switch to more restrictive privacy settings once they are tagged and they lose more friends than fair players. Finally, we observe that the number of cheaters is not correlated with the geographical, real-world population density, or with the local popularity of the Steam Community. This analysis can ultimately inform the design of mechanisms to deal with anti-social behavior (e.g., spamming, automated collection of data) in generic online social networks.

Jeremy Blackburn

What is connected

Connect this record

See the researcher in context

Building this map preview

22 published item(s)

Feels Bad Man: Dissecting Automated Hateful Meme Detection Through the Lens of Facebook's Challenge

On Xing Tian and the Perseverance of Anti-China Sentiment Online

The Gospel According to Q: Understanding the QAnon Conspiracy from the Perspective of Canonical Information

Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots

"Go eat a bat, Chang!": On the Emergence of Sinophobic Behavior on Web Communities in the Face of COVID-19

"Is it a Qoincidence?": An Exploratory Study of QAnon on Voat

A Multi-Platform Analysis of Political News Discussion and Sharing on Web Communities

Dissecting the Meme Magic: Understanding Indicators of Virality in Image Memes

A First Look at Zoombombing

Raiders of the Lost Kek: 3.5 Years of Augmented 4chan Posts from the Politically Incorrect Board

The Pushshift Reddit Dataset

The Pushshift Telegram Dataset

A Quantitative Approach to Understanding Online Antisemitism

Characterizing the Use of Images in State-Sponsored Information Warfare Operations by Russian Trolls on Twitter

C3PO: Computation Congestion Control (PrOactive) - an algorithm for dynamic diffusion of ephemeral in-network services

Enabling Social Applications via Decentralized Social Data Management

Exploring Cyberbullying and Other Toxic Behavior in Team Competition Online Games

To HTTP/2, or Not To HTTP/2, That Is The Question

Linguistic Analysis of Toxic Behavior in an Online Video Game

STFU NOOB! Predicting Crowdsourced Decisions on Toxic Behavior in Online Games

The power of indirect social ties

Cheaters in the Steam Community Gaming Social Network