Researcher profile

Sayan Sinha

Sayan Sinha contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2026arXiv

AHA: Scalable Alternative History Analysis for Operational Timeseries Applications

Many operational systems collect high-dimensional timeseries data about users/systems on key performance metrics. For instance, ISPs, content distribution networks, and video delivery services collect quality of experience metrics for user sessions associated with metadata (e.g., location, device, ISP). Over such historical data, operators and data analysts often need to run retrospective analysis; e.g., analyze anomaly detection algorithms, experiment with different configurations for alerts, evaluate new algorithms, and so on. We refer to this class of workloads as alternative history analysis for operational datasets. We show that in such settings, traditional data processing solutions (e.g., data warehouses, sampling, sketching, big-data systems) either pose high operational costs or do not guarantee accurate replay. We design and implement a system, called AHA (Alternative History Analytics), that overcomes both challenges to provide cost efficiency and fidelity for high-dimensional data. The design of AHA is based on analytical and empirical insights about such workloads: 1) the decomposability of underlying statistics; 2) sparsity in terms of active number of subpopulations over attribute-value combinations; and 3) efficiency structure of aggregation operations in modern analytics databases. Using multiple real-world datasets and as well as case-studies on production pipelines at a large video analytics company, we show that AHA provides 100% accuracy for a broad range of downstream tasks and up to 85x lower total cost of ownership (i.e., compute + storage) compared to conventional methods.

preprint2021arXiv

RESPER: Computationally Modelling Resisting Strategies in Persuasive Conversations

Modelling persuasion strategies as predictors of task outcome has several real-world applications and has received considerable attention from the computational linguistics community. However, previous research has failed to account for the resisting strategies employed by an individual to foil such persuasion attempts. Grounded in prior literature in cognitive and social psychology, we propose a generalised framework for identifying resisting strategies in persuasive conversations. We instantiate our framework on two distinct datasets comprising persuasion and negotiation conversations. We also leverage a hierarchical sequence-labelling neural architecture to infer the aforementioned resisting strategies automatically. Our experiments reveal the asymmetry of power roles in non-collaborative goal-directed conversations and the benefits accrued from incorporating resisting strategies on the final conversation outcome. We also investigate the role of different resisting strategies on the conversation outcome and glean insights that corroborate with past findings. We also make the code and the dataset of this work publicly available at https://github.com/americast/resper.

preprint2020arXiv

Analysing the Extent of Misinformation in Cancer Related Tweets

Twitter has become one of the most sought after places to discuss a wide variety of topics, including medically relevant issues such as cancer. This helps spread awareness regarding the various causes, cures and prevention methods of cancer. However, no proper analysis has been performed, which discusses the validity of such claims. In this work, we aim to tackle the misinformation spread in such platforms. We collect and present a dataset regarding tweets which talk specifically about cancer and propose an attention-based deep learning model for automated detection of misinformation along with its spread. We then do a comparative analysis of the linguistic variation in the text corresponding to misinformation and truth. This analysis helps us gather relevant insights on various social aspects related to misinformed tweets.

preprint2020arXiv

NARMADA: Need and Available Resource Managing Assistant for Disasters and Adversities

Although a lot of research has been done on utilising Online Social Media during disasters, there exists no system for a specific task that is critical in a post-disaster scenario -- identifying resource-needs and resource-availabilities in the disaster-affected region, coupled with their subsequent matching. To this end, we present NARMADA, a semi-automated platform which leverages the crowd-sourced information from social media posts for assisting post-disaster relief coordination efforts. The system employs Natural Language Processing and Information Retrieval techniques for identifying resource-needs and resource-availabilities from microblogs, extracting resources from the posts, and also matching the needs to suitable availabilities. The system is thus capable of facilitating the judicious management of resources during post-disaster relief operations.

preprint2020arXiv

RAPPER: Ransomware Prevention via Performance Counters

Ransomware can produce direct and controllable economic loss, which makes it one of the most prominent threats in cyber security. As per the latest statistics, more than half of malwares reported in Q1 of 2017 are ransomwares and there is a potent threat of a novice cybercriminals accessing ransomware-as-a-service. The concept of public-key based data kidnapping and subsequent extortion was introduced in 1996. Since then, variants of ransomware emerged with different cryptosystems and larger key sizes, the underlying techniques remained same. Though there are works in literature which proposes a generic framework to detect the crypto ransomwares, we present a two step unsupervised detection tool which when suspects a process activity to be malicious, issues an alarm for further analysis to be carried in the second step and detects it with minimal traces. The two step detection framework- RAPPER uses Artificial Neural Network and Fast Fourier Transformation to develop a highly accurate, fast and reliable solution to ransomware detection using minimal trace points. We also introduce a special detection module for successful identification of disk encryption processes from potential ransomware operations, both having similar characteristics but with different objective. We provide a comprehensive solution to tackle almost all scenarios (standard benchmark, disk encryption and regular high computational processes) pertaining to the crypto ransomwares in light of software security.