Source author record

Sourav Medya

Sourav Medya appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Data Structures and Algorithms Social and Information Networks Artificial Intelligence Computational Engineering, Finance, and Science Databases Human-Computer Interaction Information Retrieval Neural and Evolutionary Computing q-fin.ST

Catalog footprint

What is connected

8works

10topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Debiasing Message Passing to Mitigate Popularity Bias in GNN-based Collaborative Filtering

Collaborative filtering (CF) models based on graph neural networks (GNNs) achieve strong performance in recommender systems by propagating user-item signals over interaction graphs. However, they are highly susceptible to popularity bias, since skewed interaction distributions and repeated message passing across high-order neighborhoods amplify the influence of popular items while suppressing long-tail ones. Existing debiasing approaches, including re-weighting objectives, regularization, causal methods, and post-processing, are less effective in GNN-based settings because they do not directly counteract bias propagated through the aggregation process, and recent in-aggregation weighting methods often rely on static heuristics or unstable embedding estimates. We propose Debiasing Popularity Amplification in Aggregation (DPAA), a popularity debiasing framework for GNN-based CF that integrates adaptive, embedding-aware interaction weighting and layer-wise weighting directly into message passing. DPAA assigns interaction-level weights from a representation-aware popularity signal, stabilized by a smooth transition from pre-trained to evolving model embeddings during training. It further introduces a layer-wise weighting that amplifies higher-order neighborhoods, surfacing long-range interactions with diverse and underexposed items. Experiments on real-world and semi-synthetic datasets show that DPAA outperforms state-of-the-art popularity-bias correction methods for GNN-based CF.

preprint2026arXiv

Empowering Older Adults in Digital Technology Use with Foundation Models

While high-quality technology support can assist older adults in using digital applications, many struggle to articulate their issues due to unfamiliarity with technical terminology and age-related cognitive changes. This study examines these communication challenges and explores AI-based approaches to mitigate them. We conducted a diary study with English-speaking, community-dwelling older adults to collect asynchronous, technology-related queries and used reflexive thematic analysis to identify communication barriers. To address these barriers, we evaluated how foundation models can paraphrase older adults' queries to improve solution accuracy. Two controlled experiments followed: one with younger adults evaluating AI-rephrased queries and another with older adults evaluating AI-generated solutions. We also developed a pipeline using large language models to generate the first synthetic dataset of how older adults request tech support (OATS). We identified four key communication challenges: verbosity, incompleteness, over-specification, and under-specification. Our prompt-chaining approach using the large language model, GPT-4o, elicited contextual details, paraphrased the original query, and generated a solution. AI-rephrased queries significantly improved solution accuracy (69% vs. 46%) and Google search results (69% vs. 35%). Younger adults better understood AI-rephrased queries (93.7% vs. 65.8%) and reported greater confidence and ease. Older adults reported high perceived ability to answer contextual questions (89.8%) and follow solutions (94.7%), with high confidence and ease. OATS demonstrated strong fidelity and face validity. This work shows how foundation models can enhance technology support for older adults by addressing age-related communication barriers. The OATS dataset offers a scalable resource for developing equitable AI systems that better serve aging populations.

preprint2026arXiv

GRAPHLCP: Structure-Aware Localized Conformal Prediction on Graphs

Conformal prediction (CP) provides a distribution-free approach to uncertainty quantification with finite-sample guarantees. However, applying CP to graph neural networks (GNNs) remains challenging as the combinatorial nature of graphs often leads to insufficiently certain predictions and indiscriminative embeddings. Existing methods primarily rely on embedding-space proximity for localization, which can be unreliable for graphs and yield inefficient prediction sets. We propose GRAPHLCP, a proximity-based localized CP framework that explicitly incorporates graph topology and inter-node dependencies into localization and weighting. Our approach introduces a feature-aware densification step to mitigate locality bias in sparse graphs, followed by a Personalized PageRank-based kernel computation to model structural proximity. This enables topology-dependent anchor sampling and calibration weighting that captures both local and long-range dependencies. Extensive experiments on several regression and classification datasets demonstrate that GRAPHLCP guarantees marginal coverage with finite samples while efficiently attaining favorable test conditional coverage across various conditioning scenarios.

preprint2024arXiv

COMBHelper: A Neural Approach to Reduce Search Space for Graph Combinatorial Problems

Combinatorial Optimization (CO) problems over graphs appear routinely in many applications such as in optimizing traffic, viral marketing in social networks, and matching for job allocation. Due to their combinatorial nature, these problems are often NP-hard. Existing approximation algorithms and heuristics rely on the search space to find the solutions and become time-consuming when this space is large. In this paper, we design a neural method called COMBHelper to reduce this space and thus improve the efficiency of the traditional CO algorithms based on node selection. Specifically, it employs a Graph Neural Network (GNN) to identify promising nodes for the solution set. This pruned search space is then fed to the traditional CO algorithms. COMBHelper also uses a Knowledge Distillation (KD) module and a problem-specific boosting module to bring further efficiency and efficacy. Our extensive experiments show that the traditional CO algorithms with COMBHelper are at least 2 times faster than their original versions.

preprint2022arXiv

An Exploratory Study of Stock Price Movements from Earnings Calls

Financial market analysis has focused primarily on extracting signals from accounting, stock price, and other numerical hard data reported in P&L statements or earnings per share reports. Yet, it is well-known that the decision-makers routinely use soft text-based documents that interpret the hard data they narrate. Recent advances in computational methods for analyzing unstructured and soft text-based data at scale offer possibilities for understanding financial market behavior that could improve investments and market equity. A critical and ubiquitous form of soft data are earnings calls. Earnings calls are periodic (often quarterly) statements usually by CEOs who attempt to influence investors' expectations of a company's past and future performance. Here, we study the statistical relationship between earnings calls, company sales, stock performance, and analysts' recommendations. Our study covers a decade of observations with approximately 100,000 transcripts of earnings calls from 6,300 public companies from January 2010 to December 2019. In this study, we report three novel findings. First, the buy, sell and hold recommendations from professional analysts made prior to the earnings have low correlation with stock price movements after the earnings call. Second, using our graph neural network based method that processes the semantic features of earnings calls, we reliably and accurately predict stock price movements in five major areas of the economy. Third, the semantic features of transcripts are more predictive of stock price movements than sales and earnings per share, i.e., traditional hard data in most of the cases.

preprint2020arXiv

K-Core Minimization: A Game Theoretic Approach

K-cores are maximal induced subgraphs where all vertices have degree at least k. These dense patterns have applications in community detection, network visualization and protein function prediction. However, k-cores can be quite unstable to network modifications, which motivates the question: How resilient is the k-core structure of a network, such as the Web or Facebook, to edge deletions? We investigate this question from an algorithmic perspective. More specifically, we study the problem of computing a small set of edges for which the removal minimizes the $k$-core structure of a network. This paper provides a comprehensive characterization of the hardness of the k-core minimization problem (KCM), including innaproximability and fixed-parameter intractability. Motivated by such a challenge in terms of algorithm design, we propose a novel algorithm inspired by Shapley value -- a cooperative game-theoretic concept -- that is able to leverage the strong interdependencies in the effects of edge removals in the search space. As computing Shapley values is also NP-hard, we efficiently approximate them using a randomized algorithm with probabilistic guarantees. Our experiments, using several real datasets, show that the proposed algorithm outperforms competing solutions in terms of k-core minimization while being able to handle large graphs. Moreover, we illustrate how KCM can be applied in the analysis of the k-core resilience of networks.

preprint2020arXiv

Manipulating Node Similarity Measures in Networks

Node similarity measures quantify how similar a pair of nodes are in a network. These similarity measures turn out to be an important fundamental tool for many real world applications such as link prediction in networks, recommender systems etc. An important class of similarity measures are local similarity measures. Two nodes are considered similar under local similarity measures if they have large overlap between their neighboring set of nodes. Manipulating node similarity measures via removing edges is an important problem. This type of manipulation, for example, hinders effectiveness of link prediction in terrorists networks. Fortunately, all the popular computational problems formulated around manipulating similarity measures turn out to be NP-hard. We, in this paper, provide fine grained complexity results of these problems through the lens of parameterized complexity. In particular, we show that some of these problems are fixed parameter tractable (FPT) with respect to various natural parameters whereas other problems remain intractable W[1]-hard and W[2]-hard in particular). Finally we show the effectiveness of our proposed FPT algorithms on real world datasets as well as synthetic networks generated using Barabasi-Albert and Erdos-Renyi models.

preprint2016arXiv

Towards Scalable Network Delay Minimization

Reduction of end-to-end network delays is an optimization task with applications in multiple domains. Low delays enable improved information flow in social networks, quick spread of ideas in collaboration networks, low travel times for vehicles on road networks and increased rate of packets in the case of communication networks. Delay reduction can be achieved by both improving the propagation capabilities of individual nodes and adding additional edges in the network. One of the main challenges in such design problems is that the effects of local changes are not independent, and as a consequence, there is a combinatorial search-space of possible improvements. Thus, minimizing the cumulative propagation delay requires novel scalable and data-driven approaches. In this paper, we consider the problem of network delay minimization via node upgrades. Although the problem is NP-hard, we show that probabilistic approximation for a restricted version can be obtained. We design scalable and high-quality techniques for the general setting based on sampling and targeted to different models of delay distribution. Our methods scale almost linearly with the graph size and consistently outperform competitors in quality.

Sourav Medya

What is connected

Connect this record

See the researcher in context

Building this map preview

8 published item(s)

Debiasing Message Passing to Mitigate Popularity Bias in GNN-based Collaborative Filtering

Empowering Older Adults in Digital Technology Use with Foundation Models

GRAPHLCP: Structure-Aware Localized Conformal Prediction on Graphs

COMBHelper: A Neural Approach to Reduce Search Space for Graph Combinatorial Problems

An Exploratory Study of Stock Price Movements from Earnings Calls

K-Core Minimization: A Game Theoretic Approach

Manipulating Node Similarity Measures in Networks

Towards Scalable Network Delay Minimization