Researcher profile

Ayush Chauhan

Ayush Chauhan contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified | Verification L1 | Unclaimed author
2 works
0 followers
3 topics
4 close collaborators

Published work

2 published items

Preprint, 2025, arXiv

Shielded RecRL: Explanation Generation for Recommender Systems without Ranking Degradation

We introduce Shielded RecRL, a reinforcement learning approach to generate personalized explanations for recommender systems without sacrificing the system's original ranking performance. Unlike prior RLHF-based recommender methods that directly optimize item rankings, our two-tower architecture keeps the recommender's ranking model intact while a language model learns to produce helpful explanations. We design a composite reward signal combining explanation length, content relevance, and coherence, and apply proximal policy optimization (PPO) with a KL-divergence constraint to fine-tune a large language model with only 0.4% of its parameters trainable via LoRA adapters. In experiments on an Amazon Books dataset (approximately 50K interactions in the fantasy and romance genres), Shielded RecRL improved the relative click-through rate (CTR) by 22.5% (1.225x over baseline) while keeping the recommender's item-ranking behavior virtually unchanged. An extensive ablation study confirms that our gradient shielding strategy and reward design effectively balance explanation quality and policy drift. Our results demonstrate that Shielded RecRL enhances user-facing aspects of recommendations through rich, personalized explanations without degrading core recommendation accuracy.
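
The reward design described in this abstract can be illustrated with a minimal sketch. It assumes the three components (length, relevance, coherence) are combined as a weighted sum and that the KL-divergence constraint is applied as a penalty subtracted from the reward, which is one common way to use it with PPO. The weights, the target length, and the names `RewardWeights`, `length_score`, and `shaped_reward` are hypothetical; the abstract does not give the actual formula or values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the abstract does not state the real ones.
    length: float = 0.2
    relevance: float = 0.5
    coherence: float = 0.3
    kl_coef: float = 0.1  # strength of the KL penalty (assumed form)


def length_score(n_tokens: int, target: int = 40) -> float:
    """Score in [0, 1] that peaks when the explanation has `target` tokens."""
    if target <= 0:
        raise ValueError("target must be positive")
    return max(0.0, 1.0 - abs(n_tokens - target) / target)


def shaped_reward(
    n_tokens: int,
    relevance: float,
    coherence: float,
    kl_to_reference: float,
    w: RewardWeights = RewardWeights(),
) -> float:
    """Weighted composite reward minus a KL penalty against the reference model.

    `relevance` and `coherence` are assumed to be scores in [0, 1] produced
    elsewhere; `kl_to_reference` is the estimated KL divergence between the
    fine-tuned policy and the frozen reference language model.
    """
    composite = (
        w.length * length_score(n_tokens)
        + w.relevance * relevance
        + w.coherence * coherence
    )
    return composite - w.kl_coef * kl_to_reference
```

In this sketch a larger `kl_coef` keeps the fine-tuned model closer to the reference model at the cost of a weaker push toward the composite reward, which is the trade-off between explanation quality and policy drift that the abstract refers to.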

Preprint, 2020, arXiv

Dis-entangling Mixture of Interventions on a Causal Bayesian Network Using Aggregate Observations

We study the problem of separating a mixture of distributions, all of which come from interventions on a known causal Bayesian network. Given oracle access to marginals of all distributions resulting from interventions on the network, and estimates of marginals from the mixture distribution, we want to recover the mixing proportions of different mixture components. We show that in the worst case, mixing proportions cannot be identified using marginals only. If exact marginals of the mixture distribution were known, under a simple assumption of excluding a few distributions from the mixture, we show that the mixing proportions become identifiable. Our identifiability proof is constructive and gives an efficient algorithm recovering the mixing proportions exactly. When exact marginals are not available, we design an optimization framework to estimate the mixing proportions. Our problem is motivated by a real-world scenario of an e-commerce business, where multiple interventions occur at a given time, leading to deviations in expected metrics. We conduct experiments on the well-known, publicly available ALARM network and on a proprietary dataset from a large e-commerce company, validating the performance of our method.
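
The estimation problem in this abstract can be sketched under a simple assumption: each marginal of the mixture is the proportion-weighted sum of the corresponding component marginals, so the proportions solve a least-squares problem constrained to the probability simplex. The code below is a generic projected-gradient solver for that problem, not the paper's constructive algorithm or its optimization framework; the function names and the example numbers are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_i >= 0, sum(p) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def estimate_mixing_proportions(component_marginals, mixture_marginals, steps=5000):
    """Estimate mixing proportions by projected gradient descent.

    component_marginals[i][j]: j-th marginal probability under component i.
    mixture_marginals[j]: estimated j-th marginal probability of the mixture.
    Minimizes 0.5 * sum_j (sum_i pi_i * component_marginals[i][j]
    - mixture_marginals[j]) ** 2 subject to pi lying on the simplex.
    """
    k = len(component_marginals)
    m = len(mixture_marginals)
    # The squared Frobenius norm bounds the gradient's Lipschitz constant,
    # so its reciprocal is a safe step size.
    frob_sq = sum(x * x for row in component_marginals for x in row)
    step = 1.0 / frob_sq
    pi = [1.0 / k] * k  # start from the uniform mixture
    for _ in range(steps):
        residual = [
            sum(pi[i] * component_marginals[i][j] for i in range(k))
            - mixture_marginals[j]
            for j in range(m)
        ]
        grad = [
            sum(component_marginals[i][j] * residual[j] for j in range(m))
            for i in range(k)
        ]
        pi = project_to_simplex([p - step * g for p, g in zip(pi, grad)])
    return pi


# Hypothetical example: three components described by four marginals each.
components = [
    [0.9, 0.1, 0.5, 0.2],
    [0.2, 0.8, 0.5, 0.6],
    [0.4, 0.4, 0.1, 0.9],
]
# Exact mixture marginals for proportions (0.5, 0.3, 0.2).
mixture = [0.59, 0.37, 0.42, 0.46]
estimate = estimate_mixing_proportions(components, mixture)
```

The example is recoverable because the three component vectors are linearly independent. When they are not, several proportion vectors fit the marginals equally well and this sketch returns only one of them, which is consistent with the worst-case non-identifiability the abstract mentions.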