Researcher profile

Mehdi Ben Ayed

Mehdi Ben Ayed contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 11 (Unverified), Verification L1, Unclaimed author
1 work
0 followers
2 topics
4 close collaborators

Published work

1 published item

Preprint, 2026, arXiv

A Production-Ready RL Framework for Personalized Utility Tuning with Pareto Sweeping in Pinterest Recommender Systems

Large-scale recommenders encode multi-objective trade-offs by combining multiple predicted outcomes into a single utility score. Although this utility layer can be updated independently of the ranker, weight tuning remains largely manual, globally applied, slow to adapt to changing environments and business needs, and hard to govern as priorities shift. We propose PRL-PUTS, a Production-ready, ranker-independent RL framework for Personalized Utility-weight Tuning with Pareto Sweeping. We cast utility tuning as a one-step, value-based RL problem: given the request context, an agent selects a utility-weight vector that re-weights ranker predictions to maximize request-level engagement rewards. To visualize performance across the trade-off spectrum and allow decision makers to update the deployed operating policy instantly, we adopt inference-time Pareto frontier sweeping via a scalarization parameter, producing a family of policies and an empirical Pareto frontier that serves as a governance artifact for operating policy selection. PRL-PUTS runs in parallel with ranking inference without adding serving latency. We validate PRL-PUTS with offline analysis using unbiased exploration logs and with online experiments on Pinterest Homefeed, where PRL-PUTS showed significant increases in engagement compared to the baseline, such as a +0.13% increase in successful sessions, a core metric for user engagement.
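The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: a one-step, value-based choice of a utility-weight vector, with a scalarization parameter swept at inference time. Everything here is an assumption made for illustration, including the two-objective setup, the discrete candidate set `CANDIDATE_WEIGHTS`, the linear scalarization, and the function names. In particular, the per-candidate value estimates `q_values` are passed in directly, whereas a real system would presumably compute them from the request context with a learned model.

```python
import numpy as np

# Hypothetical candidate utility-weight vectors (one weight per predicted outcome).
CANDIDATE_WEIGHTS = np.array([
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
])


def select_weights(q_values, lam):
    """Pick the candidate weight vector with the highest scalarized value.

    q_values: array of shape (n_candidates, 2) holding the estimated one-step
              reward of each candidate on two objectives for one request.
    lam:      scalarization parameter in [0, 1]; 0 uses objective 0 only,
              1 uses objective 1 only.
    """
    scalarized = (1.0 - lam) * q_values[:, 0] + lam * q_values[:, 1]
    return CANDIDATE_WEIGHTS[int(np.argmax(scalarized))]


def utility_scores(predictions, weights):
    """Combine per-item ranker predictions (n_items, 2) into one score per item."""
    return predictions @ weights


def sweep(q_values, lams):
    """Sweep the scalarization parameter: one greedy choice per value, no retraining."""
    return [(lam, select_weights(q_values, lam)) for lam in lams]
```

Under these assumptions, changing `lam` changes the selected weights without touching the value estimates or the ranker predictions, which is what makes an inference-time sweep possible. Evaluating the resulting family of policies on logged or live metrics would then give the empirical frontier from which an operating point is chosen.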