Researcher profile

Deeksha Sinha

Deeksha Sinha contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2026arXiv

Estimate Level Adjustment For Inference With Proxies Under Random Distribution Shifts

In many scientific domains, including experimentation, researchers rely on measurements of proxy outcomes to achieve faster and more frequent reads, especially when the primary outcome of interest is challenging to measure directly. While proxies offer a more readily accessible observation for inference, the ultimate goal is to draw statistical inferences about the primary outcome parameter and proxy data are typically imperfect in some ways. To correct for these imperfections, current statistical inference methods often depend on strict identifying assumptions (such as surrogacy, covariate/label shift, or missingness assumptions). These assumptions can be difficult to validate and may be violated by various additional sources of distribution shift, potentially leading to biased parameter estimates and miscalibrated uncertainty quantification. We introduce an estimate-level framework, inspired by domain adaptation techniques, to empirically calibrate proxy-based inference. This framework models the proxy-primary metric discrepancy as a random effect at the parameter level, estimating its distribution from aggregated historical observations across past domains (e.g., experiments, time periods, or distinct segments). This method avoids the requirement for retaining individual-level response data. Additionally, this adjustment can be layered on top of existing proxy-correction methods (such as prediction-powered inference or importance weighting) to account for additional biases not addressed by those corrections. To manage uncertainty when the number of historical domains is limited, we provide both a method-of-moments estimator and a domain bootstrap procedure. We further validate this approach using publicly available datasets and real-world experiments.

preprint2021arXiv

Optimizing Revenue while showing Relevant Assortments at Scale

Scalable real-time assortment optimization has become essential in e-commerce operations due to the need for personalization and the availability of a large variety of items. While this can be done when there are simplistic assortment choices to be made, the optimization process becomes difficult when imposing constraints on the collection of relevant assortments based on insights by store-managers and historically well-performing assortments. We design fast and flexible algorithms based on variations of binary search that find the (approximately) optimal assortment in this difficult regime. In particular, we revisit the problem of large-scale assortment optimization under the multinomial logit choice model without any assumptions on the structure of the feasible assortments. We speed up the comparison steps using advances in similarity search in the field of information retrieval/machine learning. For an arbitrary collection of assortments, our algorithms can find a solution in time that is sub-linear in the number of assortments, and for the simpler case of cardinality constraints - linear in the number of items (existing methods are quadratic or worse). Empirical validations using a real world dataset (in addition to experiments using semi-synthetic data based on the Billion Prices dataset and several retail transaction datasets) show that our algorithms are competitive even when the number of items is $\sim 10^5$ ($10\times$ larger instances than previously studied).