Source author record

David Holtz

David Holtz appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Applications, econ.EM, Human-Computer Interaction, Methodology, astro-ph.CO, econ.GN, Machine Learning, q-fin.EC, Social and Information Networks

Catalog footprint

What is connected

5 works

9 topics

4 close collaborators


Preprint, 2026, arXiv

Prompt Adaptation as a Dynamic Complement in Generative AI Systems

As generative AI systems rapidly improve, a key question emerges: how do users adapt to these changes, and when does such adaptation matter for realizing performance gains? Drawing on theories of dynamic capabilities and IT complements, we study prompt adaptation (how users adjust their inputs in response to evolving model behavior) using a common experimental design applied to two preregistered tasks with 3,750 total participants who submitted nearly 37,000 prompts. We show that the importance of prompt adaptation depends critically on task structure. In a task with fixed evaluation criteria and an unambiguous goal, user prompt adaptation accounts for roughly half of the performance gains from a model upgrade. In contrast, in an open-ended creative task where the space of acceptable outputs is effectively unbounded and quality is subjective, performance improvements are driven primarily by model capability; prompt adaptation plays a limited role. We further show that automated prompt rewriting cannot generally substitute for human adaptation: when aligned with task objectives, it can modestly improve performance, but when misaligned, it can actively undermine the gains from model improvements. Together, these findings position prompt adaptation as a dynamic complement whose importance depends on task structure and system design, and suggest that without it, a substantial share of the economic value created by advances in generative models may go unrealized.

Preprint, 2020, arXiv

Limiting Bias from Test-Control Interference in Online Marketplace Experiments

In an A/B test, the typical objective is to measure the total average treatment effect (TATE), which measures the difference between the average outcome if all users were treated and the average outcome if all users were untreated. However, a simple difference-in-means estimator will give a biased estimate of the TATE when outcomes of control units depend on the outcomes of treatment units, an issue we refer to as test-control interference. Using a simulation built on top of data from Airbnb, this paper considers the use of methods from the network interference literature for online marketplace experimentation. We model the marketplace as a network in which an edge exists between two sellers if their goods substitute for one another. We then simulate seller outcomes, specifically considering a "status quo" context and "treatment" context that forces all sellers to lower their prices. We use the same simulation framework to approximate TATE distributions produced by using blocked graph cluster randomization, exposure modeling, and the Hajek estimator for the difference in means. We find that while blocked graph cluster randomization reduces the bias of the naive difference-in-means estimator by as much as 62%, it also significantly increases the variance of the estimator. On the other hand, the use of more sophisticated estimators produces mixed results. While some provide (small) additional reductions in bias and small reductions in variance, others lead to increased bias and variance. Overall, our results suggest that experiment design and analysis techniques from the network experimentation literature are promising tools for reducing bias due to test-control interference in marketplace experiments.
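The bias described in this abstract can be illustrated with a very small toy example. The sketch below is not the paper's Airbnb-based simulation: the outcome model, the parameter names (baseline, direct, steal) and the four-seller pair graph are all assumptions made for illustration. It assumes each seller gains from being treated and loses demand in proportion to the share of its substitutes that are treated, then compares the naive difference-in-means estimate under unit-level randomization and under cluster randomization with the true TATE.

```python
from statistics import mean


def simulate_outcomes(treated, neighbors, baseline=10.0, direct=2.0, steal=1.0):
    """Toy outcome model (an assumption, not the paper's simulation).

    A seller gains `direct` when treated and loses `steal` times the share
    of its substitute sellers (graph neighbors) that are treated.
    """
    outcomes = []
    for i, z in enumerate(treated):
        nbrs = neighbors[i]
        share = sum(treated[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        outcomes.append(baseline + direct * z - steal * share)
    return outcomes


def difference_in_means(outcomes, treated):
    """Naive estimator: mean treated outcome minus mean control outcome."""
    treated_y = [y for y, z in zip(outcomes, treated) if z]
    control_y = [y for y, z in zip(outcomes, treated) if not z]
    return mean(treated_y) - mean(control_y)


# Hypothetical substitution graph: sellers 0-1 and 2-3 are substitutes.
neighbors = {0: [1], 1: [0], 2: [3], 3: [2]}

# True TATE: everyone treated versus no one treated.
tate = mean(simulate_outcomes([1, 1, 1, 1], neighbors)) - mean(
    simulate_outcomes([0, 0, 0, 0], neighbors)
)

# Unit-level assignment splits each pair of substitutes across arms.
unit_assignment = [1, 0, 1, 0]
naive_estimate = difference_in_means(
    simulate_outcomes(unit_assignment, neighbors), unit_assignment
)

# Cluster-level assignment keeps each pair of substitutes in the same arm.
cluster_assignment = [1, 1, 0, 0]
cluster_estimate = difference_in_means(
    simulate_outcomes(cluster_assignment, neighbors), cluster_assignment
)

print(tate, naive_estimate, cluster_estimate)
```

In this toy setup the unit-level design overstates the effect because treated sellers take demand from their control substitutes, while assigning whole clusters of substitutes together removes that contamination. With only two clusters the cluster design is exact here by construction; the abstract's finding that cluster randomization also increases variance is not captured by a deterministic example like this one.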

Preprint, 2020, arXiv

Reducing Interference Bias in Online Marketplace Pricing Experiments

Online marketplace designers frequently run A/B tests to measure the impact of proposed product changes. However, given that marketplaces are inherently connected, total average treatment effect estimates obtained through Bernoulli randomized experiments are often biased due to violations of the stable unit treatment value assumption. This can be particularly problematic for experiments that impact sellers' strategic choices, affect buyers' preferences over items in their consideration set, or change buyers' consideration sets altogether. In this work, we measure and reduce bias due to interference in online marketplace experiments by using observational data to create clusters of similar listings, and then using those clusters to conduct cluster-randomized field experiments. We provide a lower bound on the magnitude of bias due to interference by conducting a meta-experiment that randomizes over two experiment designs: one Bernoulli randomized, one cluster randomized. In both meta-experiment arms, treatment sellers are subject to a different platform fee policy than control sellers, resulting in different prices for buyers. By conducting a joint analysis of the two meta-experiment arms, we find a large and statistically significant difference between the total average treatment effect estimates obtained with the two designs, and estimate that 32.60% of the Bernoulli-randomized treatment effect estimate is due to interference bias. We also find weak evidence that the magnitude and/or direction of interference bias depends on the extent to which a marketplace is supply- or demand-constrained, and analyze a second meta-experiment to highlight the difficulty of detecting interference bias when treatment interventions require intention-to-treat analysis.
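One simple way to read a figure like the interference share reported in this abstract is as the fraction of the Bernoulli-design estimate that the cluster-randomized design does not reproduce. The sketch below assumes that reading; the function name and the input values are hypothetical, and the paper's own estimate comes from a joint analysis of the two arms, not from this formula. Because cluster randomization may remove only part of the interference, a share computed this way would be a lower bound, in line with the abstract.

```python
def interference_share(bernoulli_estimate, cluster_estimate):
    """Fraction of the Bernoulli-design estimate not reproduced by the cluster design.

    Assumes the cluster-randomized estimate is the less biased reference point.
    """
    if bernoulli_estimate == 0:
        raise ValueError("share is undefined when the Bernoulli estimate is zero")
    return (bernoulli_estimate - cluster_estimate) / bernoulli_estimate


# Made-up effect estimates, for illustration only (not the paper's data).
share = interference_share(bernoulli_estimate=4.0, cluster_estimate=3.0)
print(f"{share:.2%}")
```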

Preprint, 2020, arXiv

The Engagement-Diversity Connection: Evidence from a Field Experiment on Spotify

It remains unknown whether personalized recommendations increase or decrease the diversity of content people consume. We present results from a randomized field experiment on Spotify testing the effect of personalized recommendations on consumption diversity. In the experiment, both control and treatment users were given podcast recommendations, with the sole aim of increasing podcast consumption. Treatment users' recommendations were personalized based on their music listening history, whereas control users were recommended popular podcasts among users in their demographic group. We find that, on average, the treatment increased podcast streams by 28.90%. However, the treatment also decreased the average individual-level diversity of podcast streams by 11.51%, and increased the aggregate diversity of podcast streams by 5.96%, indicating that personalized recommendations have the potential to create patterns of consumption that are homogenous within and diverse across users, a pattern reflecting Balkanization. Our results provide evidence of an "engagement-diversity trade-off" when recommendations are optimized solely to drive consumption: while personalized recommendations increase user engagement, they also affect the diversity of consumed content. This shift in consumption diversity can affect user retention and lifetime value, and impact the optimal strategy for content producers. We also observe evidence that our treatment affected streams from sections of Spotify's app not directly affected by the experiment, suggesting that exposure to personalized recommendations can affect the content that users consume organically. We believe these findings highlight the need for academics and practitioners to continue investing in personalization methods that explicitly take into account the diversity of content recommended.

Preprint, 2010, arXiv

The Atacama Cosmology Telescope: Cosmology from Galaxy Clusters Detected via the Sunyaev-Zel'dovich Effect

We present constraints on cosmological parameters based on a sample of Sunyaev-Zel'dovich-selected galaxy clusters detected in a millimeter-wave survey by the Atacama Cosmology Telescope. The cluster sample used in this analysis consists of 9 optically-confirmed high-mass clusters comprising the high-significance end of the total cluster sample identified in 455 square degrees of sky surveyed during 2008 at 148 GHz. We focus on the most massive systems to reduce the degeneracy between unknown cluster astrophysics and cosmology derived from SZ surveys. We describe the scaling relation between cluster mass and SZ signal with a 4-parameter fit. Marginalizing over the values of the parameters in this fit with conservative priors gives sigma_8 = 0.851 +/- 0.115 and w = -1.14 +/- 0.35 for a spatially-flat wCDM cosmological model with WMAP 7-year priors on cosmological parameters. This gives a modest improvement in statistical uncertainty over WMAP 7-year constraints alone. Fixing the scaling relation between cluster mass and SZ signal to a fiducial relation obtained from numerical simulations and calibrated by X-ray observations, we find sigma_8 = 0.821 +/- 0.044 and w = -1.05 +/- 0.20. These results are consistent with constraints from WMAP 7 plus baryon acoustic oscillations plus type Ia supernovae which give sigma_8 = 0.802 +/- 0.038 and w = -0.98 +/- 0.053. A stacking analysis of the clusters in this sample compared to clusters simulated assuming the fiducial model also shows good agreement. These results suggest that, given the sample of clusters used here, both the astrophysics of massive clusters and the cosmological parameters derived from them are broadly consistent with current models.
