Researcher profile

Sahil Verma

Sahil Verma contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
6works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

6 published item(s)

preprint2026arXiv

How Many Images Does It Take? Estimating Imitation Thresholds in Text-to-Image Models

Text-to-image models are trained using large datasets of image-text pairs collected from the internet. These datasets often include copyrighted and private images. Training models on such datasets enables them to generate images that might violate copyright laws and individual privacy. This phenomenon is termed imitation -- generation of images with content that has recognizable similarity to its training images. In this work we estimate the point at which a model was trained on enough instances of a concept to be able to imitate it -- the imitation threshold. We posit this question as a new problem and propose an efficient approach that estimates the imitation threshold without incurring the colossal cost of training these models from scratch. We experiment with two domains -- human faces and art styles, and evaluate four text-to-image models that were trained on three pretraining datasets. We estimate the imitation threshold of these models to be in the range of 200-700 images, depending on the domain and the model. The imitation threshold provides an empirical basis for copyright violation claims and acts as a guiding principle for text-to-image model developers that aim to comply with copyright and privacy laws. Website: https://how-many-van-goghs-does-it-take.github.io/. Code: https://github.com/vsahil/MIMETIC-2.

preprint2022arXiv

BASED-XAI: Breaking Ablation Studies Down for Explainable Artificial Intelligence

Explainable artificial intelligence (XAI) methods lack ground truth. In its place, method developers have relied on axioms to determine desirable properties for their explanations' behavior. For high stakes uses of machine learning that require explainability, it is not sufficient to rely on axioms as the implementation, or its usage, can fail to live up to the ideal. As a result, there exists active research on validating the performance of XAI methods. The need for validation is especially magnified in domains with a reliance on XAI. A procedure frequently used to assess their utility, and to some extent their fidelity, is an ablation study. By perturbing the input variables in rank order of importance, the goal is to assess the sensitivity of the model's performance. Perturbing important variables should correlate with larger decreases in measures of model capability than perturbing less important features. While the intent is clear, the actual implementation details have not been studied rigorously for tabular data. Using five datasets, three XAI methods, four baselines, and three perturbations, we aim to show 1) how varying perturbations and adding simple guardrails can help to avoid potentially flawed conclusions, 2) how treatment of categorical variables is an important consideration in both post-hoc explainability and ablation studies, and 3) how to identify useful baselines for XAI methods and viable perturbations for ablation studies.

preprint2022arXiv

Debug-Localize-Repair: A Symbiotic Construction for Heap Manipulations

We present Wolverine2, an integrated Debug-Localize-Repair environment for heap manipulating programs. Wolverine2 provides an interactive debugging environment: while concretely executing a program via on an interactive shell supporting common debugging facilities, Wolverine2 displays the abstract program states (as box-and-arrow diagrams) as a visual aid to the programmer, packages a novel, proof-directed repair algorithm to quickly synthesize the repair patches and a new bug localization algorithm to reduce the search space of repairs. Wolverine2 supports "hot-patching" of the generated patches to provide a seamless debugging environment, and also facilitates new debug-localize-repair possibilities: \textit{specification refinement} and \textit{checkpoint-based hopping}. We evaluate Wolverine2 on 6400 buggy programs (generated using automated fault injection) on a variety of data-structures like singly, doubly, and circular linked lists, AVL trees, Red-Black trees, Splay Trees and Binary Search Trees; Wolverine2 could repair all the buggy instances within realistic programmer wait-time (less than 5 sec in most cases). Wolverine2 could also repair more than 80\% of the 247 (buggy) student submissions where a reasonable attempt was made.

preprint2021arXiv

Removing biased data to improve fairness and accuracy

Machine learning systems are often trained using data collected from historical decisions. If past decisions were biased, then automated systems that learn from historical data will also be biased. We propose a black-box approach to identify and remove biased training data. Machine learning models trained on such debiased data (a subset of the original training data) have low individual discrimination, often 0%. These models also have greater accuracy and lower statistical disparity than models trained on the full historical data. We evaluated our methodology in experiments using 6 real-world datasets. Our approach outperformed seven previous approaches in terms of individual discrimination and accuracy.

preprint2020arXiv

Benchmarking Symbolic Execution Using Constraint Problems -- Initial Results

Symbolic execution is a powerful technique for bug finding and program testing. It is successful in finding bugs in real-world code. The core reasoning techniques use constraint solving, path exploration, and search, which are also the same techniques used in solving combinatorial problems, e.g., finite-domain constraint satisfaction problems (CSPs). We propose CSP instances as more challenging benchmarks to evaluate the effectiveness of the core techniques in symbolic execution. We transform CSP benchmarks into C programs suitable for testing the reasoning capabilities of symbolic execution tools. From a single CSP P, we transform P depending on transformation choice into different C programs. Preliminary testing with the KLEE, Tracer-X, and LLBMC tools show substantial runtime differences from transformation and solver choice. Our C benchmarks are effective in showing the limitations of existing symbolic execution tools. The motivation for this work is we believe that benchmarks of this form can spur the development and engineering of improved core reasoning in symbolic execution engines.

preprint2020arXiv

Facets of Fairness in Search and Recommendation

Several recent works have highlighted how search and recommender systems exhibit bias along different dimensions. Counteracting this bias and bringing a certain amount of fairness in search is crucial to not only creating a more balanced environment that considers relevance and diversity but also providing a more sustainable way forward for both content consumers and content producers. This short paper examines some of the recent works to define relevance, diversity, and related concepts. Then, it focuses on explaining the emerging concept of fairness in various recommendation settings. In doing so, this paper presents comparisons and highlights contracts among various measures, and gaps in our conceptual and evaluative frameworks.