Source author record

Abhilasha Ravichander

Abhilasha Ravichander appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Topics: Computation and Language · Artificial Intelligence · cs.CY · Human-Computer Interaction

Catalog footprint

What is connected

4 works

4 topics

4 close collaborators


preprint · 2026 · arXiv

To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling

Agentic AI architectures augment LLMs with external tools, unlocking strong capabilities. However, tool use is not always beneficial; some calls may be redundant or even harmful. Effective tool use therefore hinges on a core LLM decision: whether or not to call a tool when performing a task. This decision is particularly challenging for web search tools, where the benefits of external information depend on the model's internal knowledge and its ability to integrate potentially noisy tool responses. We introduce a principled framework inspired by decision-making theory to evaluate web search tool-use decisions along three key factors: necessity, utility, and affordability. Our analysis combines two complementary lenses: a normative perspective that infers true need and utility from an optimal allocation of tool calls, and a descriptive perspective that infers the model's self-perceived need and utility from its observed behavior. We find that models' perceived need and utility of tool calls are often misaligned with their true need and utility. Building on this framework, we train lightweight estimators of need and utility based on models' hidden states. Our estimators enable simple controllers that can improve decision quality and lead to stronger task performance than the self-perceived setup across three tasks and six models.
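To make the idea of a "simple controller" concrete, here is a minimal sketch of a call/no-call rule built from the three factors the abstract names. It is an illustration under stated assumptions, not the paper's method: the `ToolCallEstimate` type, the `should_call` function, the probabilistic reading of need and utility, and the threshold rule are all hypothetical, and the abstract does not say how the estimators' outputs are combined.

```python
from dataclasses import dataclass


@dataclass
class ToolCallEstimate:
    """Hypothetical per-query estimates (names and semantics are assumptions)."""
    need: float     # assumed: probability the model answers incorrectly WITHOUT the tool
    utility: float  # assumed: probability the model answers correctly WITH the tool
    cost: float     # assumed: cost of one call, in units of "value of a correct answer"


def should_call(est: ToolCallEstimate) -> bool:
    """Call the tool only if the expected accuracy gain exceeds the call's cost."""
    p_correct_without = 1.0 - est.need
    expected_gain = est.utility - p_correct_without
    return expected_gain > est.cost


# High need, useful tool, cheap call: calling is worthwhile.
print(should_call(ToolCallEstimate(need=0.9, utility=0.8, cost=0.1)))

# Low need: the model likely knows the answer, so the call is skipped.
print(should_call(ToolCallEstimate(need=0.1, utility=0.85, cost=0.05)))
```

In this sketch, necessity and utility enter through the expected gain and affordability through the cost threshold; a real controller would obtain `need` and `utility` from trained estimators rather than fixed numbers.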

preprint · 2022 · arXiv

Exploring and Improving the Accessibility of Data Privacy-related Information for People Who Are Blind or Low-vision

We present a study of privacy attitudes and behaviors of people who are blind or low vision. Our study involved in-depth interviews with 21 US participants. The study explores their risk perceptions and also whether and how they go about obtaining information about the data practices of digital technologies with which they interact. One objective of the study is to better understand this user group's needs for more accessible privacy tools. We also share some reflections on the challenge of recruiting an inclusive sample of participants from an already underrepresented user group in computing and how we were able to overcome this challenge.

preprint · 2021 · arXiv

NoiseQA: Challenge Set Evaluation for User-Centric Question Answering

When Question-Answering (QA) systems are deployed in the real world, users query them through a variety of interfaces, such as speaking to voice assistants, typing questions into a search engine, or even translating questions to languages supported by the QA system. While there has been significant community attention devoted to identifying correct answers in passages assuming a perfectly formed question, we show that components in the pipeline that precede an answering engine can introduce varied and considerable sources of error, and performance can degrade substantially based on these upstream noise sources even for powerful pre-trained QA models. We conclude that there is substantial room for progress before QA systems can be effectively deployed, highlight the need for QA evaluation to expand to consider real-world use, and hope that our findings will spur greater community interest in the issues that arise when our systems actually need to be of utility to humans.

preprint · 2021 · arXiv

Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance?

Although neural models have achieved impressive results on several NLP benchmarks, little is understood about the mechanisms they use to perform language tasks. Thus, much recent attention has been devoted to analyzing the sentence representations learned by neural encoders, through the lens of 'probing' tasks. However, to what extent was the information encoded in sentence representations, as discovered through a probe, actually used by the model to perform its task? In this work, we examine this probing paradigm through a case study in Natural Language Inference, showing that models can learn to encode linguistic properties even if they are not needed for the task on which the model was trained. We further identify that pretrained word embeddings play a considerable role in encoding these properties rather than the training task itself, highlighting the importance of careful controls when designing probing experiments. Finally, through a set of controlled synthetic tasks, we demonstrate that models can encode these properties considerably above chance level even when distributed in the data as random noise, calling into question the interpretation of absolute claims on probing tasks.
