Source author record

Kinjal Basu

Kinjal Basu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, Artificial Intelligence, Computation, Applications, math.NA, Methodology, Computation and Language, Logic in Computer Science, math.ST, Statistics Theory, Computer Vision, cs.CY, math.CO, math.OC, Numerical Analysis

Catalog footprint

What is connected

20 works

15 topics

4 close collaborators

preprint 2026 arXiv

ToolRM: Outcome Reward Models for Tool-Calling Large Language Models

As large language models (LLMs) increasingly interact with external tools, reward modeling for tool use has emerged as a critical yet underexplored area of research. Existing reward models, trained primarily on natural language outputs, struggle to evaluate tool-based reasoning and execution. To quantify this gap, we introduce FC-RewardBench, the first benchmark to systematically evaluate reward models in tool-calling scenarios. Our analysis shows that current reward models frequently miss key signals of effective tool use, highlighting the need for domain-specific modeling. We address this by proposing a training framework for outcome reward models using data synthesized from permissively licensed, open-weight LLMs. We introduce ToolRM - a suite of reward models for tool-use ranging from 1.7B to 14B parameters. Across diverse settings, these models consistently outperform general-purpose baselines. Notably, they achieve up to a 25% improvement with Best-of-N sampling, while also improving robustness to input noise, enabling effective data filtering, and supporting RL-training of policy models.
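Best-of-N sampling, mentioned in the abstract above, scores N candidate outputs with a reward model and keeps the highest-scoring one. The sketch below illustrates only that selection step; `toy_reward` and the example tool calls are hypothetical stand-ins and are not part of ToolRM.

```python
from typing import Callable, Sequence


def best_of_n(candidates: Sequence[str], reward: Callable[[str], float]) -> str:
    """Return the candidate with the highest reward score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=reward)


def toy_reward(call: str) -> float:
    # Hypothetical scorer (assumption, not a real reward model): prefers
    # calls that name the expected tool and pass the expected argument.
    score = 0.0
    if call.startswith("get_weather("):
        score += 1.0
    if "city=" in call:
        score += 1.0
    return score


# Hypothetical candidate tool calls sampled from a policy model.
candidates = [
    'search(query="weather")',
    "get_weather()",
    'get_weather(city="Paris")',
]
best = best_of_n(candidates, toy_reward)
```

In practice the scorer would be a trained outcome reward model, and the quality of the selected output depends entirely on how well that model ranks the candidates.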

preprint 2022 arXiv

Achieving Fairness via Post-Processing in Web-Scale Recommender Systems

Building fair recommender systems is a challenging and crucial area of study due to its immense impact on society. We extended the definitions of two commonly accepted notions of fairness to recommender systems, namely equality of opportunity and equalized odds. These fairness measures ensure that equally "qualified" (or "unqualified") candidates are treated equally regardless of their protected attribute status (such as gender or race). We propose scalable methods for achieving equality of opportunity and equalized odds in rankings in the presence of position bias, which commonly plagues data generated from recommender systems. Our algorithms are model agnostic in the sense that they depend only on the final scores provided by a model, making them easily applicable to virtually all web-scale recommender systems. We conduct extensive simulations as well as real-world experiments to show the efficacy of our approach.

preprint 2022 arXiv

Efficient Vertex-Oriented Polytopic Projection for Web-scale Applications

We consider applications involving a large set of instances of projecting points to polytopes. We develop an intuition guided by theoretical and empirical analysis to show that when these instances follow certain structures, a large majority of the projections lie on vertices of the polytopes. To do these projections efficiently we derive a vertex-oriented incremental algorithm to project a point onto any arbitrary polytope, as well as give specific algorithms to cater to simplex projection and polytopes where the unit box is cut by planes. Such settings are especially useful in web-scale applications such as optimal matching or allocation problems. Several such problems in internet marketplaces (e-commerce, ride-sharing, food delivery, professional services, advertising, etc.) can be formulated as Linear Programs (LP) with such polytope constraints that require a projection step in the overall optimization process. We show that in very recent work the polytopic projection is the most expensive step, and that our efficient projection algorithms yield massive improvements in performance.
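For orientation, here is the classic sort-based Euclidean projection onto the probability simplex. This is the standard textbook method, shown only to make the projection step concrete; it is not the vertex-oriented incremental algorithm the abstract describes, and the function name is an assumption.

```python
from typing import List


def project_to_simplex(v: List[float]) -> List[float]:
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    if not v:
        raise ValueError("need a non-empty vector")
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        # The condition holds on a prefix of the sorted values, so the
        # last j that satisfies it determines the shift theta.
        if u_j - candidate > 0:
            theta = candidate
    # Shift every coordinate by theta and clip at zero.
    return [max(x - theta, 0.0) for x in v]


on_vertex = project_to_simplex([3.0, 1.0, -1.0])
interior = project_to_simplex([0.5, 0.5])
```

With the input `[3.0, 1.0, -1.0]` the projection lands on the vertex `[1.0, 0.0, 0.0]`, a small instance of the vertex-heavy behaviour the abstract reports for structured problems.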

preprint 2022 arXiv

Heterogeneous Calibration: A post-hoc model-agnostic framework for improved generalization

We introduce the notion of heterogeneous calibration that applies a post-hoc model-agnostic transformation to model outputs for improving AUC performance on binary classification tasks. We consider overconfident models, whose performance is significantly better on training vs test data, and give intuition as to why they might under-utilize moderately effective simple patterns in the data. We refer to these simple patterns as heterogeneous partitions of the feature space and show theoretically that perfectly calibrating each partition separately optimizes AUC. This gives a general paradigm of heterogeneous calibration as a post-hoc procedure by which heterogeneous partitions of the feature space are identified through tree-based algorithms and post-hoc calibration techniques are applied to each partition to improve AUC. While the theoretical optimality of this framework holds for any model, we focus on deep neural networks (DNNs) and test the simplest instantiation of this paradigm on a variety of open-source datasets. Experiments demonstrate the effectiveness of this framework and the future potential for applying higher-performing partitioning schemes along with more effective calibration techniques.

preprint 2022 arXiv

Pushing the limits of fairness impossibility: Who's the fairest of them all?

The impossibility theorem of fairness is a foundational result in the algorithmic fairness literature. It states that outside of special cases, one cannot exactly and simultaneously satisfy all three common and intuitive definitions of fairness - demographic parity, equalized odds, and predictive rate parity. This result has driven most works to focus on solutions for one or two of the metrics. Rather than follow suit, in this paper we present a framework that pushes the limits of the impossibility theorem in order to satisfy all three metrics to the best extent possible. We develop an integer-programming based approach that can yield a certifiably optimal post-processing method for simultaneously satisfying multiple fairness criteria under small violations. We show experiments demonstrating that our post-processor can improve fairness across the different definitions simultaneously with minimal model performance reduction. We also discuss applications of our framework for model selection and fairness explainability, thereby attempting to answer the question: who's the fairest of them all?

preprint 2021 arXiv

Knowledge-driven Natural Language Understanding of English Text and its Applications

Understanding the meaning of a text is a fundamental challenge of natural language understanding (NLU) research. An ideal NLU system should process a language in a way that is not exclusive to a single task or a dataset. Keeping this in mind, we have introduced a novel knowledge-driven semantic representation approach for English text. By leveraging the VerbNet lexicon, we are able to map the syntax tree of the text to its commonsense meaning represented using basic knowledge primitives. The general-purpose knowledge produced by our approach can be used to build any reasoning-based NLU system that can also provide justification. We applied this approach to construct two NLU applications that we present here: SQuARE (Semantic-based Question Answering and Reasoning Engine) and StaCACK (Stateful Conversational Agent using Commonsense Knowledge). Both these systems work by "truly understanding" the natural language text they process and both provide natural language explanations for their responses while maintaining high accuracy.

preprint 2020 arXiv

A Framework for Fairness in Two-Sided Marketplaces

Many interesting problems in the Internet industry can be framed as a two-sided marketplace problem. Examples include search applications and recommender systems showing people, jobs, movies, products, restaurants, etc. Incorporating fairness while building such systems is crucial and can have a deep social and economic impact (applications include job recommendations, recruiters searching for candidates, etc.). In this paper, we propose a definition and develop an end-to-end framework for achieving fairness while building such machine learning systems at scale. We extend prior work to develop an optimization framework that can tackle fairness constraints from both the source and destination sides of the marketplace, as well as dynamic aspects of the problem. The framework is flexible enough to adapt to different definitions of fairness and can be implemented in very large-scale settings. We perform simulations to show the efficacy of our approach.

preprint 2020 arXiv

Adaptive Rate of Convergence of Thompson Sampling for Gaussian Process Optimization

We consider the problem of global optimization of a function over a continuous domain. In our setup, we can evaluate the function sequentially at points of our choice and the evaluations are noisy. We frame it as a continuum-armed bandit problem with a Gaussian Process prior on the function. In this regime, most algorithms have been developed to minimize some form of regret. In this paper, we study the convergence of the sequential point $x^t$ to the global optimizer $x^*$ for the Thompson Sampling approach. Under some assumptions and regularity conditions, we prove concentration bounds for $x^t$ where the probability that $x^t$ is bounded away from $x^*$ decays exponentially fast in $t$. Moreover, the result allows us to derive adaptive convergence rates depending on the function structure.

preprint 2020 arXiv

Evaluating Fairness Using Permutation Tests

Machine learning models are central to people's lives and impact society in ways as fundamental as determining how people access information. The gravity of these models imparts a responsibility to model developers to ensure that they are treating users in a fair and equitable manner. Before deploying a model into production, it is crucial to examine the extent to which its predictions demonstrate biases. This paper deals with the detection of bias exhibited by a machine learning model through statistical hypothesis testing. We propose a permutation testing methodology that performs a hypothesis test that a model is fair across two groups with respect to any given metric. There are increasingly many notions of fairness that can speak to different aspects of model fairness. Our aim is to provide a flexible framework that empowers practitioners to identify significant biases in any metric they wish to study. We provide a formal testing mechanism as well as extensive experiments to show how this method works in practice.
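To make the idea of testing a metric across two groups concrete, here is a plain two-sample permutation test. It is a generic textbook version under stated assumptions (per-user outcomes as numbers, a two-sided test, the mean as default metric) and is not the specific methodology proposed in the paper; all names and data are hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence


def permutation_test(
    group_a: Sequence[float],
    group_b: Sequence[float],
    metric: Callable[[Sequence[float]], float] = mean,
    n_permutations: int = 2000,
    seed: int = 0,
) -> float:
    """Two-sided p-value for H0: the metric is the same in both groups."""
    rng = random.Random(seed)
    observed = abs(metric(group_a) - metric(group_b))
    pooled: List[float] = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random by shuffling the pooled data.
        rng.shuffle(pooled)
        diff = abs(metric(pooled[:n_a]) - metric(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical per-user correctness indicators for two groups.
group_a = [1.0] * 90 + [0.0] * 10
group_b = [1.0] * 60 + [0.0] * 40
p_value = permutation_test(group_a, group_b)
```

A small p-value suggests the metric differs between the groups by more than label shuffling alone would explain. Any callable that maps a sample to a number can be passed as `metric`.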

preprint 2020 arXiv

SQuARE: Semantics-based Question Answering and Reasoning Engine

Understanding the meaning of a text is a fundamental challenge of natural language understanding (NLU) and from its early days, it has received significant attention through question answering (QA) tasks. We introduce a general semantics-based framework for natural language QA and also describe the SQuARE system, an application of this framework. The framework is based on the denotational semantics approach widely used in programming language research. In our framework, a valuation function maps the syntax tree of the text to its commonsense meaning represented using basic knowledge primitives (the semantic algebra) coded using answer set programming (ASP). We illustrate an application of this framework by using VerbNet primitives as our semantic algebra and a novel algorithm based on partial tree matching that generates an answer set program that represents the knowledge in the text. A question posed against that text is converted into an ASP query using the same framework and executed using the s(CASP) goal-directed ASP system. Our approach is based purely on (commonsense) reasoning. SQuARE achieves 100% accuracy on all five datasets of bAbI QA tasks that we have tested. The significance of our work is that, unlike other machine learning based approaches, ours is based on "understanding" the text and does not require any training. SQuARE can also generate an explanation for an answer while maintaining high accuracy.

preprint 2016 arXiv

Asymptotic Normality of Scrambled Geometric Net Quadrature

In a very recent work, Basu and Owen (2015) propose the use of scrambled geometric nets in numerical integration when the domain is a product of $s$ arbitrary spaces of dimension $d$ having a certain partitioning constraint. It was shown that for a class of smooth functions, the integral estimate has variance $O( n^{-1 -2/d} (\log n)^{s-1})$ for scrambled geometric nets, compared to $O(n^{-1})$ for ordinary Monte Carlo. The main idea of this paper is to build on the work of Loh (2003) to show that the scrambled geometric net estimate has an asymptotic normal distribution for certain smooth functions defined on products of suitable subsets of $\mathbb{R}^d$.

preprint 2016 arXiv

Large scale multi-objective optimization: Theoretical and practical challenges

Multi-objective optimization (MOO) is a well-studied problem for several important recommendation problems. While multiple approaches have been proposed, in this work, we focus on using constrained optimization formulations (e.g., quadratic and linear programs) to formulate and solve MOO problems. This approach can be used to pick desired operating points on the trade-off curve between multiple objectives. It also works well for internet applications which serve large volumes of online traffic, by working with Lagrangian duality formulation to connect dual solutions (computed offline) with the primal solutions (computed online). We identify some key limitations of this approach -- namely the inability to handle user and item level constraints, scalability considerations and variance of dual estimates introduced by sampling processes. We propose solutions for each of the problems and demonstrate how through these solutions we significantly advance the state-of-the-art in this realm. Our proposed methods can exactly handle user and item (and other such local) constraints, achieve a $100\times$ scalability boost over existing packages in R and reduce variance of dual estimates by two orders of magnitude.

preprint 2015 arXiv

Almost Empty Monochromatic Triangles in Planar Point Sets

For positive integers $c, s \geq 1$, let $M_3(c, s)$ be the least integer such that any set of at least $M_3(c, s)$ points in the plane, no three on a line and colored with $c$ colors, contains a monochromatic triangle with at most $s$ interior points. The case $s=0$, which corresponds to empty monochromatic triangles, has been studied extensively over the last few years. In particular, it is known that $M_3(1, 0)=3$, $M_3(2, 0)=9$ and $M_3(c, 0)=\infty$, for $c\geq 3$. In this paper we extend these results when $c \geq 2$ and $s \geq 1$. We prove that the least integer $\lambda_3(c)$ such that $M_3(c, \lambda_3(c))< \infty$ satisfies: $$\left\lfloor\frac{c-1}{2}\right\rfloor \leq \lambda_3(c)\leq c-2,$$ where $c \geq 2$. Moreover, the exact values of $M_3(c, s)$ are determined for small values of $c$ and $s$. We also conjecture that $\lambda_3(4)=1$, and verify it for sufficiently large Horton sets.
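A quick consistency check of the bounds stated above, worked out here as an illustration rather than quoted from the paper. Substituting small values of $c$ gives:

$$\begin{aligned}
c=2:&\quad \left\lfloor\tfrac{1}{2}\right\rfloor = 0 \leq \lambda_3(2) \leq 0 &&\Rightarrow\ \lambda_3(2)=0,\\
c=3:&\quad \left\lfloor\tfrac{2}{2}\right\rfloor = 1 \leq \lambda_3(3) \leq 1 &&\Rightarrow\ \lambda_3(3)=1,\\
c=4:&\quad \left\lfloor\tfrac{3}{2}\right\rfloor = 1 \leq \lambda_3(4) \leq 2 &&\Rightarrow\ \lambda_3(4)\in\{1,2\}.
\end{aligned}$$

The first two lines agree with the known values $M_3(2,0)=9<\infty$ and $M_3(3,0)=\infty$. For $c=4$ the bounds leave two candidates, and the conjecture selects the lower one.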

preprint 2015 arXiv

Quasi-Monte Carlo tractability of high dimensional integration over products of simplices

Quasi-Monte Carlo (QMC) methods for high dimensional integrals over unit cubes and products of spheres are well-studied in the literature. We study QMC tractability of integrals of functions defined over the product of $m$ copies of the simplex $T^d \subset \mathbb{R}^{d}$. The domain is a tensor product of $m$ reproducing kernel Hilbert spaces defined by 'weights' $\gamma_{m,j}$, for $j = 1,2, \ldots, m$. Similar to the results on the unit cube in $m$ dimensions, and the product of $m$ copies of the $d$-dimensional sphere, we prove that strong polynomial tractability holds iff $\limsup_{m \rightarrow \infty} \sum_{j=1}^m \gamma_{m,j} < \infty$ and polynomial tractability holds iff $\limsup_{m \rightarrow \infty} \frac{\sum_{j=1}^m \gamma_{m,j}}{\log(m + 1 )} < \infty$. We also show that weak tractability holds iff $\lim_{m \rightarrow \infty} \frac{\sum_{j=1}^m \gamma_{m,j}}{m} = 0$. The proofs employ Sobolev space techniques and weighted reproducing kernel Hilbert space techniques for the simplex and products of simplices as domain. Properties of orthogonal polynomials on a simplex are also used extensively.

preprint 2015 arXiv

Transformations and Hardy-Krause variation

Using a multivariable Faà di Bruno formula we give conditions on transformations $\tau:[0,1]^m\to\mathcal{X}$ where $\mathcal{X}$ is a closed and bounded subset of $\mathbb{R}^d$ such that $f\circ\tau$ is of bounded variation in the sense of Hardy and Krause for all $f\in C^d(\mathcal{X})$. We give similar conditions for $f\circ\tau$ to be smooth enough for scrambled net sampling to attain $O(n^{-3/2+\epsilon})$ accuracy. Some popular symmetric transformations to the simplex and sphere are shown to satisfy neither condition. Some other transformations due to Fang and Wang (1993) satisfy the first but not the second condition. We provide transformations for the simplex that make $f\circ\tau$ smooth enough to fully benefit from scrambled net sampling for all $f$ in a class of generalized polynomials. We also find sufficient conditions for the Rosenblatt-Hlawka-Mück transformation in $\mathbb{R}^2$ and for importance sampling to be of bounded variation in the sense of Hardy and Krause.

preprint 2014 arXiv

Low discrepancy constructions in the triangle

Most quasi-Monte Carlo research focuses on sampling from the unit cube. Many problems, especially in computer graphics, are defined via quadrature over the unit triangle. Quasi-Monte Carlo methods for the triangle have been developed by Pillards and Cools (2005) and by Brandolini et al. (2013). This paper presents two QMC constructions in the triangle with a vanishing discrepancy. The first is a version of the van der Corput sequence customized to the unit triangle. It is an extensible digital construction that attains a discrepancy below $12/\sqrt{N}$. The second construction rotates an integer lattice through an angle whose tangent is a quadratic irrational number. It attains a discrepancy of $O(\log(N)/N)$, which is the best possible rate. Previous work strongly indicated that such a discrepancy was possible, but no constructions were available. Scrambling the digits of the first construction improves its accuracy for integration of smooth functions. Both constructions also yield convergent estimates for integrands that are Riemann integrable on the triangle without requiring bounded variation.
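As background for the first construction, the classic one-dimensional van der Corput sequence reverses the base-b digits of an integer index about the radix point. The sketch below implements only that classic sequence on the unit interval; the triangle-specific version from the paper is not reproduced here, and the function name is an assumption.

```python
from typing import List


def van_der_corput(index: int, base: int = 2) -> float:
    """Radical inverse of `index` in the given base, a point in [0, 1)."""
    if index < 0 or base < 2:
        raise ValueError("index must be >= 0 and base must be >= 2")
    value = 0.0
    denominator = 1.0
    while index > 0:
        # Peel off the lowest digit and place it after the radix point.
        index, digit = divmod(index, base)
        denominator *= base
        value += digit / denominator
    return value


points: List[float] = [van_der_corput(i) for i in range(1, 9)]
# points == [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875, 0.0625]
```

Each new point falls into the largest remaining gap, which is what makes the sequence extensible: more points can be appended without discarding the earlier ones.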

preprint 2012 arXiv

A spatio-spectral hybridization for edge preservation and noisy image restoration via local parametric mixtures and Lagrangian relaxation

This paper investigates a fully unsupervised statistical method for edge-preserving image restoration and compression using a spatial decomposition scheme. Smoothed maximum likelihood is used for local estimation of edge pixels from mixture parametric models of local templates. For the complementary smooth part the traditional L2-variational problem is solved in the Fourier domain with Thin Plate Spline (TPS) regularization. It is well known that naive Fourier compression of the whole image fails to restore a piece-wise smooth noisy image satisfactorily due to the Gibbs phenomenon. Images are interpreted as relative frequency histograms of samples from bi-variate densities where the sample sizes might be unknown. The set of discontinuities is assumed to be a completely unsupervised, Lebesgue-null, compact subset of the plane in the continuous formulation of the problem. The proposed spatial decomposition uses a widely used topological concept, partition of unity. Decisions on edge pixel neighborhoods are made based on the multiple testing procedure of Holm. The statistical summary of the final output is decomposed into two layers of information extraction, one for the subset of edge pixels and the other for the smooth region. Robustness is also demonstrated by applying the technique to noisy degradations of clean images.

preprint 2012 arXiv

Spline Smoothing for Estimation of Circular Probability Distributions via Spectral Isomorphism and its Spatial Adaptation

Consider the problem when $X_1,X_2,\ldots, X_n$ are distributed on a circle following an unknown distribution $F$ on $S^1$. In this article we consider the fully general set-up where the density can have local features such as discontinuities and edges. Furthermore, there can be outlying data which can follow some discrete distributions. The traditional Kernel Density Estimation methods fail to identify such local features in the data. Here we devise a non-parametric density estimate on $S^1$ by the use of a novel technique which we term the Fourier Spline. We have also tried to identify and incorporate local features such as support, discontinuity or edges in the final density estimate. Several new results are proved in this regard. Simulation studies have also been performed to see how our methodology works. Finally, a real-life example is also shown.

preprint 2012 arXiv

Tests for exponentiality against NBUE alternatives: a Monte Carlo comparison

Testing of various classes of life distributions has been addressed in the literature for more than 45 years. In this paper, we consider the problem of testing exponentiality (which essentially implies no ageing) against positive ageing which is captured by the fairly large class of new better than used in expectation (NBUE) distributions. These tests of exponentiality against NBUE alternatives are discussed and compared. The empirical size of the tests is obtained by simulations. Power comparisons for different popular alternatives are done using Monte Carlo simulations. These comparisons are made for both small and large sample sizes. The paper concludes with a discussion in which suggestions are made regarding the choices of the test when a particular alternative is suspected.

preprint 2012 arXiv

The exact null distribution of the generalized Hollander-Proschan type test for NBUE alternatives

In this note we derive the exact null distribution for the test statistic proposed by Anis and Mitra (2011) for testing exponentiality against NBUE alternatives. As a special case, we obtain the exact null distribution for the test statistic proposed by Hollander and Proschan (1975). Selected critical values for different sizes are tabulated for these two statistics. Some remarks concerning the benefits of using the exact distribution are made.
