Researcher profile

Arijit Ghosh

Arijit Ghosh contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2026arXiv

Spectral Norm, Economical Sieve, and Linear Invariance Testing of Boolean Functions

Given Boolean functions \( f, g : \mathbb{F}_2^n \to \{-1,+1\} \), we say they are {\em linearly isomorphic} if there exists \( A \in \mathrm{GL}_n(\mathbb{F}_2) \) such that \( f(x)=g(Ax) \) for all \( x \). We study this problem in the tolerant property testing framework under the known--unknown model, where \( g \) is given explicitly and \( f \) is accessible only via oracle queries, meaning the algorithm may adaptively request the value of \( f(x) \) for inputs \( x \in \mathbb{F}_2^n \) of its choice. Given parameters \( ε\ge 0 \) and \( ω>0 \), the goal is to distinguish whether there exists \( A \in \mathrm{GL}_n(\mathbb{F}_{2})\) such that the normalized Hamming distance between \( f \) and \( g(Ax) \) is at most \( ε\), or whether for every \( A \in \mathrm{GL}_n(\mathbb{F}_2) \) the distance is at least \( ε+ω\). Our main result is a tolerant tester making \( \widetilde{O} \left( \left( m/ω\right)^4 \right) \) queries to \( f \), where \( m \) is an upper bound on the spectral norm of \( g \), improving the previous \( \widetilde{O} \left( \left( m/ω\right)^{24} \right) \) bound of Wimmer and Yoshida. We complement this with a nearly matching lower bound of \( Ω(m^2) \) for constant \( ω\) (for example, \( ω=1/4 \)), improving the prior \( Ω(\log m) \) lower bound of Grigorescu, Wimmer and Xie. A key technical ingredient on the algorithmic side is a query-efficient local list corrector. For the lower bound, we give a reduction from communication complexity using a novel subclass of Maiorana--McFarland functions from symmetric-key cryptography.

preprint2026arXiv

Spectral Shadows: When Communication Complexity Meets Linear Invariance Testing

In this short note, we initiate the study of the Linear Isomorphism Testing Problem in the setting of communication complexity, a natural linear algebraic generalization of the classical Equality problem. Given Boolean functions $f, g : \mathbb{F}_2^n \to \{-1, +1\}$, Alice and Bob are tasked with determining whether $f$ and $g$ are equivalent up to a nonsingular linear transformation of the input variables, or far from being so. This problem has been extensively investigated in several models of computation, including standard algorithmic and property testing frameworks, owing to its fundamental connections with combinatorial circuit design, complexity theory, and cryptography. However, despite its broad relevance, it has remained unexplored in the context of communication complexity, a gap we address in this work. Our main results demonstrate that the approximate spectral norm of the input functions plays a central role in governing the communication complexity of this problem. We design a simple deterministic protocol whose communication cost is polynomial in the approximate spectral norm, and complement it with nearly matching lower bounds (up to a quadratic gap). In the randomised setting with private coins, we present an even more efficient protocol, though equally simple, that achieves a quadratically improved dependence on the approximate spectral norm compared to the deterministic case, and we prove that such a dependence is essentially unavoidable. These results identify the approximate spectral norm as a key complexity measure for testing linear invariance in the communication complexity framework. As a core technical ingredient, we establish new junta theorems for Boolean functions with small approximate spectral norm, which may be of independent interest in Fourier analysis and learning theory.

preprint2023arXiv

Sparse Geometric Set Systems and the Beck-Fiala Conjecture

We investigate the combinatorial discrepancy of geometric set systems having bounded shallow cell complexity in the \emph{Beck-Fiala} setting, where each point belongs to at most $t$ ranges. For set systems with shallow cell complexity $ψ(m,k)=g(m)k^{c}$, where $(i)$ $g(m) = o(m^{\varepsilon})$ for any $\varepsilon\in (0,1],$ $(ii)$ $ψ$ is non-decreasing in $m$, and $(iii)$ $c>0$ is independent of $m$ and $k$, we get a discrepancy bound of \[ O\left(\sqrt{\left(\log n+\left(t^{c}g(n)\right)^{\frac{1}{1+c}}\right)\log n}\right).\] For $t=ω(\log^2 n)$, in several cases, such as for set systems of points and half-planes / disks / pseudo-disks in $\mathbb{R}^2$, points and orthants in $\mathbb{R}^3$ etc., these bounds are $o(\sqrt{t})$, which verifies (and improves upon) the conjectured bound of Beck and Fiala~\emph{(Disc. Appl. Math., 1981)}. Our bounds are obtained by showing the existence of \emph{matchings with low crossing number}, using the multiplicative weights update method of Welzl \emph{(SoCG, 1988)}, together with the recent bound of Mustafa \emph{(Disc. Comp. Geom., 2015)} on \emph{shallow packings} of set systems in terms of their shallow cell complexity. For set systems of shallow cell complexity $ψ(m,k)=m^{c_1}g(m)k^{c}$, we obtain matchings with crossing number at most \[ O\left(\left(n^{c_1}g(n)t^{c}\right)^{\frac{1}{1+c_1+c}}\right).\] These are of independent interest.

preprint2022arXiv

Efficiently Sampling and Estimating from Substructures using Linear Algebraic Queries

Given an unknown $n \times n$ matrix $A$ having non-negative entries, the \emph{inner product} (IP) oracle takes as inputs a specified row (or a column) of $A$ and a vector $v \in \mathbb{R}^{n}$, and returns their inner product. A derivative of IP is the induced degree query in an unknown graph $G=(V(G), E(G))$ that takes a vertex $u \in V(G)$ and a subset $S \subseteq V(G)$ as input and reports the number of neighbors of $u$ that are present in $S$. The goal of this paper is to understand the strength of the inner product oracle. Our results in that direction are as follows: (I) IP oracle can solve bilinear form estimation, i.e., estimate the value of ${\bf x}^{T}A\bf{y}$ given two vectors ${\bf x},\, {\bf y} \in \mathbb{R}^{n}$ with non-negative entries and can sample almost uniformly entries of a matrix with non-negative entries; (ii) We tackle for the first time weighted edge estimation and weighted sampling of edges that follow as an application to the bilinear form estimation and almost uniform sampling problems, respectively; (iii) induced degree query, a derivative of IP can solve edge estimation and an almost uniform edge sampling in induced subgraphs. To the best of our knowledge, these are the first set of Oracle-based query complexity results for induced subgraphs. We show that IP/induced degree queries over the whole graph can simulate local queries in any induced subgraph; (iv) Apart from the above, we also show that IP can solve several problems related to matrix, like testing if the matrix is diagonal, symmetric, doubly stochastic, etc.

preprint2022arXiv

Faster Counting and Sampling Algorithms using Colorful Decision Oracle

In this work, we consider $d$-{\sc Hyperedge Estimation} and $d$-{\sc Hyperedge Sample} problem in a hypergraph $\mathcal{H}(U(\mathcal{H}),\mathcal{F}(\mathcal{H}))$ in the query complexity framework, where $U(\mathcal{H})$ denotes the set of vertices and $\mathcal{F}(\mathcal{H})$ denotes the set of hyperedges. The oracle access to the hypergraph is called {\sc Colorful Independence Oracle} ({\sc CID}), which takes $d$ (non-empty) pairwise disjoint subsets of vertices $A_1,\ldots,A_d \subseteq U(\mathcal{H})$ as input, and answers whether there exists a hyperedge in $\mathcal{H}$ having (exactly) one vertex in each $A_i, i \in \{1,2,\ldots,d\}$. The problem of $d$-{\sc Hyperedge Estimation} and $d$-{\sc Hyperedge Sample} with {\sc CID} oracle access is important in its own right as a combinatorial problem. Also, Dell {\it{et al.}}~[SODA '20] established that {\em decision} vs {\em counting} complexities of a number of combinatorial optimization problems can be abstracted out as $d$-{\sc Hyperedge Estimation} problems with a {\sc CID} oracle access. The main technical contribution of the paper is an algorithm that estimates $m= \lvert {\mathcal{F}(\mathcal{H})}\rvert$ with $\widehat{m}$ such that { $$ \frac{1}{C_{d}\log^{d-1} n} \;\leq\; \frac{\widehat{m}}{m} \;\leq\; C_{d} \log ^{d-1} n . $$ by using at most $C_{d}\log ^{d+2} n$ many {\sc CID} queries, where $n$ denotes the number of vertices in the hypergraph $\mathcal{H}$ and $C_{d}$ is a constant that depends only on $d$}. Our result coupled with the framework of Dell {\it{et al.}}~[SODA '21] implies improved bounds for a number of fundamental problems.

preprint2022arXiv

Tolerant Bipartiteness Testing in Dense Graphs

Bipartite testing has been a central problem in the area of property testing since its inception in the seminal work of Goldreich, Goldwasser and Ron [FOCS'96 and JACM'98]. Though the non-tolerant version of bipartite testing has been extensively studied in the literature, the tolerant variant is not well understood. In this paper, we consider the following version of tolerant bipartite testing: Given a parameter $\varepsilon \in (0,1)$ and access to the adjacency matrix of a graph $G$, we can decide whether $G$ is $\varepsilon$-close to being bipartite or $G$ is at least $(2+Ω(1))\varepsilon$-far from being bipartite, by performing $\widetilde{\mathcal{O}}\left(\frac{1}{\varepsilon ^3}\right)$ queries and in $2^{\widetilde{\mathcal{O}}(1/\varepsilon)}$ time. This improves upon the state-of-the-art query and time complexities of this problem of $\widetilde{\mathcal{O}}\left(\frac{1}{\varepsilon ^6}\right)$ and $2^{\widetilde{\mathcal{O}}(1/\varepsilon^2)}$, respectively, from the work of Alon, Fernandez de la Vega, Kannan and Karpinski (STOC'02 and JCSS'03), where $\widetilde{\mathcal{O}}(\cdot)$ hides a factor polynomial in $\log \frac{1}{\varepsilon}$.

preprint2020arXiv

Disjointness through the Lens of Vapnik-Chervonenkis Dimension: Sparsity and Beyond

The disjointness problem - where Alice and Bob are given two subsets of $\{1, \dots, n\}$ and they have to check if their sets intersect - is a central problem in the world of communication complexity. While both deterministic and randomized communication complexities for this problem are known to be $Θ(n)$, it is also known that if the sets are assumed to be drawn from some restricted set systems then the communication complexity can be much lower. In this work, we explore how communication complexity measures change with respect to the complexity of the underlying set system. The complexity measure for the set system that we use in this work is the Vapnik-Chervonenkis (VC) dimension. More precisely, on any set system with VC dimension bounded by $d$, we analyze how large can the deterministic and randomized communication complexities be, as a function of $d$ and $n$. In this paper, we construct two natural set systems of VC dimension $d$, motivated from geometry. Using these set systems we show that the deterministic and randomized communication complexity can be $\widetildeΘ\left(d\log \left( n/d \right)\right)$ for set systems of VC dimension $d$ and this matches the deterministic upper bound for all set systems of VC dimension $d$. We also study the deterministic and randomized communication complexities of the set intersection problem when sets belong to a set system of bounded VC dimension. We show that there exists set systems of VC dimension $d$ such that both deterministic and randomized (one-way and multi-round) complexity for the set intersection problem can be as high as $Θ\left( d\log \left( n/d \right) \right)$, and this is tight among all set systems of VC dimension $d$.

preprint2020arXiv

Hyperedge Estimation using Polylogarithmic Subset Queries

In this work, we estimate the number of hyperedges in a hypergraph ${\cal H}(U({\cal H}), {\cal F}({\cal H}))$, where $U({\cal H})$ denotes the set of vertices and ${\cal F}({\cal H}))$ denotes the set of hyperedges. We assume a query oracle access to the hypergraph ${\cal H}$. Estimating the number of edges, triangles or small subgraphs in a graph is a well studied problem. Beame \etal~and Bhattacharya \etal~gave algorithms to estimate the number of edges and triangles in a graph using queries to the {\sc Bipartite Independent Set} ({\sc BIS}) and the {\sc Tripartite Independent Set} ({\sc TIS}) oracles, respectively. We generalize the earlier works by estimating the number of hyperedges using a query oracle, known as the {\bf Generalized $d$-partite independent set oracle ({\sc GPIS})}, that takes $d$ (non-empty) pairwise disjoint subsets of vertices $A_1,\ldots,A_d \subseteq U({\cal H})$ as input, and answers whether there exists a hyperedge in ${\cal H}$ having (exactly) one vertex in each $A_i, i \in \{1,2,\ldots,d\}$. We give a randomized algorithm for the hyperedge estimation problem using the {\sc GPIS} query oracle to output $\widehat{m}$ for $m({\cal H})$ satisfying $(1-ε) \cdot m({\cal H}) \leq \widehat{m} \leq (1+ε) \cdot m({\cal H})$. The number of queries made by our algorithm, assuming $d$ to be a constant, is polylogarithmic in the number of vertices of the hypergraph.

preprint2020arXiv

On Triangle Estimation using Tripartite Independent Set Queries

Estimating the number of triangles in a graph is one of the most fundamental problems in sublinear algorithms. In this work, we provide an algorithm that approximately counts the number of triangles in a graph using only polylogarithmic queries when \emph{the number of triangles on any edge in the graph is polylogarithmically bounded}. Our query oracle {\em Tripartite Independent Set} (TIS) takes three disjoint sets of vertices $A$, $B$ and $C$ as inputs, and answers whether there exists a triangle having one endpoint in each of these three sets. Our query model generally belongs to the class of \emph{group queries} (Ron and Tsur, ACM ToCT, 2016; Dell and Lapinskas, STOC 2018) and in particular is inspired by the {\em Bipartite Independent Set} (BIS) query oracle of Beame {\em et al.} (ITCS 2018). We extend the algorithmic framework of Beame {\em et al.}, with \tis replacing \bis, for approximately counting triangles in graphs.

preprint2020arXiv

Query Complexity of Global Minimum Cut

In this work, we resolve the query complexity of global minimum cut problem for a graph by designing a randomized algorithm for approximating the size of minimum cut in a graph, where the graph can be accessed through local queries like {\sc Degree}, {\sc Neighbor}, and {\sc Adjacency} queries. Given $ε\in (0,1)$, the algorithm with high probability outputs an estimate $\hat{t}$ satisfying the following $(1-ε) t \leq \hat{t} \leq (1+ε) t$, where $m$ is the number of edges in the graph and $t$ is the size of minimum cut in the graph. The expected number of local queries used by our algorithm is $\min\left\{m+n,\frac{m}{t}\right\}\mbox{poly}\left(\log n,\frac{1}ε\right)$ where $n$ is the number of vertices in the graph. Eden and Rosenbaum showed that $Ω(m/t)$ many local queries are required for approximating the size of minimum cut in graphs. These two results together resolve the query complexity of the problem of estimating the size of minimum cut in graphs using local queries. Building on the lower bound of Eden and Rosenbaum, we show that, for all $t \in \mathbb{N}$, $Ω(m)$ local queries are required to decide if the size of the minimum cut in the graph is $t$ or $t-2$. Also, we show that, for any $t \in \mathbb{N}$, $Ω(m)$ local queries are required to find all the minimum cut edges even if it is promised that the input graph has a minimum cut of size $t$. Both of our lower bound results are randomized, and hold even if we can make {\sc Random Edge} query apart from local queries.