Researcher profile

Shashank Gupta

Shashank Gupta contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
11 works
0 followers
8 topics
4 close collaborators

Published work

11 published items

Preprint, 2026, arXiv

A Benchmark for End-to-End Zero-Shot Biomedical Relation Extraction with LLMs: Experiments with OpenAI Models

Extracting relations from scientific literature is a fundamental task in biomedical NLP because entities and the relations among them drive hypothesis generation and knowledge discovery. As the literature grows rapidly, relation extraction (RE) is indispensable for curating knowledge graphs that serve as computable, structured, symbolic representations. With the rise of LLMs, it is pertinent to examine whether it is better to skip tailoring supervised RE methods, save the annotation burden, and simply use zero-shot RE (ZSRE) via LLM API calls. In this paper, we propose a benchmark with seven biomedical RE datasets with interesting characteristics and evaluate three OpenAI models (GPT-4, o1, and GPT-OSS-120B) for end-to-end ZSRE. We show that LLM-based ZSRE is inching closer to supervised methods in performance on some datasets but still struggles on complex inputs expressing multiple relations with different predicates. Our error analysis reveals scope for improvement.

Preprint, 2026, arXiv

Towards Two-Stage Counterfactual Learning to Rank

Counterfactual learning to rank (CLTR) aims to learn a ranking policy from user interactions while correcting for the inherent biases in interaction data, such as position bias. Existing CLTR methods assume a single ranking policy that selects a top-K ranking from the entire candidate document set. In real-world applications, the candidate document set is on the order of millions, making a single-stage ranking policy impractical. To scale to millions of documents, real-world ranking systems are designed in a two-stage fashion, with a candidate generator followed by a ranker. The existing CLTR method for a two-stage offline ranking system considers only the top-1 ranking set-up and focuses only on training the candidate generator, with the ranker fixed. A CLTR method for training both the ranker and the candidate generator jointly is missing from the existing literature. In this paper, we propose a two-stage CLTR estimator that considers the interaction between the two stages and estimates the joint value of the two policies offline. In addition, we propose a novel joint optimization method to train the candidate generator and ranker policies. To the best of our knowledge, we are the first to propose a CLTR estimator and learning method for two-stage ranking. Experimental results on a semi-synthetic benchmark demonstrate the effectiveness of the proposed joint CLTR method over baselines.
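For orientation, the position-bias correction that CLTR builds on is commonly done with inverse propensity scoring (IPS): each click is weighted by the inverse of the probability that its logged rank was examined. The sketch below shows only that standard single-stage weighting, not the two-stage estimator proposed in the paper. The function name `ips_click_value` and the propensity values in the usage note are hypothetical.

```python
from typing import Sequence


def ips_click_value(
    clicks: Sequence[int],
    ranks: Sequence[int],
    propensities: Sequence[float],
) -> float:
    """Sum of clicks, each weighted by 1 / examination propensity of its logged rank.

    clicks[i] is 1 if document i was clicked, otherwise 0.
    ranks[i] is the 0-based rank at which document i was shown.
    propensities[k] is the assumed probability that rank k was examined.
    """
    if len(clicks) != len(ranks):
        raise ValueError("clicks and ranks must have the same length")
    # A click at a rarely examined rank receives a larger weight.
    return sum(c / propensities[r] for c, r in zip(clicks, ranks))
```

With assumed propensities `[1.0, 0.5, 0.25]`, a click at rank 2 counts four times as much as a click at rank 0, which offsets the lower chance that the user ever looked at that position.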

Preprint, 2022, arXiv

"All-versus-nothing" proof of genuine tripartite steering and entanglement certification in the two-sided device-independent scenario

We consider the task of certifying genuine entanglement of tripartite states. For this purpose, we first present an "all-versus-nothing" proof of genuine tripartite Einstein-Podolsky-Rosen (EPR) steering by demonstrating the non-existence of a hybrid local hidden state (LHS) model in the tripartite network, as motivation for our main result. A full logical contradiction between the predictions of the hybrid LHS model and quantum mechanical outcome statistics is shown for any three-qubit generalized Greenberger-Horne-Zeilinger (GGHZ) state and pure W-class state. Using this logical contradiction, we can distinguish between GGHZ and W-class states in a two-sided device-independent (2SDI) steering scenario. We next formulate a 2SDI steering inequality that generalizes the fine-grained steering inequality (FGI) derived in [PKM14] to the tripartite scenario. We show that the maximum quantum violation of this tripartite FGI can be used to certify genuine entanglement of three-qubit pure states.

Preprint, 2022, arXiv

A universal whitening algorithm for commercial random number generators

Random number generators are imperfect due to manufacturing bias and technological imperfections. These imperfections are removed using post-processing algorithms that in general compress the data and do not work in every scenario. In this work, we present a universal whitening algorithm that uses n-qubit permutation matrices to remove the imperfections in commercial random number generators without compression. Specifically, we demonstrate the efficacy of our algorithm on several categories of random number generators and compare it with cryptographic hash functions and block ciphers. We achieve an improvement in almost every randomness parameter evaluated using the ENT randomness test suite. The modified random number files obtained by applying our algorithm to the raw random data files pass the NIST SP 800-22 tests in both cases: (1) when the raw file does not pass all the tests, and (2) when the raw file already passes all the tests.

Preprint, 2022, arXiv

Constructive Feedback of Non-Markovianity on Resources in Random Quantum States

We explore the impact of non-Markovian channels on the quantum correlations (QCs) of Haar uniformly generated random two-qubit input states with different ranks -- either one of the qubits (single-sided) or both qubits independently (double-sided) are passed through noisy channels. Under dephasing and depolarizing channels with varying non-Markovian strength, entanglement and quantum discord of the output states collapse and revive as the noise increases. We find that, for the depolarizing double-sided channel, both QCs of random states show a higher number of revivals on average than for the single-sided channel at a fixed non-Markovianity strength, irrespective of the rank of the states -- we call this counter-intuitive effect a constructive feedback of non-Markovianity. Consequently, the average noise at which the QCs of random states show their first revival decreases as the strength of non-Markovian noise increases, indicating the role of non-Markovian channels in regenerating QCs even in the presence of a high amount of noise. However, we observe that non-Markovianity plays no role in increasing the robustness of random quantum states, as measured by the mean value of the critical noise at which quantum correlations first collapse. Moreover, we observe that the tendency of a state to show regeneration increases with the average QCs of the random input states, along with non-Markovianity.

Preprint, 2022, arXiv

Knowledge Infused Decoding

Pre-trained language models (LMs) have been shown to memorize a substantial amount of knowledge from the pre-training corpora; however, they are still limited in recalling factually correct knowledge given a certain context. Hence, they tend to suffer from counterfactual or hallucinatory generation when used in knowledge-intensive natural language generation (NLG) tasks. Recent remedies to this problem focus on modifying either the pre-training or task fine-tuning objectives to incorporate knowledge, which normally require additional costly training or architecture modification of LMs for practical applications. We present Knowledge Infused Decoding (KID) -- a novel decoding algorithm for generative LMs, which dynamically infuses external knowledge into each step of the LM decoding. Specifically, we maintain a local knowledge memory based on the current context, interacting with a dynamically created external knowledge trie, and continuously update the local memory as a knowledge-aware constraint to guide decoding via reinforcement learning. On six diverse knowledge-intensive NLG tasks, task-agnostic LMs (e.g., GPT-2 and BART) armed with KID outperform many task-optimized state-of-the-art models, and show particularly strong performance in few-shot scenarios over seven related knowledge-infusion techniques. Human evaluation confirms KID's ability to generate more relevant and factual language for the input context when compared with multiple baselines. Finally, KID also alleviates exposure bias and provides stable generation quality when generating longer sequences. Code for KID is available at https://github.com/microsoft/KID.

Preprint, 2022, arXiv

Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark

Lung cancer is one of the deadliest cancers, and its effective diagnosis and treatment depend in part on the accurate delineation of the tumor. Human-centered segmentation, which is currently the most common approach, is subject to inter-observer variability and is also time-consuming, given that only experts are capable of providing annotations. Automatic and semi-automatic tumor segmentation methods have recently shown promising results. However, as different researchers have validated their algorithms using various datasets and performance metrics, reliably evaluating these methods is still an open challenge. The goal of the Lung-Originated Tumor Segmentation from Computed Tomography Scan (LOTUS) Benchmark, created through the 2018 IEEE Video and Image Processing (VIP) Cup competition, is to provide a unique dataset and pre-defined metrics so that different researchers can develop and evaluate their methods in a unified fashion. The 2018 VIP Cup started with global engagement from 42 countries to access the competition data. At the registration stage, there were 129 members clustered into 28 teams from 10 countries, of which 9 teams made it to the final stage and 6 teams successfully completed all the required tasks. In a nutshell, all the algorithms proposed during the competition are based on deep learning models combined with a false positive reduction technique. Methods developed by the three finalists show promising results in tumor segmentation; however, more effort should be put into reducing the false positive rate. This competition manuscript presents an overview of the VIP Cup challenge, along with the proposed algorithms and results.

Preprint, 2022, arXiv

Quantum entropy expansion using n-qubit permutation matrices in Galois field

Random numbers are critical for any cryptographic application. However, the data flowing through the internet is not secure because of entropy-deprived pseudo-random number generators and unencrypted IoT devices. In this work, we address the issue of low entropy in several data formats. Specifically, we use the large information space associated with n-qubit permutation matrices to expand the entropy of any data without increasing its size. We take English text with an entropy in the range of 4-5 bits per byte. We manipulate the data using a set of n-qubit (n ≤ 10) permutation matrices and observe the expansion of the entropy in the manipulated data (to more than 7.9 bits per byte). We also observe similar behaviour with other data formats such as images and audio (n ≤ 15).
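The "bits per byte" figures in this abstract can be read as Shannon entropy over the byte-value distribution of a file, where 8 bits per byte is the maximum. A minimal Python sketch of that measurement follows, assuming this is the definition the authors use; the function name `byte_entropy` is illustrative and does not come from the paper.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value 0..255
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Calling `byte_entropy(open(path, "rb").read())` on a file before and after a transformation gives the kind of before/after comparison the abstract describes; the permutation-matrix transformation itself is not sketched here.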

Preprint, 2022, arXiv

Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners

Traditional multi-task learning (MTL) methods use dense networks that share the same set of weights across several different tasks. This often creates interference, where two or more tasks compete to pull model parameters in different directions. In this work, we study whether sparsely activated Mixture-of-Experts (MoE) models improve multi-task learning by specializing some weights for learning shared representations and using the others for learning task-specific information. To this end, we devise task-aware gating functions to route examples from different tasks to specialized experts, which share subsets of network weights conditioned on the task. This results in a sparsely activated multi-task model with a large number of parameters but the same computational cost as a dense model. We demonstrate that such sparse networks improve multi-task learning along three key dimensions: (i) transfer to low-resource tasks from related tasks in the training mixture; (ii) sample-efficient generalization to tasks not seen during training by making use of task-aware routing from seen related tasks; (iii) robustness to the addition of unrelated tasks by avoiding catastrophic forgetting of existing tasks.

Preprint, 2021, arXiv

Genuine Einstein-Podolsky-Rosen steering of three-qubit states by multiple sequential observers

We investigate the possibility of multiple uses of a single copy of a three-qubit state for genuine tripartite Einstein-Podolsky-Rosen (EPR) steering. A pure three-qubit state of either the Greenberger-Horne-Zeilinger (GHZ) type or the W type is shared between two fixed observers in two wings and a sequence of multiple observers in the third wing, who perform unsharp or non-projective measurements. The choice of measurement settings for each of the multiple observers in the third wing is independent of, and uncorrelated with, the measurement settings and outcomes of the previous observers. We investigate all possible types of (2→1) and (1→2) genuine tripartite steering in the above set-up. For each case, we obtain an upper limit on the number of observers in the third wing who can demonstrate genuine EPR steering through violation of a tripartite steering inequality. We show that the GHZ state allows for a higher number of observers than the W state. Additionally, (1→2) steering is possible for a larger range of the sharpness parameter than in the (2→1) steering cases.

Preprint, 2020, arXiv

IITK-RSA at SemEval-2020 Task 5: Detecting Counterfactuals

This paper describes our efforts in tackling Task 5 of SemEval-2020. The task involved detecting a class of textual expressions known as counterfactuals and separating them into their constituent elements. Counterfactual statements describe events that have not occurred or could not have occurred, and the possible implications of such events. While counterfactual reasoning is natural for humans, understanding these expressions is difficult for artificial agents due to a variety of linguistic subtleties. Our final submitted approaches were an ensemble of various fine-tuned transformer-based and CNN-based models for the first subtask and a transformer model with dependency tree information for the second subtask. We ranked 4th and 9th on the overall leaderboard. We also explored various other approaches that involved the use of classical methods, other neural architectures, and the incorporation of different linguistic features.