Researcher profile

Sachin Kumar

Sachin Kumar contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
9 works
0 followers
10 topics
4 close collaborators

Published work

9 published item(s)

preprint | 2026 | arXiv

Activation Differences Reveal Backdoors: A Comparison of SAE Architectures

Backdoor attacks on language models pose a significant threat to AI safety, where models behave normally on most inputs but exhibit harmful behavior when triggered by specific patterns. Detecting such backdoors through mechanistic interpretability remains an open challenge. We investigate two sparse autoencoder architectures -- Crosscoders and Differential SAEs (Diff-SAE) -- for isolating backdoor-related features in fine-tuned models. Using a controlled SQL injection backdoor triggered by year-based context ("2024" triggers vulnerable code, "2023" triggers safe code), we evaluate both approaches across LoRA and full-rank fine-tuning regimes on SmolLM2-360M. We find that Diff-SAE consistently and substantially outperforms Crosscoders for backdoor isolation. Diff-SAE achieves a Backdoor Isolation Score (BIS) of 0.40 with perfect precision (1.0) and zero false positive rate across most experimental conditions, while Crosscoders fail almost entirely with BIS below 0.02 in most cases. This performance gap holds across multiple transformer layers (14, 18, 22, 26) and both fine-tuning regimes, with full-rank fine-tuning producing particularly clean backdoor signals. Our results suggest that backdoors manifest as directional activation shifts rather than sparse feature activations, making difference-based representations fundamentally more effective for detection. These findings have important implications for AI safety monitoring and the development of interpretability tools for detecting model manipulation.

preprint | 2026 | arXiv

HalluWorld: A Controlled Benchmark for Hallucination via Reference World Models

Hallucination remains a central failure mode of large language models, but existing benchmarks operationalize it inconsistently across summarization, question answering, retrieval-augmented generation, and agentic interaction. This fragmentation makes it unclear whether a mitigation that works in one setting reduces hallucinations across contexts. Current benchmarks either require human annotation and fixed references that may be memorized, or rely on observations in settings that are difficult to reproduce. To study root causes, we introduce HalluWorld, an extensible benchmark grounded in an explicit reference-world formulation: a model hallucinates when it produces an observable claim that is false with respect to this world. Building on this view, we construct synthetic and semi-synthetic environments in which the reference world is fully specified, the model's view is controlled, and hallucination labels are generated automatically. HalluWorld spans gridworlds, chess, and realistic terminal tasks, enabling controlled variation of world complexity, observability, temporal change, and source-conflict policy, and disentangling hallucinations into fine-grained error categories. We evaluate frontier and open-weight language models across these settings and find consistent patterns: perceptual hallucination on directly observed information is near-solved for frontier models, while multi-step state tracking and causal forward simulation remain difficult and are not generally solved by extended thinking. In the terminal setting, models also struggle with when to abstain. The uneven profile of failures across probe types and domains suggests that hallucinations arise from distinct failure modes rather than a single capability. Our results suggest that controlled reference worlds offer a scalable and reproducible path toward measuring and reducing hallucinations in modern language models.

preprint | 2026 | arXiv

Leveraging Pretrained Language Models as Energy Functions for Glauber Dynamics Text Diffusion

We present a discrete diffusion-based language model using Glauber dynamics from statistical physics. Our main insight is that instead of trying to train a discrete state space diffusion model using Glauber dynamics with a uniform transition kernel as the forward process, one can set up an "energy function" based on pretrained causal/masked language models. When viewed as the stationary distribution, this energy function allows us to significantly improve the quality of the generated text. Incorporating UL2 as the pretrained model into our diffusion pipeline, we outperform prior diffusion-based LMs and perform competitively with autoregressive models of comparable model sizes. Furthermore, our models are competitive with or outperform prior diffusion models and GPT-2 style autoregressive models on zero-shot common sense reasoning tasks as well as planning and search tasks like Sudoku and Zebra puzzles.
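
As a rough illustration of the sampling idea in this abstract, the sketch below runs heat-bath Glauber dynamics on a short token sequence: pick one position at random and resample its token with probability proportional to exp(-beta * energy). The energy function here (`toy_energy`, which counts adjacent mismatches) is a hypothetical stand-in chosen only so the example runs; the paper derives its energy from pretrained language models such as UL2, which is not reproduced here. The names `glauber_step`, `sample`, and `beta` are likewise illustrative assumptions.

```python
import math
import random


def toy_energy(seq):
    # Hypothetical energy: number of adjacent positions whose tokens differ.
    # A stand-in for an energy derived from a pretrained language model.
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)


def glauber_step(seq, vocab, energy, rng, beta=1.0):
    # Choose one site uniformly at random.
    i = rng.randrange(len(seq))
    # Weight each candidate token by exp(-beta * E) with all other sites fixed.
    weights = []
    for tok in vocab:
        candidate = seq[:i] + [tok] + seq[i + 1:]
        weights.append(math.exp(-beta * energy(candidate)))
    # Resample the chosen site from the resulting conditional distribution.
    seq[i] = rng.choices(vocab, weights=weights, k=1)[0]
    return seq


def sample(length, vocab, steps, energy=toy_energy, seed=0, beta=1.0):
    # Start from uniformly random tokens, then apply single-site updates.
    rng = random.Random(seed)
    seq = [rng.choice(vocab) for _ in range(length)]
    for _ in range(steps):
        glauber_step(seq, vocab, energy, rng, beta)
    return seq


if __name__ == "__main__":
    print(sample(length=8, vocab=["a", "b", "c"], steps=500, beta=2.0))
```

Each update changes at most one position, and the chain's stationary distribution is proportional to exp(-beta * energy), so lower-energy sequences are visited more often as `beta` grows.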

preprint | 2022 | arXiv

Nonlinear Dynamic analysis of vector-host model for Zika infection with predatory fish Gambusia Affinis

In the present paper, we study the dynamics of a nine-compartment vector-host model for Zika virus infection in which the predatory fish Gambusia Affinis is introduced into the system to control the Zika infection by preying on the vector. The system has six practically feasible equilibrium points, of which four are disease-free and the rest are endemic. We discuss the existence and stability conditions for the equilibria. We find that when sexual transmission of Zika comes to a halt, infection cannot persist in the absence of mosquitoes. Hence, one needs to eradicate mosquitoes to eradicate infection. Moreover, we deduce that in the case of Zika infection, pushing the basic reproduction number below unity is next to impossible. Therefore, O_0, the mosquito survival threshold parameter, and O, the mosquito survival threshold parameter with predation, play a crucial role in getting rid of the infection in the respective cases, since mosquitoes cannot survive when these are less than unity. Sensitivity analysis shows the importance of reducing the mosquito biting rate and the mutual contact rates between vector and host. It also shows the importance of increasing the natural mortality rate of vectors to reduce the basic reproduction number. Numerical simulation shows that when the basic reproduction number is close to but greater than unity, the introduction of a small number of predatory fish Gambusia Affinis can completely wipe out the infection. In the case of high transmission or a high basic reproduction number, this fish increases the susceptible human population and keeps the infection under control, hence preventing it from becoming an epidemic.

preprint | 2021 | arXiv

CAMTA: Causal Attention Model for Multi-touch Attribution

Advertising channels have evolved from conventional print media, billboards and radio advertising to online digital advertising (ad), where users are exposed to a sequence of ad campaigns via social networks, display ads, search, etc. While advertisers revisit the design of ad campaigns to concurrently serve the requirements emerging from new ad channels, it is also critical for advertisers to estimate the contribution of touch-points (views, clicks, conversions) on different channels, based on the sequence of customer actions. This process of contribution measurement is often referred to as multi-touch attribution (MTA). In this work, we propose CAMTA, a novel deep recurrent neural network architecture which is a causal attribution mechanism for user-personalised MTA in the context of observational data. CAMTA minimizes the selection bias in channel assignment across time-steps and touch-points. Furthermore, it utilizes the users' pre-conversion actions in a principled way in order to predict per-channel attribution. To quantitatively benchmark the proposed MTA model, we employ the real-world Criteo dataset and demonstrate the superior performance of CAMTA with respect to prediction accuracy as compared to several baselines. In addition, we provide results for budget allocation and user-behaviour modelling on the predicted channel attribution.

preprint | 2020 | arXiv

A Deep Reinforced Model for Zero-Shot Cross-Lingual Summarization with Bilingual Semantic Similarity Rewards

Cross-lingual text summarization aims at generating a document summary in one language given input in another language. It is a practically important but under-explored task, primarily due to the dearth of available data. Existing methods resort to machine translation to synthesize training data, but such pipeline approaches suffer from error propagation. In this work, we propose an end-to-end cross-lingual text summarization model. The model uses reinforcement learning to directly optimize a bilingual semantic similarity metric between the summaries generated in a target language and gold summaries in a source language. We also introduce techniques to pre-train the model leveraging monolingual summarization and machine translation objectives. Experimental results in both English--Chinese and English--German cross-lingual summarization settings demonstrate the effectiveness of our methods. In addition, we find that reinforcement learning models with bilingual semantic similarity as rewards generate more fluent sentences than strong baselines.

preprint | 2020 | arXiv

Low reflection at zero or low-energies in the well-barrier scattering potentials

The probability of reflection $R(E)$ off a finite attractive scattering potential at zero or low energies is ordinarily supposed to be 1. However, a fully attractive potential presents the paradoxical result that $R(0)=0$ or $R(0)<1$ when an effective parameter $q$ of the potential admits special discrete values. Here, we report another class of finite potentials which are of the well-barrier (attractive-repulsive) type and which can be made to possess much less reflection at zero and low energies for a band of low values of $q$. These well-barrier potentials have only two real turning points for $E \in(V_{min}, V_{max})$, except at $E=0$. We present two exactly solvable and two numerically solved models to confirm this phenomenon.
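
The low-energy behaviour described for a fully attractive potential can be checked on the textbook rectangular well, which is not one of the models of this paper and is used here only as a familiar example. The sketch assumes units with hbar = 2m = 1, a well of depth `V0` and width `a`, and takes the effective parameter to be q = a * sqrt(V0); that identification and the function name `reflection_square_well` are assumptions made for illustration. It uses the standard transmission formula T = 1 / (1 + V0^2 sin^2(k a) / (4 E (E + V0))) with k = sqrt(E + V0), and R = 1 - T.

```python
import math


def reflection_square_well(E, V0, a):
    """Reflection probability R(E) for a rectangular well of depth V0 > 0
    and width a at energy E > 0, in units where hbar = 2m = 1."""
    k_inside = math.sqrt(E + V0)           # wave number inside the well
    s = math.sin(k_inside * a)
    inv_T = 1.0 + (V0 ** 2 * s ** 2) / (4.0 * E * (E + V0))
    return 1.0 - 1.0 / inv_T


if __name__ == "__main__":
    E = 1e-8  # energy close to zero
    # Generic q = 1: reflection is close to 1 at low energy.
    print(reflection_square_well(E, V0=1.0, a=1.0))
    # Special q = pi: reflection is close to 0 at low energy.
    print(reflection_square_well(E, V0=1.0, a=math.pi))
```

In this toy case the special discrete values are q = n * pi, where sin(q) vanishes and the usual R(0) = 1 behaviour is lost.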

preprint | 2020 | arXiv

On exact solutions, conservation laws and invariant analysis of the generalized Rosenau-Hyman equation

In this paper, the nonlinear Rosenau-Hyman equation with time-dependent variable coefficients is considered in order to investigate its invariant properties, exact solutions and conservation laws. Using the classical Lie method, we derive the symmetries admitted by the considered equation. Symmetry reductions are performed for each component of the optimal set. The nonclassical approach is also applied to the considered equation to find some additional symmetries, and the corresponding symmetry reductions are performed. Three kinds of exact solutions of the considered equation are then presented graphically for different parameters. In addition, local conservation laws are constructed for the considered equation by the multiplier approach.

preprint | 2020 | arXiv

PT-symmetric potentials with imaginary asymptotic saturation

We point out that PT-symmetric potentials $V_{PT}(x)$ having imaginary asymptotic saturation, $V_{PT}(x=\pm \infty) =\pm i V_1, V_1 \in \Re$, are devoid of scattering states and spectral singularity. We show the existence of a real (positive and negative) discrete spectrum both with and without complex conjugate pair(s) of eigenvalues (CCPEs). If the states are arranged in ascending order of the real part of the discrete eigenvalues, the initial states have few nodes but the later ones oscillate fast. Both the real and imaginary parts of $ψ(x)$ vanish asymptotically; $|ψ(x)|$ is asymmetric for the CCPEs and symmetric about the origin for real energies. For CCPEs $E_{\pm}$, the eigenstates $ψ_{\pm}$ follow an interesting property: $|ψ_+(x)|= N |ψ_-(-x)|, N \in \Re^+$. We remark that the fast-oscillating real discrete energy states discussed are likely to be confused with reflectionless states, the one-dimensional version of von Neumann states of Hermitian potentials, and the spectral singularity state of complex PT-symmetric potentials.