Source author record

Chaitra Hegde

Chaitra Hegde appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Artificial Intelligence Computation and Language cond-mat.stat-mech

Catalog footprint

What is connected

3works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Orthrus: Memory-Efficient Parallel Token Generation via Dual-View Diffusion

We introduce Orthrus, a simple and efficient dual-architecture framework that unifies the exact generation fidelity of autoregressive Large Language Models (LLMs) with the high-speed parallel token generation of diffusion models. The sequential nature of standard autoregressive decoding represents a fundamental bottleneck for high-throughput inference. While diffusion language models attempt to break this barrier via parallel generation, they suffer from significant performance degradation, high training costs, and a lack of rigorous convergence guarantees. Orthrus resolves this dichotomy natively. Designed to seamlessly integrate into existing Transformers, the framework augments a frozen LLM with a lightweight, trainable module to create a parallel diffusion view alongside the standard autoregressive view. In this unified system, both views attend to the exact same high-fidelity Key-Value (KV) cache; the autoregressive head executes context pre-filling to construct accurate KV representations, while the diffusion head executes parallel generation. By employing an exact consensus mechanism between the two views, Orthrus guarantees lossless inference, delivering up to a 7.8x speedup with only an O(1) memory cache overhead and minimal parameter additions.

preprint2020arXiv

Unsupervised Paraphrase Generation using Pre-trained Language Models

Large scale Pre-trained Language Models have proven to be very powerful approach in various Natural language tasks. OpenAI's GPT-2 \cite{radford2019language} is notable for its capability to generate fluent, well formulated, grammatically consistent text and for phrase completions. In this paper we leverage this generation capability of GPT-2 to generate paraphrases without any supervision from labelled data. We examine how the results compare with other supervised and unsupervised approaches and the effect of using paraphrases for data augmentation on downstream tasks such as classification. Our experiments show that paraphrases generated with our model are of good quality, are diverse and improves the downstream task performance when used for data augmentation.

preprint2014arXiv

Universal Large Deviations for the Tagged Particle in Single File Motion

We consider a gas of point particles moving in a one-dimensional channel with a hard-core inter-particle interaction that prevents particle crossings --- this is called single-file motion. Starting from equilibrium initial conditions we observe the motion of a tagged particle. It is well known that if the individual particle dynamics is diffusive, then the tagged particle motion is sub-diffusive, while for ballistic particle dynamics, the tagged particle motion is diffusive. Here we compute exactly the large deviation function for the tagged particle displacement and show that this is universal, independent of the individual dynamics.