Source author record

Andrew Thangaraj

Andrew Thangaraj appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Information Theory, math.IT, Data Structures and Algorithms, math.ST, Networking and Internet Architecture, Statistics Theory, Applications, Cryptography and Security, Discrete Mathematics, math.PR

Catalog footprint

What is connected

22 works

10 topics

4 close collaborators

preprint · 2026 · arXiv

Distribution Estimation with Side Information

We consider the classical problem of discrete distribution estimation using i.i.d. samples in a novel scenario where additional side information is available on the distribution. In large alphabet datasets such as text corpora, such side information arises naturally through word semantics/similarities that can be inferred, for instance, from the closeness of vector word embeddings. We consider two specific models for side information: a local model where the unknown distribution is in the neighborhood of a known distribution, and a partial ordering model where the alphabet is partitioned into known higher and lower probability sets. In both models, we theoretically characterize the improvement in a suitable squared-error risk because of the available side information. Simulations over natural language and synthetic data illustrate these gains.

preprint · 2022 · arXiv

Lifting Constructions of PDAs for Coded Caching with Linear Subpacketization

Coded caching is a technique where multicasting and coding opportunities are utilized to achieve a better rate-memory tradeoff in cached networks. A crucial parameter in coded caching is subpacketization, which is the number of parts a file is to be split into for coding purposes. The original Maddah-Ali-Niesen scheme has order-optimal rate at a subpacketization growing exponentially with the number of users. In contrast, placement and delivery schemes in coded caching, designed using placement delivery arrays (PDAs), can have linear subpacketization with a penalty in rate. In this work, we propose several constructions of efficient PDAs through lifting, where a base PDA is expanded by replacing each entry by another PDA. By proposing and using the notion of Blackburn-compatibility of PDAs, we provide multiple lifting constructions with increasing coding gains. We compare the constructed coded caching schemes with other existing schemes for a moderately high number of users and show that the proposed constructions are versatile and achieve a good rate-memory tradeoff at low subpacketizations.
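To make the subpacketization issue concrete, the following is a minimal sketch of the parameters of the original Maddah-Ali-Niesen scheme, not of the lifted PDA constructions proposed in the paper. It assumes the standard setting in which t = KM/N is an integer (K users, N files, cache size M); the function name `mn_scheme` is hypothetical.

```python
from fractions import Fraction
from math import comb


def mn_scheme(num_users, t):
    """Subpacketization and delivery rate of the Maddah-Ali-Niesen scheme.

    num_users is K and t = K*M/N is assumed to be an integer in [0, K].
    Each file is split into C(K, t) parts and the delivery rate is
    (K - t) / (t + 1), measured in units of files.
    """
    if not 0 <= t <= num_users:
        raise ValueError("t must satisfy 0 <= t <= num_users")
    subpacketization = comb(num_users, t)
    rate = Fraction(num_users - t, t + 1)
    return subpacketization, rate


if __name__ == "__main__":
    # With M/N fixed at 1/2, the number of parts per file grows
    # exponentially in the number of users K.
    for k in (4, 10, 20, 40):
        parts, rate = mn_scheme(k, k // 2)
        print(k, parts, rate)
```

Running the loop shows why schemes with linear subpacketization are attractive for a moderately high number of users, even at some penalty in rate.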

preprint · 2022 · arXiv

Missing Mass Estimation from Sticky Channels

Distribution estimation under error-prone or non-ideal sampling modelled as "sticky" channels has been studied recently, motivated by applications such as DNA computing. Missing mass, the sum of probabilities of missing letters, is an important quantity that plays a crucial role in distribution estimation, particularly in the large alphabet regime. In this work, we consider the problem of estimation of missing mass, which has been well-studied under independent and identically distributed (i.i.d.) sampling, in the case when sampling is "sticky". Precisely, we consider the scenario where each sample from an unknown distribution gets repeated a geometrically-distributed number of times. We characterise the minimax rate of Mean Squared Error (MSE) of estimating missing mass from such sticky sampling channels. An upper bound on the minimax rate is obtained by bounding the risk of a modified Good-Turing estimator. We derive a matching lower bound on the minimax rate by extending the Le Cam method.
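As a point of reference, here is a minimal sketch of the classical Good-Turing missing mass estimator for ordinary i.i.d. sampling. It is not the modified estimator analysed in the paper, which accounts for sticky repetitions; the function names and the toy distribution are illustrative assumptions.

```python
from collections import Counter


def good_turing_missing_mass(samples):
    """Classical Good-Turing estimate of the missing mass.

    The estimate is the fraction of samples whose symbol appears
    exactly once (the number of singletons divided by n).
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n


def true_missing_mass(samples, pmf):
    """Total probability of the symbols that never appear in samples."""
    seen = set(samples)
    return sum(p for symbol, p in pmf.items() if symbol not in seen)


if __name__ == "__main__":
    pmf = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}  # toy example
    samples = list("aabcab")
    print(good_turing_missing_mass(samples), true_missing_mass(samples, pmf))
```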

preprint · 2020 · arXiv

Convergence of Chao Unseen Species Estimator

Support size estimation and the related problem of unseen species estimation have wide applications in ecology and database analysis. Perhaps the most used support size estimator is the Chao estimator. Despite its widespread use, little is known about its theoretical properties. We analyze the Chao estimator and show that its worst case mean squared error (MSE) is smaller than the MSE of the plug-in estimator by a factor of $\mathcal{O} ((k/n)^4)$, where $k$ is the maximum support size and $n$ is the number of samples. Our main technical contribution is a new method to analyze rational estimators for discrete distribution properties, which may be of independent interest.
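For readers unfamiliar with the estimator, the sketch below shows the commonly used Chao1 form alongside the plug-in estimator. It assumes the textbook definition (observed species plus f1^2 / (2 f2), where f1 and f2 count symbols seen once and twice); the fallback used when f2 = 0 is the usual bias-corrected convention and is an assumption here, not something taken from the paper.

```python
from collections import Counter


def plug_in_support_size(samples):
    """Plug-in estimate: the number of distinct symbols observed."""
    return len(set(samples))


def chao1(samples):
    """Chao1 support size estimate.

    f1 and f2 are the numbers of symbols seen exactly once and exactly
    twice. When f2 == 0 the ratio f1^2 / (2 f2) is undefined, so the
    bias-corrected term f1 (f1 - 1) / 2 is used instead (an assumption).
    """
    counts = Counter(samples)
    s_obs = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    f2 = sum(1 for c in counts.values() if c == 2)
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    return s_obs + f1 * (f1 - 1) / 2


if __name__ == "__main__":
    samples = list("aabbcd")
    print(plug_in_support_size(samples), chao1(samples))
```

The estimator is a ratio of counts, which is what makes it a "rational estimator" in the sense of the abstract.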

preprint · 2016 · arXiv

Dual Capacity Upper Bounds for Noisy Runlength Constrained Channels

Binary-input memoryless channels with a runlength constrained input are considered. Upper bounds to the capacity of such noisy runlength constrained channels are derived using the dual capacity method with Markov test distributions satisfying the Karush-Kuhn-Tucker (KKT) conditions for the capacity-achieving output distribution. Simplified algebraic characterizations of the bounds are presented for the binary erasure channel (BEC) and the binary symmetric channel (BSC). These upper bounds are very close to achievable rates, and improve upon previously known feedback-based bounds for a large range of channel parameters. For the binary-input Additive White Gaussian Noise (AWGN) channel, the upper bound is simplified to a small-scale numerical optimization problem. These results provide some of the simplest upper bounds for an open capacity problem that has theoretical and practical relevance.

preprint · 2015 · arXiv

Approximation of Capacity for ISI Channels with One-bit Output Quantization

Motivated by recent high bandwidth communication systems, Inter-Symbol Interference (ISI) channels with 1-bit quantized output are considered under an average-power-constrained continuous input. While the exact capacity is difficult to characterize, an approximation that matches with the exact channel output up to a probability of error is provided. The approximation does not have additive noise, but constrains the channel output (without noise) to be above a threshold in absolute value. The capacity under the approximation is computed using methods involving standard Gibbs distributions. Markovian achievable schemes approaching the approximate capacity are provided. The methods used over the approximate ISI channel result in ideas for practical coding schemes for ISI channels with 1-bit output quantization.

preprint · 2015 · arXiv

Capacity Bounds for Discrete-Time, Amplitude-Constrained, Additive White Gaussian Noise Channels

The capacity-achieving input distribution of the discrete-time, additive white Gaussian noise (AWGN) channel with an amplitude constraint is discrete and seems difficult to characterize explicitly. A dual capacity expression is used to derive analytic capacity upper bounds for scalar and vector AWGN channels. The scalar bound improves on McKellips' bound and is within 0.1 bits of capacity for all signal-to-noise ratios (SNRs). The two-dimensional bound is within 0.15 bits of capacity provably up to 4.5 dB, and numerical evidence suggests a similar gap for all SNRs.

preprint · 2015 · arXiv

Construction of Near-Capacity Protograph LDPC Code Sequences with Block-Error Thresholds

Density evolution for protograph Low-Density Parity-Check (LDPC) codes is considered, and it is shown that the message-error rate falls double-exponentially with iterations whenever the degree-2 subgraph of the protograph is cycle-free and noise level is below threshold. Conditions for stability of protograph density evolution are established and related to the structure of the protograph. Using large-girth graphs, sequences of protograph LDPC codes with block-error threshold equal to bit-error threshold and block-error rate falling near-exponentially with blocklength are constructed deterministically. Small-sized protographs are optimized to obtain thresholds near capacity for binary erasure and binary-input Gaussian channels.

preprint · 2014 · arXiv

Secure Compute-and-Forward in a Bidirectional Relay

We consider the basic bidirectional relaying problem, in which two users in a wireless network wish to exchange messages through an intermediate relay node. In the compute-and-forward strategy, the relay computes a function of the two messages using the naturally-occurring sum of symbols simultaneously transmitted by user nodes in a Gaussian multiple access (MAC) channel, and the computed function value is forwarded to the user nodes in an ensuing broadcast phase. In this paper, we study the problem under an additional security constraint, which requires that each user's message be kept secure from the relay. We consider two types of security constraints: perfect secrecy, in which the MAC channel output seen by the relay is independent of each user's message; and strong secrecy, which is a form of asymptotic independence. We propose a coding scheme based on nested lattices, the main feature of which is that given a pair of nested lattices that satisfy certain "goodness" properties, we can explicitly specify probability distributions for randomization at the encoders to achieve the desired security criteria. In particular, our coding scheme guarantees perfect or strong secrecy even in the absence of channel noise. The noise in the channel only affects reliability of computation at the relay, and for Gaussian noise, we derive achievable rates for reliable and secure computation. We also present an application of our methods to the multi-hop line network in which a source needs to transmit messages to a destination through a series of intermediate relays.

preprint · 2014 · arXiv

Sub-Modularity of Waterfilling with Applications to Online Basestation Allocation

We show that the popular water-filling algorithm for maximizing the mutual information in parallel Gaussian channels is sub-modular. The sub-modularity of the water-filling algorithm is then used to derive online basestation allocation algorithms, where mobile users are assigned to one of many possible basestations immediately and irrevocably upon arrival without knowing the future user information. The goal of the allocation is to maximize the sum-rate of the system under power allocation at each basestation. We present online algorithms with a competitive ratio of at most 2 when compared to offline algorithms that have knowledge of all future user arrivals.
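The water-filling allocation itself can be sketched as follows. This is a minimal illustration of the standard algorithm for parallel Gaussian channels with noise levels N_i and a total power budget P, where channel i receives max(0, mu - N_i); it does not implement the online basestation allocation from the paper, and the function names are hypothetical.

```python
import math


def water_filling(noise, total_power):
    """Return (water level, per-channel powers) for parallel Gaussian channels.

    noise holds positive noise levels N_i. The water level mu is chosen
    so that sum_i max(0, mu - N_i) equals total_power.
    """
    if not noise or any(n <= 0 for n in noise) or total_power < 0:
        raise ValueError("need positive noise levels and non-negative power")
    level = 0.0
    running = 0.0
    # Try the k quietest channels as the active set, for growing k.
    for k, n in enumerate(sorted(noise), start=1):
        running += n
        candidate = (total_power + running) / k
        if candidate > n:
            level = candidate
        else:
            break
    powers = [max(0.0, level - n) for n in noise]
    return level, powers


def sum_rate(noise, powers):
    """Mutual information in bits: sum of 0.5 * log2(1 + P_i / N_i)."""
    return sum(0.5 * math.log2(1.0 + p / n) for p, n in zip(powers, noise))


if __name__ == "__main__":
    level, powers = water_filling([1.0, 2.0, 5.0], 3.0)
    print(level, powers, sum_rate([1.0, 2.0, 5.0], powers))
```

In the example, the water level is 3, so the two quieter channels receive powers 2 and 1 and the noisiest channel is left unused.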

preprint · 2013 · arXiv

Deterministic Constructions for Large Girth Protograph LDPC Codes

The bit-error threshold of the standard ensemble of Low Density Parity Check (LDPC) codes is known to be close to capacity, if there is a non-zero fraction of degree-two bit nodes. However, the degree-two bit nodes preclude the possibility of a block-error threshold. Interestingly, LDPC codes constructed using protographs allow the possibility of having both degree-two bit nodes and a block-error threshold. In this paper, we analyze density evolution for protograph LDPC codes over the binary erasure channel and show that their bit-error probability decreases double exponentially with the number of iterations when the erasure probability is below the bit-error threshold and long chains of degree-two variable nodes are avoided in the protograph. We present deterministic constructions of such protograph LDPC codes with girth logarithmic in blocklength, resulting in an exponential fall in bit-error probability below the threshold. We provide optimized protographs, whose block-error thresholds are better than that of the standard ensemble with minimum bit-node degree three. These protograph LDPC codes are theoretically of great interest, and have applications, for instance, in coding with strong secrecy over wiretap channels.

preprint · 2013 · arXiv

Online Algorithms for Basestation Allocation

Design of online algorithms for assigning mobile users to basestations is considered with the objective of maximizing the sum-rate, when all users associated with any one basestation equally share that basestation's resources. Each user on its arrival reveals the rates it can obtain if connected to each of the basestations, and the problem is to assign each user to any one basestation irrevocably so that the sum-rate is maximized at the end of all user arrivals, without knowing the future user arrival or rate information or its statistics at each user arrival. Online algorithms with constant factor loss in comparison to offline algorithms (that know both the user arrival and user rates profile in advance) are derived. The proposed online algorithms are motivated by the famous online k-secretary problem and online maximum weight matching problem.

preprint · 2013 · arXiv

The Gaussian Two-way Diamond Channel

We consider two-way relaying in a Gaussian diamond channel, where two terminal nodes wish to exchange information using two relays. A simple baseline protocol is obtained by time-sharing between two one-way protocols. To improve upon the baseline performance, we propose two compute-and-forward (CF) protocols: Compute-and-forward Compound multiple access channel (CF-CMAC) and Compute-and-forward-Broadcast (CF-BC). These protocols mix the two flows through the two relays and achieve rates better than the simple time-sharing protocol. We derive an outer bound to the capacity region that is satisfied by any relaying protocol, and observe that the proposed protocols provide rates close to the outer bound in certain channel conditions. Both the CF-CMAC and CF-BC protocols use nested lattice codes in the compute phases. In the CF-CMAC protocol, both relays simultaneously forward to the destinations over a Compound Multiple Access Channel (CMAC). In the simpler CF-BC protocol's forward phase, one relay is selected at a time for Broadcast Channel (BC) transmission depending on the rate-pair to be achieved. We also consider the diamond channel with direct source-destination link and the diamond channel with interfering relays. Outer bounds and achievable rate regions are compared for these two channels as well. Mixing of flows using the CF-CMAC protocol is shown to be good for symmetric two-way rates.

preprint · 2012 · arXiv

Outer Bounds for the Capacity Region of a Gaussian Two-way Relay Channel

We consider a three-node half-duplex Gaussian relay network where two nodes (say $a$, $b$) want to communicate with each other and the third node acts as a relay for this two-way communication. Outer bounds and achievable rate regions for the possible rate pairs $(R_a, R_b)$ for two-way communication are investigated. The modes (transmit or receive) of the half-duplex nodes together specify the state of the network. A relaying protocol uses a specific sequence of states and a coding scheme for each state. In this paper, we first obtain an outer bound for the rate region of all achievable $(R_a,R_b)$ based on the half-duplex cut-set bound. This outer bound can be numerically computed by solving a linear program. It is proved that at any point on the boundary of the outer bound only four of the six states of the network are used. We then compare it with achievable rate regions of various known protocols. We consider two kinds of protocols: (1) protocols in which all messages transmitted in a state are decoded with the received signal in the same state, and (2) protocols where information received in one state can also be stored and used as side information to decode messages in future states. Various conclusions are drawn on the importance of using all states, use of side information, and the choice of processing at the relay. Then, two analytical outer bounds (as opposed to an optimization problem formulation) are derived. Using an analytical outer bound, we obtain the symmetric capacity within 0.5 bits for some channel conditions where the direct link between nodes $a$ and $b$ is weak.

preprint · 2012 · arXiv

Secure Computation in a Bidirectional Relay

Bidirectional relaying, where a relay helps two user nodes to exchange equal length binary messages, has been an active area of recent research. A popular strategy involves a modified Gaussian MAC, where the relay decodes the XOR of the two messages using the naturally-occurring sum of symbols simultaneously transmitted by user nodes. In this work, we consider the Gaussian MAC in bidirectional relaying with an additional secrecy constraint for protection against an honest-but-curious relay. The constraint is that, while the relay should decode the XOR, it should be fully ignorant of the individual messages of the users. We exploit the symbol addition that occurs in a Gaussian MAC to design explicit strategies that achieve perfect independence between the received symbols and individual transmitted messages. Our results actually hold for a more general scenario where the messages at the two user nodes come from a finite Abelian group, and the relay must decode the sum within the group of the two messages. We provide a lattice coding strategy and study optimal rate versus average power trade-offs for asymptotically large dimensions.

preprint · 2011 · arXiv

Strong Secrecy on the Binary Erasure Wiretap Channel Using Large-Girth LDPC Codes

For an arbitrary degree distribution pair (DDP), we construct a sequence of low-density parity-check (LDPC) code ensembles with girth growing logarithmically in block-length using Ramanujan graphs. When the DDP has minimum left degree at least three, we show using density evolution analysis that the expected bit-error probability of these ensembles, when passed through a binary erasure channel with erasure probability $ε$, decays as $\mathcal{O}(\exp(-c_1 n^{c_2}))$ with the block-length $n$ for positive constants $c_1$ and $c_2$, as long as $ε$ is less than the erasure threshold $ε_\mathrm{th}$ of the DDP. This guarantees that the coset coding scheme using the dual sequence provides strong secrecy over the binary erasure wiretap channel for erasure probabilities greater than $1 - ε_\mathrm{th}$.
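The erasure threshold mentioned above can be illustrated with the standard density evolution recursion for the binary erasure channel. The sketch below assumes a regular (d_v, d_c) ensemble, for which the recursion is x <- ε (1 - (1 - x)^(d_c - 1))^(d_v - 1), and locates the threshold by bisection. It does not reproduce the Ramanujan-graph construction or the finite-length analysis of the paper, and the function names are hypothetical.

```python
def bec_density_evolution(eps, dv, dc, max_iters=5000, tol=1e-12):
    """Return True if the erasure probability is driven below tol.

    x tracks the erasure probability of variable-to-check messages for
    a regular (dv, dc) LDPC ensemble on a BEC with erasure probability eps.
    """
    x = eps
    for _ in range(max_iters):
        x = eps * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)
        if x < tol:
            return True
    return False


def bec_threshold(dv, dc, steps=30):
    """Bisection estimate of the largest eps for which decoding succeeds."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if bec_density_evolution(mid, dv, dc):
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # The (3, 6)-regular ensemble has left degree three and rate 1/2.
    print(bec_threshold(3, 6))
```

For the (3, 6)-regular ensemble the numerical threshold is close to the well-known value of about 0.429, compared with the capacity limit of 0.5 for rate-1/2 codes.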

preprint · 2011 · arXiv

The Treewidth of MDS and Reed-Muller Codes

The constraint complexity of a graphical realization of a linear code is the maximum dimension of the local constraint codes in the realization. The treewidth of a linear code is the least constraint complexity of any of its cycle-free graphical realizations. This notion provides a useful parametrization of the maximum-likelihood decoding complexity for linear codes. In this paper, we prove the surprising fact that for maximum distance separable codes and Reed-Muller codes, treewidth equals trelliswidth, which, for a code, is defined to be the least constraint complexity (or branch complexity) of any of its trellis realizations. From this, we obtain exact expressions for the treewidth of these codes, which constitute the only known explicit expressions for the treewidth of algebraic codes.

preprint · 2010 · arXiv

Dirty Paper Coding using Sign-bit Shaping and LDPC Codes

Dirty paper coding (DPC) refers to methods for pre-subtraction of known interference at the transmitter of a multiuser communication system. There are numerous applications for DPC, including coding for broadcast channels. Recently, lattice-based coding techniques have provided several designs for DPC. In lattice-based DPC, there are two codes - a convolutional code that defines a lattice used for shaping and an error correction code used for channel coding. Several specific designs have been reported in the recent literature using convolutional and graph-based codes for capacity-approaching shaping and coding gains. In most of the reported designs, either the encoder works on a joint trellis of shaping and channel codes or the decoder requires iterations between the shaping and channel decoders. This results in high complexity of implementation. In this work, we present a lattice-based DPC scheme that provides good shaping and coding gains with moderate complexity at both the encoder and the decoder. We use a convolutional code for sign-bit shaping, and a low-density parity check (LDPC) code for channel coding. The crucial idea is the introduction of a one-codeword delay and careful parsing of the bits at the transmitter, which enable an LDPC decoder to be run first at the receiver. This provides gains without the need for iterations between the shaping and channel decoders. Simulation results confirm that at high rates the proposed DPC method performs close to capacity with moderate complexity. As an application of the proposed DPC method, we show a design for superposition coding that provides rates better than time-sharing over a Gaussian broadcast channel.

preprint · 2010 · arXiv

Multistage Relaying Using Interference Networks

Wireless networks with multiple nodes that relay information from a source to a destination are expected to be deployed in many applications. Therefore, understanding their design and performance under practical constraints is important. In this work, we propose and study three multihopping decode and forward (MDF) protocols for multistage half-duplex relay networks with no direct link between the source and destination nodes. In all three protocols, we assume no cooperation across relay nodes for encoding and decoding. Numerical evaluation in illustrative example networks and comparison with cheap relay cut-set bounds for half-duplex networks show that the proposed MDF protocols approach capacity in some ranges of channel gains. The main idea in the design of the protocols is the use of coding in interference networks that are created in different states or modes of a half-duplex network. Our results suggest that multistage half-duplex relaying with practical constraints on cooperation is comparable to point-to-point links and full-duplex relay networks, if there are multiple non-overlapping paths from source to destination and if suitable coding is employed in interference network states.

preprint · 2010 · arXiv

NLHB : A Non-Linear Hopper Blum Protocol

In this paper, we propose a light-weight provably-secure authentication protocol called the NLHB protocol, which is a variant of the HB protocol. The HB protocol uses the complexity of decoding linear codes for security against passive attacks. In contrast, security for the NLHB protocol is proved by reducing passive attacks to the problem of decoding a class of non-linear codes that are provably hard. We demonstrate that the existing passive attacks on the HB protocol family, which have contributed to considerable reduction in its effective key-size, are ineffective against the NLHB protocol. From the evidence, we conclude that smaller key sizes are sufficient for the NLHB protocol to achieve the same level of passive attack security as the HB protocol. Further, for this choice of parameters, we provide an implementation instance for the NLHB protocol for which the Prover/Verifier complexity is lower than that of the HB protocol, enabling authentication on very low-cost devices like RFID tags. Finally, in the spirit of the HB$^{+}$ protocol, we extend the NLHB protocol to the NLHB$^{+}$ protocol and prove security against the class of active attacks defined in the DET Model.
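For context, the basic HB protocol that NLHB builds on can be sketched as below. This is a toy simulation of the original linear HB rounds only: the verifier sends a random binary challenge, and the prover answers with its inner product with the shared secret, flipped with probability eta. The non-linear function used by NLHB is not shown, and the parameter values and function names are illustrative assumptions, not recommendations.

```python
import random


def inner_product_mod2(a, x):
    """Binary inner product of two equal-length bit lists."""
    return sum(ai & xi for ai, xi in zip(a, x)) % 2


def hb_authenticate(verifier_secret, prover_secret, rounds, eta, threshold, rng):
    """Simulate HB rounds and return True if the verifier accepts.

    Each round the prover replies <challenge, secret> XOR noise, where
    noise is 1 with probability eta. The verifier accepts when at most
    threshold of the replies disagree with its own computation.
    """
    errors = 0
    k = len(verifier_secret)
    for _ in range(rounds):
        challenge = [rng.randrange(2) for _ in range(k)]
        noise = 1 if rng.random() < eta else 0
        response = inner_product_mod2(challenge, prover_secret) ^ noise
        if response != inner_product_mod2(challenge, verifier_secret):
            errors += 1
    return errors <= threshold


if __name__ == "__main__":
    rng = random.Random(2010)
    secret = [rng.randrange(2) for _ in range(32)]
    # Honest prover with noise rate 1/8; the threshold sits between the
    # expected error counts of an honest prover and a random guesser.
    print(hb_authenticate(secret, secret, 400, 0.125, 100, rng))
```

A passive attacker who observes the challenge and response pairs faces a noisy linear decoding problem, which is the hardness assumption the abstract refers to.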

preprint · 2010 · arXiv

Path Gain Algebraic Formulation for the Scalar Linear Network Coding Problem

In the algebraic view, the solution to a network coding problem is seen as a variety specified by a system of polynomial equations typically derived by using edge-to-edge gains as variables. The output from each sink is equated to its demand to obtain polynomial equations. In this work, we propose a method to derive the polynomial equations using source-to-sink path gains as the variables. In the path gain formulation, we show that linear and quadratic equations suffice; therefore, network coding becomes equivalent to a system of polynomial equations of maximum degree 2. We present algorithms for generating the equations in the path gains and for converting path gain solutions to edge-to-edge gain solutions. Because of the low degree, simplification is readily possible for the system of equations obtained using path gains. Using small-sized network coding problems, we show that the path gain approach results in simpler equations and determines solvability of the problem in certain cases. On a larger network (with 87 nodes and 161 edges), we show how the path gain approach continues to provide deterministic solutions to some network coding problems.

preprint · 2010 · arXiv

Strong Secrecy for Erasure Wiretap Channels

We show that duals of certain low-density parity-check (LDPC) codes, when used in a standard coset coding scheme, provide strong secrecy over the binary erasure wiretap channel (BEWC). This result hinges on a stopping set analysis of ensembles of LDPC codes with block length $n$ and girth $\geq 2k$, for some $k \geq 2$. We show that if the minimum left degree of the ensemble is $l_\mathrm{min}$, the expected probability of block error is $\mathcal{O}\left(\frac{1}{n^{\lceil l_\mathrm{min} k /2 \rceil - k}}\right)$ when the erasure probability $ε < ε_\mathrm{ef}$, where $ε_\mathrm{ef}$ depends on the degree distribution of the ensemble. As long as $l_\mathrm{min} > 2$ and $k > 2$, the dual of this LDPC code provides strong secrecy over a BEWC of erasure probability greater than $1 - ε_\mathrm{ef}$.
