Source author record

Andreas Lenz

Andreas Lenz appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Information Theory, math.IT, Computational Complexity, math.CO

Catalog footprint

What is connected

5 works

4 topics

4 close collaborators

Preprint, 2022, arXiv

Concatenated Codes for Multiple Reads of a DNA Sequence

Decoding sequences that stem from multiple transmissions of a codeword over an insertion, deletion, and substitution channel is a critical component of efficient deoxyribonucleic acid (DNA) data storage systems. In this paper, we consider a concatenated coding scheme with an outer nonbinary low-density parity-check code or a polar code and either an inner convolutional code or a time-varying block code. We propose two novel decoding algorithms for inference from multiple received sequences, both combining the inner code and channel into a joint hidden Markov model to infer symbolwise a posteriori probabilities (APPs). The first decoder computes the exact APPs by jointly decoding the received sequences, whereas the second decoder approximates the APPs by combining the results of separately decoded received sequences and has a complexity that is linear in the number of sequences. Using the proposed algorithms, we evaluate the performance of decoding multiple received sequences by means of achievable information rates and Monte-Carlo simulations. We show significant performance gains compared to a single received sequence. In addition, we succeed in improving the performance of the aforementioned coding scheme by optimizing both the inner and outer codes.
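The idea of combining separately decoded reads can be illustrated with a minimal sketch. It assumes a uniform prior over symbols and treats the reads as conditionally independent, so the per-read APPs are multiplied and renormalized at each position. The function name `combine_apps` and the input layout are hypothetical, and this is not claimed to be the combining rule used in the paper.

```python
from math import prod


def combine_apps(per_read_apps):
    """Combine symbolwise APPs from separately decoded reads.

    per_read_apps[k][t][s] is the APP that symbol t equals s according to
    read k. Assumption (not from the paper): uniform prior and reads that
    are conditionally independent, so posteriors multiply per symbol.
    """
    num_positions = len(per_read_apps[0])
    combined = []
    for t in range(num_positions):
        q = len(per_read_apps[0][t])
        # One multiplication per read: the cost is linear in the number of reads.
        scores = [prod(read[t][s] for read in per_read_apps) for s in range(q)]
        total = sum(scores)
        if total == 0:
            # Reads fully contradict each other; fall back to a uniform guess.
            combined.append([1.0 / q] * q)
        else:
            combined.append([x / total for x in scores])
    return combined


# Two reads that both weakly favor symbol 0 at the only position.
apps = combine_apps([[[0.7, 0.1, 0.1, 0.1]], [[0.7, 0.1, 0.1, 0.1]]])
```

Under these assumptions, two reads that agree reinforce each other, so the combined APP of the favored symbol is larger than either individual APP.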

Preprint, 2020, arXiv

Achievable Rates of Concatenated Codes in DNA Storage under Substitution Errors

In this paper, we study achievable rates of concatenated coding schemes over a deoxyribonucleic acid (DNA) storage channel. Our channel model incorporates the main features of DNA-based data storage. First, information is stored on many short DNA strands. Second, the strands are stored in an unordered fashion inside the storage medium and each strand is replicated many times. Third, the data is accessed in an uncontrollable manner, i.e., random strands are drawn from the medium and received, possibly with errors. As one of our results, we show that there is a significant gap between the channel capacity and the achievable rate of a standard concatenated code in which one strand corresponds to an inner block. This is surprising, since for other channels, such as $q$-ary symmetric channels, concatenated codes are known to achieve the capacity. We further propose a modified concatenated coding scheme that combines several strands into one inner block, which narrows the gap and achieves rates close to the capacity.
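The three channel features described above can be mimicked with a small simulation sketch. It assumes reads are drawn uniformly with replacement and that each base is substituted independently with probability `sub_prob`. Both assumptions, and the function names, are illustrative and are not taken from the paper.

```python
import random

ALPHABET = "ACGT"


def expected_unseen_fraction(num_strands, num_reads):
    """Expected fraction of strands never drawn under uniform sampling
    with replacement (an assumed access model)."""
    return (1.0 - 1.0 / num_strands) ** num_reads


def draw_reads(strands, num_reads, sub_prob, rng):
    """Draw random strands and apply independent substitution errors."""
    reads = []
    for _ in range(num_reads):
        strand = rng.choice(strands)  # the order of the stored strands is lost
        noisy = "".join(
            rng.choice([b for b in ALPHABET if b != base])
            if rng.random() < sub_prob
            else base
            for base in strand
        )
        reads.append(noisy)
    return reads


stored = ["ACGTAC", "TTGACA", "GGCATT"]
reads = draw_reads(stored, num_reads=10, sub_prob=0.05, rng=random.Random(0))
```

The closed-form helper shows why replication and oversampling matter in this model: with as many reads as strands, roughly a fraction $e^{-1}$ of the strands is never observed.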

Preprint, 2020, arXiv

Coding over Sets for DNA Storage

In this paper we study error-correcting codes for the storage of data in synthetic deoxyribonucleic acid (DNA). We investigate a storage model where a data set is represented by an unordered set of $M$ sequences, each of length $L$. Errors within that model are a loss of whole sequences and point errors inside the sequences, such as insertions, deletions and substitutions. We derive Gilbert-Varshamov lower bounds and sphere packing upper bounds on achievable cardinalities of error-correcting codes within this storage model. We further propose explicit code constructions that can correct errors in such a storage system and can be encoded and decoded efficiently. Comparing the sizes of these codes to the upper bounds, we show that many of the constructions are close to optimal.
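A simple counting argument shows what the unordered-set model costs even without errors. The sketch below assumes the $M$ sequences are distinct and drawn from an alphabet of size `q` (4 for DNA), so there are $\binom{q^L}{M}$ possible data sets. The helper names are hypothetical.

```python
from math import comb, log2


def bits_ordered(M, L, q=4):
    """Bits that M ordered sequences of length L could carry."""
    return M * L * log2(q)


def max_bits_unordered(M, L, q=4):
    """Bits an unordered set of M distinct length-L sequences can carry,
    i.e. log2 of the number of such sets (error-free case)."""
    return log2(comb(q ** L, M))


def ordering_loss_bits(M, L, q=4):
    """Bits lost because the order of the sequences is not preserved."""
    return bits_ordered(M, L, q) - max_bits_unordered(M, L, q)


loss = ordering_loss_bits(M=1000, L=20)
```

When $M$ is much smaller than $q^L$, the loss is close to $\log_2 M!$, which is the familiar motivation for spending part of each sequence on an index.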

Preprint, 2020, arXiv

Covering Codes using Insertions or Deletions

A covering code is a set of codewords with the property that the union of balls, suitably defined, around these codewords covers an entire space. Generally, the goal is to find the covering code with the minimum-size codebook. While most prior work on covering codes has focused on the Hamming metric, we consider the problem of designing covering codes defined in terms of either insertions or deletions. First, we provide new sphere-covering lower bounds on the minimum possible size of such codes. Then, we provide new existential upper bounds on the size of optimal covering codes for a single insertion or a single deletion that are tight up to a constant factor. Finally, we derive improved upper bounds for covering codes using $R\geq 2$ insertions or deletions. We prove that codes exist with density that is only a factor $O(R \log R)$ larger than the lower bounds for all fixed $R$. In particular, our upper bounds have an optimal dependence on the word length, and we achieve asymptotic density matching the best known bounds for Hamming distance covering codes.
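To make the covering property concrete, here is a brute-force sketch for a single deletion over the binary alphabet. It assumes codewords have length `n` and that the space to be covered is all words of length `n - 1`. That convention, the function names, and the greedy heuristic are illustrative choices rather than the constructions from the paper, and the greedy result is a valid cover but not necessarily a minimum one.

```python
from itertools import product


def deletion_ball(word):
    """All words obtained from `word` by deleting exactly one symbol."""
    return {word[:i] + word[i + 1:] for i in range(len(word))}


def greedy_deletion_cover(n):
    """Greedily pick length-n binary words whose single-deletion balls
    together cover every binary word of length n - 1."""
    targets = {"".join(bits) for bits in product("01", repeat=n - 1)}
    candidates = ["".join(bits) for bits in product("01", repeat=n)]
    code = []
    while targets:
        # Take the candidate that covers the most still-uncovered words.
        best = max(candidates, key=lambda w: len(deletion_ball(w) & targets))
        code.append(best)
        targets -= deletion_ball(best)
    return code


code = greedy_deletion_cover(4)
```

Since a single-deletion ball around a length-`n` word contains at most `n` words, a plain counting argument gives at least $2^{n-1}/n$ codewords under this convention, which can be compared with `len(code)` for small `n`.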

Preprint, 2020, arXiv

Optimal Codes Correcting a Burst of Deletions of Variable Length

In this paper, we present an efficiently encodable and decodable code construction that is capable of correcting a burst of deletions of length at most $k$. The redundancy of this code is $\log n + \frac{k(k+1)}{2}\log \log n + c_k$ for some constant $c_k$ that depends only on $k$, and is thus scaling-optimal. The code can be split into two main components. First, we impose a constraint that allows us to locate the burst of deletions up to an interval of size roughly $\log n$. Then, with the knowledge of the approximate location of the burst, we use several shifted Varshamov-Tenengolts codes to correct the burst of deletions, which requires only a small amount of redundancy since the location is already known up to an interval of small size. Finally, we show how to efficiently encode and decode the code.
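As background for the second component, the sketch below shows the classical (unshifted) binary Varshamov-Tenengolts code, which corrects a single deletion. It is not the shifted variant or the burst-correcting construction from the paper, and the function names are hypothetical.

```python
from itertools import product


def vt_syndrome(bits):
    """Checksum sum(i * x_i) mod (n + 1), with positions counted from 1."""
    return sum(i * b for i, b in enumerate(bits, start=1)) % (len(bits) + 1)


def vt_codewords(n, a=0):
    """All length-n binary words with VT syndrome a (brute force)."""
    return [list(x) for x in product([0, 1], repeat=n) if vt_syndrome(x) == a]


def vt_correct_single_deletion(received, n, a=0):
    """Recover the VT_a(n) codeword from a copy with one deleted bit."""
    assert len(received) == n - 1
    weight = sum(received)
    checksum = sum(i * b for i, b in enumerate(received, start=1))
    deficiency = (a - checksum) % (n + 1)
    if deficiency <= weight:
        # A 0 was deleted: reinsert it with `deficiency` ones to its right.
        pos, ones_right = len(received), 0
        while ones_right < deficiency:
            pos -= 1
            ones_right += received[pos]
        return received[:pos] + [0] + received[pos:]
    # A 1 was deleted: reinsert it with (deficiency - weight - 1) zeros to its left.
    zeros_needed = deficiency - weight - 1
    pos, zeros_left = 0, 0
    while zeros_left < zeros_needed:
        zeros_left += 1 - received[pos]
        pos += 1
    return received[:pos] + [1] + received[pos:]


codeword = vt_codewords(6)[3]
recovered = vt_correct_single_deletion(codeword[:2] + codeword[3:], n=6)
```

The decoder relies on the fact that deleting a 0 lowers the checksum by the number of ones to its right, while deleting a 1 lowers it by the weight of the received word plus one plus the number of zeros to its left, so the two cases never overlap.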
