Source author record

Ryan Gabrys

Ryan Gabrys appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Information Theory, math.IT, Applications, Discrete Mathematics, math.CO, Quantitative Methods

Catalog footprint

What is connected

10 works

6 topics

4 close collaborators


preprint, 2026, arXiv

Efficient Synthesis for Two-Dimensional Strand Arrays with Row Constraints

We study the theoretical problem of synthesizing multiple DNA strands under spatial constraints, motivated by large-scale DNA synthesis technologies. In this setting, strands are arranged in an array and synthesized according to a fixed global synthesis sequence, with the restriction that at most one strand per row may be synthesized in any synthesis cycle. We focus on the basic case of two strands in a single row and analyze the expected completion time under this row-constrained model. By decomposing the process into a Markov chain, we derive analytical upper and lower bounds on the expected synthesis time. We show that a simple laggard-first policy achieves an asymptotic expected completion time of (q+3)L/2 for any alphabet of size q, and that no online policy without look-ahead can asymptotically outperform this bound. For the binary case, we show that allowing a single-symbol look-ahead strictly improves performance, yielding an asymptotic expected completion time of 7L/3. Finally, we present a dynamic programming algorithm that computes the optimal offline schedule for any fixed pair of sequences. Together, these results provide the first analytical bounds for synthesis under spatial constraints and lay the groundwork for future studies of optimal synthesis policies in such settings.
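The offline scheduling question in this abstract can be illustrated with a small search. The sketch below is not the authors' dynamic program: it is a plain breadth-first search over states, and it assumes (the abstract does not say so) that the global synthesis sequence is periodic, with cycle t offering symbol t mod q. The name `min_cycles` and the integer encoding of symbols are hypothetical.

```python
from collections import deque


def min_cycles(a, b, q):
    """Fewest synthesis cycles needed to finish strands a and b.

    Assumptions (not from the source): symbols are integers 0..q-1,
    cycle t offers symbol t % q, and at most one of the two strands
    may be extended per cycle (the row constraint).
    """
    start = (0, 0, 0)  # (symbols done in a, symbols done in b, phase)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        i, j, phase = queue.popleft()
        if i == len(a) and j == len(b):
            return dist[(i, j, phase)]
        next_phase = (phase + 1) % q
        moves = [(i, j, next_phase)]  # idle this cycle
        if i < len(a) and a[i] == phase:
            moves.append((i + 1, j, next_phase))  # extend strand a
        if j < len(b) and b[j] == phase:
            moves.append((i, j + 1, next_phase))  # extend strand b
        for state in moves:
            if state not in dist:
                dist[state] = dist[(i, j, phase)] + 1
                queue.append(state)
    return None  # unreachable when all symbols lie in 0..q-1
```

For example, two identical one-symbol strands over a binary alphabet need three cycles under these assumptions, because the second strand must wait a full period for its symbol to come around again.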

preprint, 2022, arXiv

Balanced and Swap-Robust Trades for Dynamical Distributed Storage

Trades, introduced by Hedayat, are two sets of blocks of elements which may be exchanged (traded) without altering the counts of certain subcollections of elements within their constituent blocks. They are of importance in applications where certain combinations of elements dynamically become prohibited from being placed in the same group of elements, since in this case one can trade the offending blocks with allowed ones. This is particularly the case in distributed storage systems, where due to privacy and other constraints, data of some groups of users cannot be stored together on the same server. We introduce a new class of balanced trades, important for access balancing of servers, and perturbation resilient balanced trades, important for studying the stability of server access frequencies with respect to changes in data popularity. The constructions and bounds on our new trade schemes rely on specialized selections of defining sets in minimal trades and number-theoretic analyses.
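One common way to formalize the exchange property is to require that every t-element subset is covered equally often by the two block sets. The sketch below checks that condition; the function names and the toy blocks are hypothetical, and the balanced and swap-robust variants introduced in the paper are not modeled.

```python
from collections import Counter
from itertools import combinations


def subset_counts(blocks, t):
    """Count how many blocks contain each t-element subset."""
    counts = Counter()
    for block in blocks:
        for sub in combinations(sorted(block), t):
            counts[sub] += 1
    return counts


def is_trade(blocks_1, blocks_2, t):
    """True if the two block sets are disjoint and cover every
    t-subset the same number of times (an assumed formalization)."""
    set_1 = {frozenset(b) for b in blocks_1}
    set_2 = {frozenset(b) for b in blocks_2}
    return set_1.isdisjoint(set_2) and (
        subset_counts(blocks_1, t) == subset_counts(blocks_2, t)
    )
```

With blocks {1,2},{3,4} and {1,3},{2,4}, each element appears once on either side, so the pair passes the check for t = 1 but not for t = 2.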

preprint, 2022, arXiv

Low-redundancy codes for correcting multiple short-duplication and edit errors

Due to its higher data density, longevity, energy efficiency, and ease of generating copies, DNA is considered a promising storage technology for satisfying future needs. However, a diverse set of errors including deletions, insertions, duplications, and substitutions may arise in DNA at different stages of data storage and retrieval. The current paper constructs error-correcting codes for simultaneously correcting short (tandem) duplications and at most $p$ edits, where a short duplication generates a copy of a substring with length $\leq 3$ and inserts the copy following the original substring, and an edit is a substitution, deletion, or insertion. Compared to the state-of-the-art codes for duplications only, the proposed codes correct up to $p$ edits (in addition to duplications) at the additional cost of roughly $8p(\log_q n)(1+o(1))$ symbols of redundancy, thus achieving the same asymptotic rate, where $q\ge 4$ is the alphabet size and $p$ is a constant. Furthermore, the time complexities of both the encoding and decoding processes are polynomial when $p$ is a constant with respect to the code length.
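The short tandem duplication error described above can be sketched as a string operation. This only illustrates the error model, not the code construction; `tandem_duplicate` and its parameters are hypothetical names.

```python
def tandem_duplicate(s, start, length):
    """Copy s[start:start+length] and insert the copy right after it.

    Only short duplications (length at most 3) are allowed, matching
    the error model in the abstract.
    """
    if not 1 <= length <= 3:
        raise ValueError("short duplications have length 1, 2, or 3")
    end = start + length
    if start < 0 or end > len(s):
        raise ValueError("substring out of range")
    return s[:end] + s[start:end] + s[end:]
```

For instance, duplicating the substring `CG` of `ACGT` yields `ACGCGT`.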

preprint, 2022, arXiv

Sub-4.7 Scaling Exponent of Polar Codes

Polar code visibly approaches channel capacity in practice and is thereby a constituent code of the 5G standard. Compared to low-density parity-check code, however, the performance of short-length polar code leaves room for improvement, which could hinder its adoption by a wider class of applications. As part of the program that addresses the performance issue at short length, it is crucial to understand how fast binary memoryless symmetric channels polarize. A number, called the scaling exponent, was defined to measure the speed of polarization, and several estimates of the scaling exponent were given in the literature. As of 2022, the tightest overestimate is 4.714, made by Mondelli, Hassani, and Urbanke in 2015. We lower the overestimate to 4.63.

preprint, 2022, arXiv

Tropical Group Testing

Polymerase chain reaction (PCR) testing is the gold standard for diagnosing COVID-19. PCR amplifies the virus DNA 40 times to produce measurements of viral loads that span seven orders of magnitude. Unfortunately, the outputs of these tests are imprecise and therefore quantitative group testing methods, which rely on precise measurements, are not applicable. Motivated by the ever-increasing demand to identify individuals infected with SARS-CoV-2, we propose a new model that leverages tropical arithmetic to characterize the PCR testing process. Our proposed framework, termed tropical group testing, overcomes existing limitations of quantitative group testing by allowing for imprecise test measurements. In many cases, some of which are highlighted in this work, tropical group testing is provably more powerful than traditional binary group testing in that it requires fewer tests than classical approaches, while additionally providing a mechanism to identify the viral load of each infected individual. It is also empirically stronger than related works that have attempted to combine PCR, quantitative group testing, and compressed sensing.
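A minimal sketch of how tropical (min-plus) arithmetic might describe a pooled test. It rests on assumptions not spelled out in the abstract: each individual has a cycle-threshold value, a lower value means a higher viral load, uninfected individuals are assigned infinity, and a pooled test reports the minimum over its members. The name `tropical_test` is hypothetical.

```python
import math


def tropical_test(pool, ct):
    """Outcome of one pooled test under the assumed min-plus model.

    pool: iterable of individual ids in the pooled sample.
    ct:   dict mapping id -> cycle-threshold value (math.inf if uninfected).
    The tropical "sum" of the pool is the minimum cycle threshold.
    """
    return min((ct[i] for i in pool), default=math.inf)
```

Under this reading, a binary group test would only report whether the outcome is finite, whereas the tropical outcome also carries the strongest viral load present in the pool.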

preprint, 2021, arXiv

Semiquantitative Group Testing in at Most Two Rounds

Semiquantitative group testing (SQGT) is a pooling method in which the test outcomes represent bounded intervals for the number of defectives. Alternatively, it may be viewed as an adder channel with quantized outputs. SQGT represents a natural choice for Covid-19 group testing as it allows for a straightforward interpretation of the cycle threshold values produced by polymerase chain reactions (PCR). Prior work on SQGT did not address the need for adaptive testing with a small number of rounds as required in practice. We propose conceptually simple methods for 2-round and nonadaptive SQGT that significantly improve upon existing schemes by using ideas on nonbinary measurement matrices based on expander graphs and list-disjunct matrices.
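The quantized test outcome can be sketched as follows. The thresholds are illustrative values chosen here, not taken from the paper, and `sqgt_outcome` is a hypothetical name; in this sketch the last interval is bounded only by the pool size.

```python
from bisect import bisect_right


def sqgt_outcome(pool, defectives, thresholds):
    """Index of the interval containing the number of defectives in the pool.

    thresholds: sorted interval boundaries (assumed values). With
    thresholds [1, 3], counts 0 -> outcome 0, counts 1-2 -> outcome 1,
    and counts of 3 or more -> outcome 2.
    """
    count = len(set(pool) & set(defectives))
    return bisect_right(thresholds, count)
```

Setting `thresholds=[1]` recovers classical binary group testing, where the outcome only says whether the pool contains at least one defective.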

preprint, 2020, arXiv

Mass Error-Correction Codes for Polymer-Based Data Storage

We consider the problem of correcting mass readout errors in information encoded in binary polymer strings. Our work builds on results for string reconstruction problems using composition multisets [Acharya et al., 2015] and the unique string reconstruction framework proposed in [Pattabiraman et al., 2019]. Binary polymer-based data storage systems [Laure et al., 2016] operate by designing two molecules of significantly different masses to represent the symbols $\{0,1\}$ and perform readouts through noisy tandem mass spectrometry. Tandem mass spectrometers fragment the strings to be read into shorter substrings and only report their masses, often with errors due to imprecise ionization. Modeling the fragmentation process output in terms of composition multisets allows for designing asymptotically optimal codes capable of unique reconstruction and the correction of a single mass error [Pattabiraman et al., 2019] through the use of derivatives of Catalan paths. Nevertheless, no solutions for multiple mass error correction are currently known. Our work addresses this issue by describing the first multiple-error correction codes that use the polynomial factorization approach for the Turnpike problem [Skiena et al., 1990] and the related factorization described in [Acharya et al., 2015]. Adding Reed-Solomon type coding redundancy into the corresponding polynomials allows for correcting $t$ mass errors in polynomial time using $t^2\, \log\,k$ redundant bits, where $k$ is the information string length. The redundancy can be improved to $\log\,k + t$. However, no decoding algorithm that runs in polynomial time in both $t$ and $n$ is currently known for this scheme, where $n$ is the length of the coded string.
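To make the composition multiset concrete, the sketch below collects the composition (number of 0s, number of 1s) of every contiguous substring of a binary string. It illustrates only the readout model, not the coding scheme; `composition_multiset` is a hypothetical name.

```python
from collections import Counter


def composition_multiset(s):
    """Multiset of (zeros, ones) counts over all contiguous substrings of s."""
    comps = Counter()
    for i in range(len(s)):
        zeros = ones = 0
        for j in range(i, len(s)):
            if s[j] == "0":
                zeros += 1
            else:
                ones += 1
            comps[(zeros, ones)] += 1  # composition of s[i:j+1]
    return comps
```

A string and its reversal always produce the same multiset, for example `01` and `10`, which is one reason unique reconstruction requires restricting the set of allowed strings.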

preprint, 2016, arXiv

Asymmetric Lee Distance Codes for DNA-Based Storage

We consider a new family of codes, termed asymmetric Lee distance codes, that arise in the design and implementation of DNA-based storage systems and systems with parallel string transmission protocols. The codewords are defined over a quaternary alphabet, although the results carry over to other alphabet sizes; furthermore, symbol confusability is dictated by their underlying binary representation. Our contributions are two-fold. First, we demonstrate that the new distance represents a linear combination of the Lee and Hamming distance and derive upper bounds on the size of the codes under this metric based on linear programming techniques. Second, we propose a number of code constructions which imply lower bounds.

preprint, 2016, arXiv

Balanced Permutation Codes

Motivated by charge balancing constraints for rank modulation schemes, we introduce the notion of balanced permutations and derive the capacity of balanced permutation codes. We also describe simple interleaving methods for permutation code constructions and show that they approach capacity.

preprint, 2016, arXiv

Codes Correcting a Burst of Deletions or Insertions

This paper studies codes that correct bursts of deletions. Namely, a code will be called a $b$-burst-deletion-correcting code if it can correct a deletion of any $b$ consecutive bits. While the lower bound on the redundancy of such codes was shown by Levenshtein to be asymptotically $\log(n)+b-1$, the redundancy of the best code construction by Cheng et al. is $b(\log (n/b+1))$. In this paper we narrow this gap and provide codes with redundancy at most $\log(n) + (b-1)\log(\log(n)) +b -\log(b)$. We also derive a non-asymptotic upper bound on the size of $b$-burst-deletion-correcting codes and extend the burst deletion model to two more cases: 1) A deletion burst of at most $b$ consecutive bits and 2) A deletion burst of size at most $b$ (not necessarily consecutive). We extend our code construction for the first case and study the second case for $b=3,4$. The equivalent models for insertions are also studied and are shown to be equivalent to correcting the corresponding burst of deletions.
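The three redundancy expressions quoted in the abstract can be compared numerically. The sketch below assumes base-2 logarithms (redundancy measured in bits), which the abstract does not state; the function name and dictionary keys are hypothetical.

```python
import math


def redundancy_bounds(n, b):
    """Evaluate the three redundancy expressions for code length n and
    burst length b, assuming all logarithms are base 2."""
    log_n = math.log2(n)
    return {
        "levenshtein_lower_bound": log_n + b - 1,
        "cheng_et_al": b * math.log2(n / b + 1),
        "new_construction": log_n + (b - 1) * math.log2(log_n) + b - math.log2(b),
    }
```

For n = 1024 and b = 4 these evaluate to 13 bits, about 32.02 bits, and about 21.97 bits respectively, so under this assumption the new construction sits between the lower bound and the earlier construction.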
