Source author record

Alexios Balatsoukas-Stimming

Alexios Balatsoukas-Stimming appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Information Theory, math.IT, eess.SP, Hardware Architecture, Computer Vision, Cryptography and Security, Machine Learning

Catalog footprint

What is connected

24 works

7 topics

4 close collaborators

preprint, 2022, arXiv

Multi-Factor Pruning for Recursive Projection-Aggregation Decoding of RM Codes

The recently introduced recursive projection aggregation (RPA) decoding method for Reed-Muller (RM) codes can achieve near-maximum likelihood (ML) decoding performance. However, its high computational complexity makes its implementation challenging for time- and resource-critical applications. In this work, we present a complexity reduction technique called multi-factor pruning that reduces the computational complexity of RPA significantly. Our simulation results show that the proposed pruning approach with appropriately selected factors can reduce the complexity of RPA by up to $92\%$ for $\text{RM}(8,3)$ while maintaining comparable error-correcting performance.

preprint, 2022, arXiv

Reducing the Error Floor of the Sign-Preserving Min-Sum LDPC Decoder via Message Weighting of Low-Degree Variable Nodes

Some low-complexity LDPC decoders suffer from error floors. We apply iteration-dependent weights to the degree-3 variable nodes to solve this problem. When the 802.3ca EPON LDPC code is considered, an error floor decrease of more than 3 orders of magnitude is achieved.

preprint, 2021, arXiv

On the Implementation Complexity of Digital Full-Duplex Self-Interference Cancellation

In-band full-duplex systems promise to further increase the throughput of wireless systems, by simultaneously transmitting and receiving on the same frequency band. However, concurrent transmission generates a strong self-interference signal at the receiver, which requires the use of cancellation techniques. A wide range of techniques for analog and digital self-interference cancellation have already been presented in the literature. However, their evaluation focuses on cases where the underlying physical parameters of the full-duplex system do not vary significantly. In this paper, we focus on adaptive digital cancellation, motivated by the fact that physical systems change over time. We examine some of the different cancellation methods in terms of their performance and implementation complexity, considering the cost of both cancellation and training. We then present a comparative analysis of all these methods to determine which perform better under different system performance requirements. We demonstrate that with a neural network approach, the reduction in arithmetic complexity for the same cancellation performance relative to a state-of-the-art polynomial model is several orders of magnitude.

preprint, 2020, arXiv

A Standalone FPGA-based Miner for Lyra2REv2 Cryptocurrencies

Lyra2REv2 is a hashing algorithm that consists of a chain of individual hashing algorithms, and it is used as a proof-of-work function in several cryptocurrencies. The most crucial and exotic hashing algorithm in the Lyra2REv2 chain is a specific instance of the general Lyra2 algorithm. This work presents the first hardware implementation of the specific instance of Lyra2 that is used in Lyra2REv2. Several properties of the aforementioned algorithm are exploited in order to optimize the design. In addition, an FPGA-based hardware implementation of a standalone miner for Lyra2REv2 on a Xilinx Multi-Processor System on Chip is presented. The proposed Lyra2REv2 miner is shown to be significantly more energy efficient than both a GPU and a commercially available FPGA-based miner. Finally, we also explain how the simplified Lyra2 and Lyra2REv2 architectures can be modified with minimal effort to also support the recent Lyra2REv3 chained hashing algorithm.

preprint, 2020, arXiv

An Open-Source LoRa Physical Layer Prototype on GNU Radio

LoRa is the proprietary physical layer (PHY) of LoRaWAN, which is a popular Internet-of-Things (IoT) protocol enabling low-power devices to communicate over long ranges. A number of reverse engineering attempts have been published in the last few years that helped to reveal many of the LoRa PHY details. In this work, we describe our standard-compatible LoRa PHY software-defined radio (SDR) prototype based on GNU Radio. We show how this SDR prototype can be used to develop and evaluate receiver algorithms for LoRa. As an example, we describe the sampling time offset and the carrier frequency offset estimation and compensation blocks. We experimentally evaluate the error rate of LoRa, both for the uncoded and the coded cases, to illustrate that our publicly available open-source implementation is a solid basis for further research.

preprint, 2020, arXiv

Hardware Implementation of Neural Self-Interference Cancellation

In-band full-duplex systems can transmit and receive information simultaneously on the same frequency band. However, due to the strong self-interference caused by the transmitter to its own receiver, the use of non-linear digital self-interference cancellation is essential. In this work, we describe a hardware architecture for a neural network-based non-linear self-interference (SI) canceller and we compare it with our own hardware implementation of a conventional polynomial-based SI canceller. In particular, we present implementation results for a shallow and a deep neural network SI canceller as well as for a polynomial SI canceller. Our results show that the deep neural network canceller achieves a hardware efficiency of up to $312.8$ Msamples/s/mm$^2$ and an energy efficiency of up to $0.9$ nJ/sample, which is $2.1\times$ and $2\times$ better than the polynomial SI canceller, respectively. These results show that NN-based methods applied to communications are not only useful from a performance perspective, but can also be a very effective means to reduce the implementation complexity.

preprint, 2020, arXiv

Identification of Non-Linear RF Systems Using Backpropagation

In this work, we use deep unfolding to view cascaded non-linear RF systems as model-based neural networks. This view enables the direct use of a wide range of neural network tools and optimizers to efficiently identify such cascaded models. We demonstrate the effectiveness of this approach through the example of digital self-interference cancellation in full-duplex communications where an IQ imbalance model and a non-linear PA model are cascaded in series. For a self-interference cancellation performance of approximately 44.5 dB, the number of model parameters can be reduced by 74% and the number of operations per sample can be reduced by 79% compared to an expanded linear-in-parameters polynomial model.

preprint, 2020, arXiv

Implementation of a High-Throughput Fast-SSC Polar Decoder with Sequence Repetition Node

Even though polar codes were adopted in the latest 5G cellular standard, they still have the fundamental problem of high decoding latency. Aiming at solving this problem, a fast simplified successive cancellation (Fast-SSC) decoder based on the new class of sequence repetition (SR) nodes has been proposed recently in \cite{sr2020} and has a lower required number of time steps than other existing Fast-SSC decoders in theory. This paper focuses on the hardware implementation of this SR node-based fast-SSC (SRFSC) decoder. The implementation results for a polar code with length 1024 and code rate 1/2 show that our implementation has a throughput of $505$ Mbps on an Altera Stratix IV FPGA, which is 17.9% higher with respect to the previous work.

preprint, 2020, arXiv

Lupulus: A Flexible Hardware Accelerator for Neural Networks

Neural networks have become indispensable for a wide range of applications, but they suffer from high computational and memory requirements, which call for optimizations at every level from the algorithmic description of the network to the hardware implementation. Moreover, the high rate of innovation in machine learning makes it important that hardware implementations provide a high level of programmability to support current and future requirements of neural networks. In this work, we present a flexible hardware accelerator for neural networks, called Lupulus, supporting various methods for scheduling and mapping of operations onto the accelerator. Lupulus was implemented in a 28 nm FD-SOI technology and demonstrates a peak performance of 380 GOPS/GHz with latencies of 21.4 ms and 183.6 ms for the convolutional layers of AlexNet and VGG-16, respectively.

preprint, 2020, arXiv

OptComNet: Optimized Neural Networks for Low-Complexity Channel Estimation

The use of machine learning methods to tackle challenging physical layer signal processing tasks has attracted significant attention. In this work, we focus on the use of neural networks (NNs) to perform pilot-assisted channel estimation in an OFDM system in order to avoid the challenging task of estimating the channel covariance matrix. In particular, we perform a systematic design-space exploration of NN configurations, quantization, and pruning in order to improve feedforward NN architectures that are typically used in the literature for the channel estimation task. We show that choosing an appropriate NN architecture is crucial to reduce the complexity of NN-assisted channel estimation methods. Moreover, we demonstrate that, similarly to other applications and domains, careful quantization and pruning can lead to significant complexity reduction with a negligible performance degradation. Finally, we show that using a solution with multiple distinct NNs trained for different signal-to-noise ratios interestingly leads to lower overall computational complexity and storage requirements, while achieving a better performance with respect to using a single NN trained for the entire SNR range.

preprint, 2019, arXiv

Improving HD-FEC decoding via bit marking

We review the recently introduced soft-aided bit-marking (SABM) algorithm and its suitability for product codes. Some aspects of the implementation of the SABM algorithm are discussed. The influence of suboptimal channel soft information is also analyzed.

preprint, 2018, arXiv

On the Tradeoff Between Accuracy and Complexity in Blind Detection of Polar Codes

Polar codes are a recent family of error-correcting codes with a number of desirable characteristics. Their disruptive nature is illustrated by their rapid adoption in the $5^{th}$-generation mobile-communication standard, where they are used to protect control messages. In this work, we describe a two-stage system tasked with identifying the location of control messages that consists of a detection and selection stage followed by a decoding one. The first stage spurs the need for polar-code detection algorithms with variable effort to balance complexity between the two stages. We illustrate this idea of variable effort for multiple detection algorithms aimed at the first stage. We propose three novel blind detection methods based on belief-propagation decoding inspired by early-stopping criteria. Then we show how their reliability improves with the number of decoding iterations to highlight the possible tradeoffs between accuracy and complexity. Additionally, we show similar tradeoffs for a detection method from previous work. In a setup where only one block encoded with the polar code of interest is present among many other blocks, our results notably show that, depending on the complexity budget, a variable number of undesirable blocks can be dismissed while achieving a missed-detection rate in line with the block-error rate of a complex decoding algorithm.

preprint, 2016, arXiv

Fast Low-Complexity Decoders for Low-Rate Polar Codes

Polar codes are capacity-achieving error-correcting codes with an explicit construction that can be decoded with low-complexity algorithms. In this work, we show how the state-of-the-art low-complexity decoding algorithm can be improved to better accommodate low-rate codes. More constituent codes are recognized in the updated algorithm and dedicated hardware is added to efficiently decode these new constituent codes. We also alter the polar code construction to further decrease the latency and increase the throughput with little to no noticeable effect on error-correction performance. Rate-flexible decoders for polar codes of length 1024 and 2048 are implemented on FPGA. Over the previous work, they are shown to have from 22% to 28% lower latency and 26% to 34% greater throughput when decoding low-rate codes. On 65 nm ASIC CMOS technology, the proposed decoder for a (1024, 512) polar code is shown to compare favorably against the state-of-the-art ASIC decoders. With a clock frequency of 400 MHz and a supply voltage of 0.8 V, it has a latency of 0.41 $\mu$s and an area efficiency of 1.8 Gbps/mm$^2$ for an energy efficiency of 77 pJ/info. bit. At 600 MHz with a supply of 1 V, the latency is reduced to 0.27 $\mu$s and the area efficiency increased to 2.7 Gbps/mm$^2$ at 115 pJ/info. bit.

preprint, 2016, arXiv

Hardware Decoders for Polar Codes: An Overview

Polar codes are an exciting new class of error correcting codes that achieve the symmetric capacity of memoryless channels. Many decoding algorithms were developed and implemented, addressing various application requirements: from error-correction performance rivaling that of LDPC codes to very high throughput or low-complexity decoders. In this work, we review the state of the art in polar decoders implementing the successive-cancellation, belief propagation, and list decoding algorithms, illustrating their advantages.

preprint, 2016, arXiv

Sliding Window Spectrum Sensing for Full-Duplex Cognitive Radios with Low Access-Latency

In a cognitive radio system the failure of secondary user (SU) transceivers to promptly vacate the channel can introduce significant access-latency for primary or high-priority users (PU). In conventional cognitive radio systems, the backoff latency is exacerbated by frame structures that only allow sensing at periodic intervals. Concurrent transmission and sensing using self-interference suppression has been suggested to improve the performance of cognitive radio systems, allowing decisions to be taken at multiple points within the frame. In this paper, we extend this approach by proposing a sliding-window full-duplex model allowing decisions to be taken on a sample-by-sample basis. We also derive the access-latency for both the existing and the proposed schemes. Our results show that the access-latency of the sliding scheme is decreased by a factor of 2.6 compared to the existing slotted full-duplex scheme and by a factor of approximately 16 compared to a half-duplex cognitive radio system. Moreover, the proposed scheme is significantly more resilient to the destructive effects of residual self-interference compared to previous approaches.
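As a rough illustration of a sample-by-sample sliding-window decision, the sketch below uses a plain energy detector. The abstract does not state which test statistic the paper uses, so the energy statistic, the function name `sliding_energy_decisions`, and the `window` and `threshold` parameters are all assumptions made for this example.

```python
from collections import deque

def sliding_energy_decisions(samples, window, threshold):
    """Return one busy/idle decision per sample once the window is full.

    Assumption: a simple energy detector stands in for the paper's detector.
    """
    buf = deque()          # squared magnitudes currently inside the window
    energy = 0.0           # running sum of the values in buf
    decisions = []
    for s in samples:
        p = abs(s) ** 2    # works for real or complex samples
        buf.append(p)
        energy += p
        if len(buf) > window:
            energy -= buf.popleft()   # drop the oldest sample
        if len(buf) == window:
            # True means the average energy exceeds the threshold (channel busy)
            decisions.append(energy / window > threshold)
    return decisions

decisions = sliding_energy_decisions([0, 0, 0, 2, 2, 2], window=3, threshold=1.0)
```

A slotted scheme would only evaluate the statistic at fixed frame positions; here a decision is available after every new sample, which is the property the abstract associates with lower access latency.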

preprint, 2015, arXiv

A Fully-Unrolled LDPC Decoder Based on Quantized Message Passing

In this paper, we propose a finite alphabet message passing algorithm for LDPC codes that replaces the standard min-sum variable node update rule by a mapping based on generic look-up tables. This mapping is designed in a way that maximizes the mutual information between the decoder messages and the codeword bits. We show that our decoder can deliver the same error rate performance as the conventional decoder with a much smaller message bit-width. Finally, we use the proposed algorithm to design a fully unrolled LDPC decoder hardware architecture.

preprint, 2015, arXiv

Faulty Successive Cancellation Decoding of Polar Codes for the Binary Erasure Channel

We study faulty successive cancellation decoding of polar codes for the binary erasure channel. To this end, we introduce a simple erasure-based fault model and we show that, under this model, polarization does not happen, meaning that fully reliable communication is not possible at any rate. Moreover, we provide numerical results for the frame erasure rate and bit erasure rate and we study an unequal error protection scheme that can significantly improve the performance of the faulty successive cancellation decoder with negligible overhead.

preprint, 2015, arXiv

LLR-based Successive Cancellation List Decoding of Polar Codes

We show that successive cancellation list decoding can be formulated exclusively using log-likelihood ratios. In addition to numerical stability, the log-likelihood ratio based formulation has useful properties which simplify the sorting step involved in successive cancellation list decoding. We propose a hardware architecture of the successive cancellation list decoder in the log-likelihood ratio domain which, compared to a log-likelihood domain implementation, requires less irregular and smaller memories. This simplification, together with the gains in the metric sorter, leads to $56\%$ to $137\%$ higher throughput per unit area than other recently proposed architectures. We then evaluate the empirical performance of the CRC-aided successive cancellation list decoder at different list sizes using different CRCs and conclude that it is important to adapt the CRC length to the list size in order to achieve the best error-rate performance of concatenated polar codes. Finally, we synthesize conventional successive cancellation decoders at large block-lengths with the same block-error probability as our proposed CRC-aided successive cancellation list decoders to demonstrate that, while our decoders have slightly lower throughput and larger area, they have a significantly smaller decoding latency.
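The abstract does not spell out the path-metric rule. The sketch below shows the form in which an LLR-domain path-metric update is commonly written, $PM \leftarrow PM + \ln(1 + e^{-(1-2u)L})$, together with the usual hardware-friendly approximation that adds $|L|$ only when the bit decision $u$ contradicts the sign of the LLR $L$. The function names are hypothetical, and the formulas are given as an assumption about the general technique, not as a quotation from the paper.

```python
import math

def pm_update_exact(pm, llr, u):
    """Exact LLR-domain path-metric update for a bit decision u in {0, 1}."""
    x = -(1 - 2 * u) * llr
    # Numerically stable ln(1 + exp(x)) = max(x, 0) + log1p(exp(-|x|))
    return pm + max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def pm_update_approx(pm, llr, u):
    """Approximate update: penalise only decisions that disagree with the LLR sign."""
    hard = 0 if llr >= 0 else 1          # hard decision implied by the LLR
    return pm if u == hard else pm + abs(llr)

# A path that follows the LLR keeps its metric; a path that goes against it pays |L|.
agree = pm_update_approx(1.0, 2.5, 0)
disagree = pm_update_approx(1.0, 2.5, 1)
```

With the approximate rule, a path that follows the LLR keeps its metric unchanged, so at each step half of the candidate paths inherit the ordering of their parents. This is one way to see why the sorting step can be simplified in this formulation.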

preprint, 2015, arXiv

On Metric Sorting for Successive Cancellation List Decoding of Polar Codes

We focus on the metric sorter unit of successive cancellation list decoders for polar codes, which lies on the critical path in all current hardware implementations of the decoder. We review existing metric sorter architectures and we propose two new architectures that exploit the structure of the path metrics in a log-likelihood ratio based formulation of successive cancellation list decoding. Our synthesis results show that, for the list size of $L=32$, our first proposed sorter is $14\%$ faster and $45\%$ smaller than existing sorters, while for smaller list sizes, our second sorter has a higher delay in return for up to $36\%$ reduction in the area.

preprint, 2015, arXiv

Quantized Message Passing for LDPC Codes

We propose a quantized decoding algorithm for low-density parity-check codes where the variable node update rule of the standard min-sum algorithm is replaced with a look-up table (LUT) that is designed using an information-theoretic criterion. We show that even with message resolutions as low as 3 bits, the proposed algorithm can achieve better error rates than a floating-point min-sum decoder. Moreover, we study in detail the effect of different decoder design parameters, like the design SNR and the LUT tree structure on the performance of our decoder, and we propose some complexity reduction techniques, such as LUT re-use and message alphabet downsizing.
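For reference, the sketch below shows the standard floating-point min-sum node updates, including the variable-node rule that the abstract says is replaced by a LUT. It does not attempt the LUT design itself, which the abstract does not describe in enough detail to reproduce; the function names are illustrative.

```python
def check_node_update(incoming):
    """Min-sum check-node update.

    For each edge, the outgoing message is the product of the signs of the
    other incoming messages times the smallest of their magnitudes.
    """
    out = []
    for i in range(len(incoming)):
        others = incoming[:i] + incoming[i + 1:]
        sign = 1
        for m in others:
            if m < 0:
                sign = -sign
        out.append(sign * min(abs(m) for m in others))
    return out

def variable_node_update(channel_llr, incoming):
    """Standard variable-node update: channel LLR plus all other incoming messages."""
    total = channel_llr + sum(incoming)
    return [total - m for m in incoming]

cn_out = check_node_update([2.0, -3.0, 1.0])
vn_out = variable_node_update(0.5, [1.0, -2.0, 3.0])
```

The variable-node sum is where message bit-width grows in a conventional decoder, which gives some intuition for why that particular rule is the one targeted for a table-based replacement.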

preprint, 2014, arXiv

A Low-Complexity Improved Successive Cancellation Decoder for Polar Codes

Under successive cancellation (SC) decoding, polar codes are inferior to other codes of similar blocklength in terms of frame error rate. While more sophisticated decoding algorithms such as list- or stack-decoding partially mitigate this performance loss, they suffer from an increase in complexity. In this paper, we describe a new flavor of the SC decoder, called the SC flip decoder. Our algorithm preserves the low memory requirements of the basic SC decoder and adjusts the required decoding effort to the signal quality. In the waterfall region, its average computational complexity is almost as low as that of the SC decoder.

preprint, 2014, arXiv

Baseband and RF Hardware Impairments in Full-Duplex Wireless Systems: Experimental Characterisation and Suppression

Hardware imperfections can significantly reduce the performance of full-duplex wireless systems by introducing non-idealities and random effects that make it challenging to fully suppress self-interference. Previous research has mostly focused on analyzing the impact of hardware imperfections on full-duplex systems, based on simulations and theoretical models. In this paper, we follow a measurement-based approach to experimentally identify and isolate these hardware imperfections leading to residual self-interference in full-duplex nodes. Our measurements show the important role of images arising from in-phase and quadrature (IQ) imbalance in the mixers. We also observe base-band non-linearities in the digital-to-analog converters (DAC), which can introduce strong harmonic components that have not been previously considered. A corresponding general mathematical model to suppress these components of the self-interference signal arising from the hardware non-idealities is developed from the observations and measurements. Results from a 10 MHz bandwidth full-duplex OFDM system, operating at 2.48 GHz, show that up to 13 dB of additional suppression relative to state-of-the-art implementations can be achieved by jointly compensating for IQ imbalance and DAC non-linearities.

preprint, 2014, arXiv

Density Evolution for Min-Sum Decoding of LDPC Codes Under Unreliable Message Storage

We analyze the performance of quantized min-sum decoding of low-density parity-check codes under unreliable message storage. To this end, we introduce a simple bit-level error model and show that decoder symmetry is preserved under this model. Subsequently, we formulate the corresponding density evolution equations to predict the average bit error probability in the limit of infinite blocklength. We present numerical threshold results and we show that using more quantization bits is not always beneficial in the context of faulty decoders.

preprint, 2014, arXiv

Enabling Complexity-Performance Trade-Offs for Successive Cancellation Decoding of Polar Codes

Polar codes are one of the most recent advancements in coding theory and they have attracted significant interest. While they are provably capacity achieving over various channels, they have seen limited practical applications. Unfortunately, the successive nature of successive cancellation based decoders hinders fine-grained adaptation of the decoding complexity to design constraints and operating conditions. In this paper, we propose a systematic method for enabling complexity-performance trade-offs by constructing polar codes based on an optimization problem which minimizes the complexity under a suitably defined mutual information based performance constraint. Moreover, a low-complexity greedy algorithm is proposed in order to solve the optimization problem efficiently for very large code lengths.
