Researcher profile

Alexios Balatsoukas-Stimming

Alexios Balatsoukas-Stimming contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
12 works
0 followers
7 topics
4 close collaborators

Published work

12 published item(s)

preprint, 2022, arXiv

Multi-Factor Pruning for Recursive Projection-Aggregation Decoding of RM Codes

The recently introduced recursive projection aggregation (RPA) decoding method for Reed-Muller (RM) codes can achieve near-maximum likelihood (ML) decoding performance. However, its high computational complexity makes its implementation challenging for time- and resource-critical applications. In this work, we present a complexity reduction technique called multi-factor pruning that reduces the computational complexity of RPA significantly. Our simulation results show that the proposed pruning approach with appropriately selected factors can reduce the complexity of RPA by up to $92\%$ for $\text{RM}(8,3)$ while maintaining comparable error-correcting performance.
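To give a sense of the code size involved, the following minimal sketch computes the standard parameters of a Reed-Muller code. It assumes that $\text{RM}(8,3)$ is read as RM(m, r) with m = 8 and r = 3, and that RPA projects onto all one-dimensional subspaces of the binary space of dimension m. The function names are illustrative and do not come from the paper.

```python
from math import comb


def rm_parameters(m: int, r: int) -> tuple[int, int, int]:
    """Return (length, dimension, minimum distance) of the code RM(m, r)."""
    length = 2 ** m
    # The dimension counts all monomials of degree at most r in m variables.
    dimension = sum(comb(m, i) for i in range(r + 1))
    min_distance = 2 ** (m - r)
    return length, dimension, min_distance


def one_dim_projections(m: int) -> int:
    """Number of one-dimensional subspaces of F_2^m (one per nonzero vector)."""
    return 2 ** m - 1


n, k, d = rm_parameters(8, 3)
print(n, k, d, one_dim_projections(8))  # prints: 256 93 32 255
```

Under these assumptions, a single recursion level already involves 255 projections, which suggests why reducing the number of projections is an attractive way to cut complexity.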

preprint, 2021, arXiv

On the Implementation Complexity of Digital Full-Duplex Self-Interference Cancellation

In-band full-duplex systems promise to further increase the throughput of wireless systems, by simultaneously transmitting and receiving on the same frequency band. However, concurrent transmission generates a strong self-interference signal at the receiver, which requires the use of cancellation techniques. A wide range of techniques for analog and digital self-interference cancellation have already been presented in the literature. However, their evaluation focuses on cases where the underlying physical parameters of the full-duplex system do not vary significantly. In this paper, we focus on adaptive digital cancellation, motivated by the fact that physical systems change over time. We examine some of the different cancellation methods in terms of their performance and implementation complexity, considering the cost of both cancellation and training. We then present a comparative analysis of all these methods to determine which perform better under different system performance requirements. We demonstrate that with a neural network approach, the reduction in arithmetic complexity for the same cancellation performance relative to a state-of-the-art polynomial model is several orders of magnitude.
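As a rough illustration of what adaptive digital cancellation means, the sketch below uses a plain linear least-mean-squares (LMS) filter. This is a stand-in technique chosen for brevity, not one of the neural network or polynomial cancellers compared in the paper. The channel taps, step size, and signal are hypothetical, and the setup is noise-free and purely linear.

```python
import random


def lms_cancel(tx, rx, num_taps, step):
    """Adaptively estimate the self-interference in rx from the known tx signal.

    Returns the final filter weights and the residual after cancellation.
    """
    weights = [0j] * num_taps
    residual = []
    for n in range(len(rx)):
        # Most recent transmit samples, zero-padded before the start.
        window = [tx[n - k] if n - k >= 0 else 0j for k in range(num_taps)]
        estimate = sum(w * x for w, x in zip(weights, window))
        error = rx[n] - estimate
        # Complex LMS update: step along the conjugate of the input.
        weights = [w + step * error * x.conjugate()
                   for w, x in zip(weights, window)]
        residual.append(error)
    return weights, residual


rng = random.Random(0)
# Unit-power QPSK transmit samples (hypothetical test signal).
tx = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / 2 ** 0.5
      for _ in range(2000)]
channel = [0.5 + 0.2j, -0.1j]  # hypothetical self-interference channel
rx = [sum(channel[k] * tx[n - k] for k in range(len(channel)) if n - k >= 0)
      for n in range(len(tx))]

weights, residual = lms_cancel(tx, rx, num_taps=2, step=0.05)
print(weights)
```

Because the filter keeps updating with every sample, it can track a channel that drifts over time, which is the motivation the abstract gives for considering the cost of training as well as cancellation.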

preprint, 2020, arXiv

A Standalone FPGA-based Miner for Lyra2REv2 Cryptocurrencies

Lyra2REv2 is a hashing algorithm that consists of a chain of individual hashing algorithms, and it is used as a proof-of-work function in several cryptocurrencies. The most crucial and exotic hashing algorithm in the Lyra2REv2 chain is a specific instance of the general Lyra2 algorithm. This work presents the first hardware implementation of the specific instance of Lyra2 that is used in Lyra2REv2. Several properties of the aforementioned algorithm are exploited in order to optimize the design. In addition, an FPGA-based hardware implementation of a standalone miner for Lyra2REv2 on a Xilinx Multi-Processor System on Chip is presented. The proposed Lyra2REv2 miner is shown to be significantly more energy efficient than both a GPU and a commercially available FPGA-based miner. Finally, we also explain how the simplified Lyra2 and Lyra2REv2 architectures can be modified with minimal effort to also support the recent Lyra2REv3 chained hashing algorithm.

preprint, 2020, arXiv

An Open-Source LoRa Physical Layer Prototype on GNU Radio

LoRa is the proprietary physical layer (PHY) of LoRaWAN, which is a popular Internet-of-Things (IoT) protocol enabling low-power devices to communicate over long ranges. A number of reverse engineering attempts published in the last few years have helped to reveal many of the LoRa PHY details. In this work, we describe our standard-compatible LoRa PHY software-defined radio (SDR) prototype based on GNU Radio. We show how this SDR prototype can be used to develop and evaluate receiver algorithms for LoRa. As an example, we describe the sampling time offset and the carrier frequency offset estimation and compensation blocks. We experimentally evaluate the error rate of LoRa, both for the uncoded and the coded cases, to illustrate that our publicly available open-source implementation is a solid basis for further research.
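For readers unfamiliar with the modulation, the following is a minimal sketch of chirp spread spectrum as commonly described for LoRa: each symbol is a cyclically shifted upchirp, and the receiver multiplies by a conjugated base chirp and picks the strongest DFT bin. It assumes one sample per chip, no noise, and no timing or frequency offsets. It is not the authors' GNU Radio implementation, and the function names are illustrative.

```python
import cmath


def upchirp(symbol: int, sf: int) -> list[complex]:
    """Baseband upchirp carrying `symbol`, sampled once per chip."""
    num = 2 ** sf
    return [cmath.exp(2j * cmath.pi
                      * (n * n / (2 * num) + (symbol / num - 0.5) * n))
            for n in range(num)]


def demodulate(samples: list[complex], sf: int) -> int:
    """Dechirp with the conjugate base chirp and return the peak DFT bin."""
    num = 2 ** sf
    base = upchirp(0, sf)
    # After dechirping, symbol s becomes a pure tone at bin s.
    dechirped = [s * b.conjugate() for s, b in zip(samples, base)]
    best_bin, best_mag = 0, -1.0
    for k in range(num):  # naive DFT, adequate for a small sketch
        acc = sum(dechirped[n] * cmath.exp(-2j * cmath.pi * k * n / num)
                  for n in range(num))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin


print(demodulate(upchirp(42, 7), 7))  # prints: 42
```

In this idealized model, a carrier frequency offset or a sampling time offset would shift the position of the DFT peak, which hints at why estimation and compensation blocks for both are needed in a practical receiver.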

preprint, 2020, arXiv

Hardware Implementation of Neural Self-Interference Cancellation

In-band full-duplex systems can transmit and receive information simultaneously on the same frequency band. However, due to the strong self-interference caused by the transmitter to its own receiver, the use of non-linear digital self-interference cancellation is essential. In this work, we describe a hardware architecture for a neural network-based non-linear self-interference (SI) canceller and we compare it with our own hardware implementation of a conventional polynomial based SI canceller. In particular, we present implementation results for a shallow and a deep neural network SI canceller as well as for a polynomial SI canceller. Our results show that the deep neural network canceller achieves a hardware efficiency of up to $312.8$ Msamples/s/mm$^2$ and an energy efficiency of up to $0.9$ nJ/sample, which is $2.1\times$ and $2\times$ better than the polynomial SI canceller, respectively. These results show that NN-based methods applied to communications are not only useful from a performance perspective, but can also be a very effective means to reduce the implementation complexity.

preprint, 2020, arXiv

Identification of Non-Linear RF Systems Using Backpropagation

In this work, we use deep unfolding to view cascaded non-linear RF systems as model-based neural networks. This view enables the direct use of a wide range of neural network tools and optimizers to efficiently identify such cascaded models. We demonstrate the effectiveness of this approach through the example of digital self-interference cancellation in full-duplex communications where an IQ imbalance model and a non-linear PA model are cascaded in series. For a self-interference cancellation performance of approximately 44.5 dB, the number of model parameters can be reduced by 74% and the number of operations per sample can be reduced by 79% compared to an expanded linear-in-parameters polynomial model.
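The parameter saving from keeping the cascade structure can be illustrated with a toy model. The sketch below assumes a memoryless IQ imbalance stage followed by a memoryless odd-order polynomial PA, and counts parameters for that case only. It does not include the backpropagation-based identification itself, and the counts it produces are not the paper's figures. All function names are hypothetical.

```python
def iq_imbalance(x: complex, k1: complex, k2: complex) -> complex:
    """Widely linear IQ imbalance model: direct term plus image term."""
    return k1 * x + k2 * x.conjugate()


def power_amplifier(x: complex, coeffs: list[complex]) -> complex:
    """Memoryless PA: coeffs[i] weights the term x * |x|^(2i) (orders 1, 3, 5, ...)."""
    return sum(a * x * abs(x) ** (2 * i) for i, a in enumerate(coeffs))


def cascaded_model(x: complex, k1: complex, k2: complex,
                   coeffs: list[complex]) -> complex:
    """IQ imbalance followed in series by the PA non-linearity."""
    return power_amplifier(iq_imbalance(x, k1, k2), coeffs)


def cascaded_param_count(max_order: int) -> int:
    """Two IQ parameters plus one PA coefficient per odd order."""
    return 2 + (max_order + 1) // 2


def expanded_param_count(max_order: int) -> int:
    """Expanding order p yields p + 1 basis terms x^q * conj(x)^(p - q)."""
    return sum(p + 1 for p in range(1, max_order + 1, 2))


print(cascaded_param_count(5), expanded_param_count(5))  # prints: 5 12
```

Even in this small memoryless case, the expanded linear-in-parameters form needs more coefficients than the cascade, because multiplying out the two stages creates one coefficient for every cross term.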

preprint, 2020, arXiv

Implementation of a High-Throughput Fast-SSC Polar Decoder with Sequence Repetition Node

Even though polar codes were adopted in the latest 5G cellular standard, they still suffer from the fundamental problem of high decoding latency. To address this problem, a fast simplified successive cancellation (Fast-SSC) decoder based on the new class of sequence repetition (SR) nodes was recently proposed in \cite{sr2020}; in theory, it requires fewer time steps than other existing Fast-SSC decoders. This paper focuses on the hardware implementation of this SR node-based Fast-SSC (SRFSC) decoder. The implementation results for a polar code with length 1024 and code rate 1/2 show that our implementation has a throughput of $505$ Mbps on an Altera Stratix IV FPGA, which is 17.9% higher than previous work.

preprint, 2020, arXiv

Lupulus: A Flexible Hardware Accelerator for Neural Networks

Neural networks have become indispensable for a wide range of applications, but they suffer from high computational and memory requirements, which call for optimizations at every level from the algorithmic description of the network to the hardware implementation. Moreover, the high rate of innovation in machine learning makes it important that hardware implementations provide a high level of programmability to support current and future requirements of neural networks. In this work, we present a flexible hardware accelerator for neural networks, called Lupulus, supporting various methods for scheduling and mapping of operations onto the accelerator. Lupulus was implemented in a 28 nm FD-SOI technology and demonstrates a peak performance of 380 GOPS/GHz with latencies of 21.4 ms and 183.6 ms for the convolutional layers of AlexNet and VGG-16, respectively.

preprint, 2020, arXiv

OptComNet: Optimized Neural Networks for Low-Complexity Channel Estimation

The use of machine learning methods to tackle challenging physical layer signal processing tasks has attracted significant attention. In this work, we focus on the use of neural networks (NNs) to perform pilot-assisted channel estimation in an OFDM system in order to avoid the challenging task of estimating the channel covariance matrix. In particular, we perform a systematic design-space exploration of NN configurations, quantization, and pruning in order to improve feedforward NN architectures that are typically used in the literature for the channel estimation task. We show that choosing an appropriate NN architecture is crucial to reduce the complexity of NN-assisted channel estimation methods. Moreover, we demonstrate that, similarly to other applications and domains, careful quantization and pruning can lead to significant complexity reduction with a negligible performance degradation. Finally, we show that using a solution with multiple distinct NNs trained for different signal-to-noise ratios interestingly leads to lower overall computational complexity and storage requirements, while achieving a better performance with respect to using a single NN trained for the entire SNR range.
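The two compression steps named in the abstract can be sketched generically as follows. This assumes simple magnitude-based pruning and symmetric uniform quantization applied to a flat list of weights. The paper's actual procedures, bit widths, and pruning ratios are not given in the abstract, so every value here is hypothetical.

```python
def magnitude_prune(weights: list[float], fraction: float) -> list[float]:
    """Zero out the given fraction of weights with the smallest magnitudes."""
    num_pruned = int(len(weights) * fraction)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:num_pruned])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def quantize_uniform(weights: list[float], num_bits: int) -> list[float]:
    """Snap weights to a symmetric grid of signed num_bits integer levels."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    levels = 2 ** (num_bits - 1) - 1
    scale = max_abs / levels
    # Round to the nearest grid point, then map back to floating point.
    return [round(w / scale) * scale for w in weights]


weights = [0.9, -0.05, 0.3, 0.01]  # hypothetical layer weights
print(magnitude_prune(weights, 0.5))  # prints: [0.9, 0.0, 0.3, 0.0]
print(quantize_uniform(weights, 4))
```

Pruned weights need neither storage nor multiplications, and fewer bits per weight shrink both memory and arithmetic cost, which is the kind of complexity reduction the abstract refers to.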

preprint, 2018, arXiv

On the Tradeoff Between Accuracy and Complexity in Blind Detection of Polar Codes

Polar codes are a recent family of error-correcting codes with a number of desirable characteristics. Their disruptive nature is illustrated by their rapid adoption in the $5^{th}$-generation mobile-communication standard, where they are used to protect control messages. In this work, we describe a two-stage system tasked with identifying the location of control messages that consists of a detection and selection stage followed by a decoding one. The first stage spurs the need for polar-code detection algorithms with variable effort to balance complexity between the two stages. We illustrate this idea of variable effort for multiple detection algorithms aimed at the first stage. We propose three novel blind detection methods based on belief-propagation decoding inspired by early-stopping criteria. Then we show how their reliability improves with the number of decoding iterations to highlight the possible tradeoffs between accuracy and complexity. Additionally, we show similar tradeoffs for a detection method from previous work. In a setup where only one block encoded with the polar code of interest is present among many other blocks, our results notably show that, depending on the complexity budget, a variable number of undesirable blocks can be dismissed while achieving a missed-detection rate in line with the block-error rate of a complex decoding algorithm.