Researcher profile

Janne Janhunen

Janne Janhunen contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - Baseline
3works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2015arXiv

A Customized Lattice Reduction Multiprocessor for MIMO Detection

Lattice reduction (LR) is a preprocessing technique for multiple-input multiple-output (MIMO) symbol detection to achieve better bit error-rate (BER) performance. In this paper, we propose a customized homogeneous multiprocessor for LR. The processor cores are based on transport triggered architecture (TTA). We propose some modification of the popular LR algorithm, Lenstra-Lenstra-Lovasz (LLL) for high throughput. The TTA cores are programmed with high level language. Each TTA core consists of several special function units to accelerate the program code. The multiprocessor takes 187 cycles to reduce a single matrix for LR. The architecture is synthesized on 90 nm technology and takes 405 kgates at 210 MHz.

preprint2015arXiv

Design of a Transport Triggered Architecture Processor for Flexible Iterative Turbo Decoder

In order to meet the requirement of high data rates for the next generation wireless systems, the efficient implementation of receiver algorithms is essential. On the other hand, the rapid development of technology motivates the investigation of programmable implementations. This paper summarizes the design of a programmable turbo decoder as an applicationspecific instruction-set processor (ASIP) using Transport Triggered Architecture (TTA). The processor architecture is designed in such manner that it can be programmed to support other receiver algorithms, for example, decoding based on the Viterbi algorithm. Different suboptimal maximum a posteriori (MAP) algorithms are used and compared to one another for the softinput soft-output (SISO) component decoders in a single TTA processor. The max-log-MAP algorithm outperforms the other suboptimal algorithms in terms of latency. The design enables the designer to change the suboptimal algorithms according to the bit error rate (BER) performance requirement. Unlike many other programmable turbo decoder implementations, quadratic polynomial permutation (QPP) interleaver is used in this work for contention-free memory access and to make the processor 3GPP LTE compliant. Several optimization techniques to enable real time processing on programmable platforms are introduced. Using our method, with a single iteration 31.32 Mbps throughput is achieved for the max-log-MAP algorithm for a clock frequency of 200 MHz.

preprint2015arXiv

Design of a Unified Transport Triggered Processor for LDPC/Turbo Decoder

This paper summarizes the design of a programmable processor with transport triggered architecture (TTA) for decoding LDPC and turbo codes. The processor architecture is designed in such a manner that it can be programmed for LDPC or turbo decoding for the purpose of internetworking and roaming between different networks. The standard trellis based maximum a posteriori (MAP) algorithm is used for turbo decoding. Unlike most other implementations, a supercode based sum-product algorithm is used for the check node message computation for LDPC decoding. This approach ensures the highest hardware utilization of the processor architecture for the two different algorithms. Up to our knowledge, this is the first attempt to design a TTA processor for the LDPC decoder. The processor is programmed with a high level language to meet the time-to-market requirement. The optimization techniques and the usage of the function units for both algorithms are explained in detail. The processor achieves 22.64 Mbps throughput for turbo decoding with a single iteration and 10.12 Mbps throughput for LDPC decoding with five iterations for a clock frequency of 200 MHz.