Source author record

Nikolaus Binder

Nikolaus Binder appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Graphics eess.SP Information Theory Machine Learning math.IT math.NA Numerical Analysis

Catalog footprint

What is connected

3works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

GPU-accelerated partially linear multiuser detection for 5G and beyond URLLC systems

In this feasibility study, we have implemented a recently proposed partially linear multiuser detection algorithm in reproducing kernel Hilbert spaces (RKHSs) on a GPU-accelerated platform. Partially linear multiuser detection, which combines the robustness of linear detection with the power of nonlinear methods, has been proposed for a massive connectivity scenario with the non-orthogonal multiple access (NOMA). This is a promising approach, but detecting payloads within a received orthogonal frequency division multiplexing (OFDM) radio frame requires the execution of a large number of inner product operations, which are the main computational burden of the algorithm. Although inner-product operations consist of simple kernel evaluations, their vast number poses a challenge in ultra-low latency (ULL) applications, because the time needed for computing the inner products might exceed the sub-millisecond latency requirement. To address this problem, this study demonstrates the acceleration of the inner-product operations through massive parallelization. The result is a GPU-accelerated real-time OFDM receiver that enables sub-millisecond latency detection to meet the requirements of 5th generation (5G) and beyond ultra-reliable and low latency communications (URLLC) systems. Moreover, the parallelization and acceleration techniques explored and demonstrated in this study can be extended to many other signal processing algorithms in Hilbert spaces, such as those based on projection onto convex sets (POCS) and adaptive projected subgradient method (APSM) algorithms. Experimental results and comparisons with the state-of-art confirm the effectiveness of our techniques.

preprint2022arXiv

Rendering along the Hilbert Curve

Based on the seminal work on Array-RQMC methods and rank-1 lattice sequences by Pierre L'Ecuyer and collaborators, we introduce efficient deterministic algorithms for image synthesis. Enumerating a low discrepancy sequence along the Hilbert curve superimposed on the raster of pixels of an image, we achieve noise characteristics that are desirable with respect to the human visual system, especially at very low sampling rates. As compared to the state of the art, our simple algorithms neither require randomization, nor costly optimization, nor lookup tables. We analyze correlations of space-filling curves and low discrepancy sequences, and demonstrate the benefits of the new algorithms in a professional, massively parallel light transport simulation and rendering system.

preprint2021arXiv

Massively Parallel Path Space Filtering

Restricting path tracing to a small number of paths per pixel for performance reasons rarely achieves a satisfactory image quality for scenes of interest. However, path space filtering may dramatically improve the visual quality by sharing information across vertices of paths classified as proximate. Unlike screen space-based approaches, these paths neither need to be present on the screen, nor is filtering restricted to the first intersection with the scene. While searching proximate vertices had been more expensive than filtering in screen space, we greatly improve over this performance penalty by storing, updating, and looking up the required information in a hash table. The keys are constructed from jittered and quantized information, such that only a single query very likely replaces costly neighborhood searches. A massively parallel implementation of the algorithm is demonstrated on a graphics processing unit (GPU).