Source author record

Fabian Brand

Fabian Brand appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

eess.IV Computer Vision Artificial Intelligence

Catalog footprint

What is connected

6works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2023arXiv

Learning Frequency-Specific Quantization Scaling in VVC for Standard-Compliant Task-driven Image Coding

Today, visual data is often analyzed by a neural network without any human being involved, which demands for specialized codecs. For standard-compliant codec adaptations towards certain information sinks, HEVC or VVC provide the possibility of frequency-specific quantization with scaling lists. This is a well-known method for the human visual system, where scaling lists are derived from psycho-visual models. In this work, we employ scaling lists when performing VVC intra coding for neural networks as information sink. To this end, we propose a novel data-driven method to obtain optimal scaling lists for arbitrary neural networks. Experiments with Mask R-CNN as information sink reveal that coding the Cityscapes dataset with the proposed scaling lists result in peak bitrate savings of 8.9 % over VVC with constant quantization. By that, our approach also outperforms scaling lists optimized for the human visual system. The generated scaling lists can be found under https://github.com/FAU-LMS/VCM_scaling_lists.

preprint2022arXiv

A Low-Parametric Model for Bit-Rate Estimation of VVC Residual Coding

There are many tasks within video compression which require fast bit rate estimation. As an example, rate-control algorithms are only feasible because it is possible to estimate the required bit rate without needing to encode the entire block. With residual coding technology becoming more and more sophisticated, the corresponding bit rate models require more advanced features. In this work, we propose a set of four features together with a linear model, which is able to estimate the rate of arbitrary residual blocks which were compressed using the VVC standard. Our method outperforms other methods which were used for the same task both in terms of mean absolute error and mean relative error. Our model deviates by less than 4 bit on average over a large dataset of natural images.

preprint2022arXiv

A Novel End-To-End Network for Reconstruction of Non-Regularly Sampled Image Data Using Locally Fully Connected Layers

Quarter sampling and three-quarter sampling are novel sensor concepts that enable the acquisition of higher resolution images without increasing the number of pixels. This is achieved by non-regularly covering parts of each pixel of a low-resolution sensor such that only one quadrant or three quadrants of the sensor area of each pixel is sensitive to light. Combining a properly designed mask and a high-quality reconstruction algorithm, a higher image quality can be achieved than using a low-resolution sensor and subsequent upsampling. For the latter case, the image quality can be further enhanced using super resolution algorithms such as the very deep super resolution network (VDSR). In this paper, we propose a novel end-to-end neural network to reconstruct high resolution images from non-regularly sampled sensor data. The network is a concatenation of a locally fully connected reconstruction network (LFCR) and a standard VDSR network. Altogether, using a three-quarter sampling sensor with our novel neural network layout, the image quality in terms of PSNR for the Urban100 dataset can be increased by 2.96 dB compared to the state-of-the-art approach. Compared to a low-resolution sensor with VDSR, a gain of 1.11 dB is achieved.

preprint2022arXiv

Enhanced Image Reconstruction From Quarter Sampling Measurements Using An Adapted Very Deep Super Resolution Network

Quarter sampling is a novel sensor concept that enables the acquisition of higher resolution images without increasing the number of pixels. This is achieved by covering three quarters of each pixel of a low-resolution sensor such that only one quadrant of the sensor area of each pixel is sensitive to light. By randomly masking different parts, effectively a non-regular sampling of a higher resolution image is performed. Combining a properly designed mask and a high-quality reconstruction algorithm, a higher image quality can be achieved than using a low-resolution sensor and subsequent upsampling. For the latter case, the image quality can be enhanced using super resolution algorithms. Recently, algorithms based on machine learning such as the Very Deep Super Resolution network (VDSR) proofed to be successful for this task. In this work, we transfer the concepts of VDSR to the special case of quarter sampling. Besides adapting the network layout to take advantage of the case of quarter sampling, we introduce a novel data augmentation technique enabled by quarter sampling. Altogether, using the quarter sampling sensor, the image quality in terms of PSNR can be increased by +0.67 dB for the Urban 100 dataset compared to using a low-resolution sensor with VDSR.

preprint2022arXiv

Learning True Rate-Distortion-Optimization for End-To-End Image Compression

Even though rate-distortion optimization is a crucial part of traditional image and video compression, not many approaches exist which transfer this concept to end-to-end-trained image compression. Most frameworks contain static compression and decompression models which are fixed after training, so efficient rate-distortion optimization is not possible. In a previous work, we proposed RDONet, which enables an RDO approach comparable to adaptive block partitioning in HEVC. In this paper, we enhance the training by introducing low-complexity estimations of the RDO result into the training. Additionally, we propose fast and very fast RDO inference modes. With our novel training method, we achieve average rate savings of 19.6% in MS-SSIM over the previous RDONet model, which equals rate savings of 27.3% over a comparable conventional deep image coder.

preprint2022arXiv

Video Coding for Machines with Feature-Based Rate-Distortion Optimization

Common state-of-the-art video codecs are optimized to deliver a low bitrate by providing a certain quality for the final human observer, which is achieved by rate-distortion optimization (RDO). But, with the steady improvement of neural networks solving computer vision tasks, more and more multimedia data is not observed by humans anymore, but directly analyzed by neural networks. In this paper, we propose a standard-compliant feature-based RDO (FRDO) that is designed to increase the coding performance, when the decoded frame is analyzed by a neural network in a video coding for machine scenario. To that extent, we replace the pixel-based distortion metrics in conventional RDO of VTM-8.0 with distortion metrics calculated in the feature space created by the first layers of a neural network. Throughout several tests with the segmentation network Mask R-CNN and single images from the Cityscapes dataset, we compare the proposed FRDO and its hybrid version HFRDO with different distortion measures in the feature space against the conventional RDO. With HFRDO, up to 5.49 % bitrate can be saved compared to the VTM-8.0 implementation in terms of Bjøntegaard Delta Rate and using the weighted average precision as quality metric. Additionally, allowing the encoder to vary the quantization parameter results in coding gains for the proposed HFRDO of up 9.95 % compared to conventional VTM.

Fabian Brand

What is connected

Connect this record

See the researcher in context

Building this map preview

6 published item(s)

Learning Frequency-Specific Quantization Scaling in VVC for Standard-Compliant Task-driven Image Coding

A Low-Parametric Model for Bit-Rate Estimation of VVC Residual Coding

A Novel End-To-End Network for Reconstruction of Non-Regularly Sampled Image Data Using Locally Fully Connected Layers

Enhanced Image Reconstruction From Quarter Sampling Measurements Using An Adapted Very Deep Super Resolution Network

Learning True Rate-Distortion-Optimization for End-To-End Image Compression

Video Coding for Machines with Feature-Based Rate-Distortion Optimization