Source author record

Olivier Deforges

Olivier Deforges appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

eess.IV eess.SP Computer Vision Cryptography and Security Machine Learning Neural and Evolutionary Computing

Catalog footprint

What is connected

7works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Adversarial Example Detection for DNN Models: A Review and Experimental Comparison

Deep learning (DL) has shown great success in many human-related tasks, which has led to its adoption in many computer vision based applications, such as security surveillance systems, autonomous vehicles and healthcare. Such safety-critical applications have to draw their path to success deployment once they have the capability to overcome safety-critical challenges. Among these challenges are the defense against or/and the detection of the adversarial examples (AEs). Adversaries can carefully craft small, often imperceptible, noise called perturbations to be added to the clean image to generate the AE. The aim of AE is to fool the DL model which makes it a potential risk for DL applications. Many test-time evasion attacks and countermeasures,i.e., defense or detection methods, are proposed in the literature. Moreover, few reviews and surveys were published and theoretically showed the taxonomy of the threats and the countermeasure methods with little focus in AE detection methods. In this paper, we focus on image classification task and attempt to provide a survey for detection methods of test-time evasion attacks on neural network classifiers. A detailed discussion for such methods is provided with experimental results for eight state-of-the-art detectors under different scenarios on four datasets. We also provide potential challenges and future perspectives for this research direction.

preprint2022arXiv

CAESR: Conditional Autoencoder and Super-Resolution for Learned Spatial Scalability

In this paper, we present CAESR, an hybrid learning-based coding approach for spatial scalability based on the versatile video coding (VVC) standard. Our framework considers a low-resolution signal encoded with VVC intra-mode as a base-layer (BL), and a deep conditional autoencoder with hyperprior (AE-HP) as an enhancement-layer (EL) model. The EL encoder takes as inputs both the upscaled BL reconstruction and the original image. Our approach relies on conditional coding that learns the optimal mixture of the source and the upscaled BL image, enabling better performance than residual coding. On the decoder side, a super-resolution (SR) module is used to recover high-resolution details and invert the conditional coding process. Experimental results have shown that our solution is competitive with the VVC full-resolution intra coding while being scalable.

preprint2021arXiv

Light Field Image Coding Using VVC standard and View Synthesis based on Dual Discriminator GAN

Light field (LF) technology is considered as a promising way for providing a high-quality virtual reality (VR) content. However, such an imaging technology produces a large amount of data requiring efficient LF image compression solutions. In this paper, we propose a LF image coding method based on a view synthesis and view quality enhancement techniques. Instead of transmitting all the LF views, only a sparse set of reference views are encoded and transmitted, while the remaining views are synthesized at the decoder side. The transmitted views are encoded using the versatile video coding (VVC) standard and are used as reference views to synthesize the dropped views. The selection of non-reference dropped views is performed using a rate-distortion optimization based on the VVC temporal scalability. The dropped views are reconstructed using the LF dual discriminator GAN (LF-D2GAN) model. In addition, to ensure that the quality of the views is consistent, at the decoder, a quality enhancement procedure is performed on the reconstructed views allowing smooth navigation across views. Experimental results show that the proposed method provides high coding performance and overcomes the state-of-the-art LF image compression methods by -36.22% in terms of BD-BR and 1.35 dB in BD-PSNR. The web page of this work is available at https://naderbakir79.github.io/LFD2GAN.html.

preprint2020arXiv

A Fixation-based 360° Benchmark Dataset for Salient Object Detection

Fixation prediction (FP) in panoramic contents has been widely investigated along with the booming trend of virtual reality (VR) applications. However, another issue within the field of visual saliency, salient object detection (SOD), has been seldom explored in 360° (or omnidirectional) images due to the lack of datasets representative of real scenes with pixel-level annotations. Toward this end, we collect 107 equirectangular panoramas with challenging scenes and multiple object classes. Based on the consistency between FP and explicit saliency judgements, we further manually annotate 1,165 salient objects over the collected images with precise masks under the guidance of real human eye fixation maps. Six state-of-the-art SOD models are then benchmarked on the proposed fixation-based 360° image dataset (F-360iSOD), by applying a multiple cubic projection-based fine-tuning method. Experimental results show a limitation of the current methods when used for SOD in panoramic images, which indicates the proposed dataset is challenging. Key issues for 360° SOD is also discussed. The proposed dataset is available at https://github.com/PanoAsh/F-360iSOD.

preprint2020arXiv

Binary Probability Model for Learning Based Image Compression

In this paper, we propose to enhance learned image compression systems with a richer probability model for the latent variables. Previous works model the latents with a Gaussian or a Laplace distribution. Inspired by binary arithmetic coding , we propose to signal the latents with three binary values and one integer, with different probability models. A relaxation method is designed to perform gradient-based training. The richer probability model results in a better entropy coding leading to lower rate. Experiments under the Challenge on Learned Image Compression (CLIC) test conditions demonstrate that this method achieves 18% rate saving compared to Gaussian or Laplace models.

preprint2020arXiv

Extending 2D Saliency Models for Head Movement Prediction in 360-degree Images using CNN-based Fusion

Saliency prediction can be of great benefit for 360-degree image/video applications, including compression, streaming , rendering and viewpoint guidance. It is therefore quite natural to adapt the 2D saliency prediction methods for 360-degree images. To achieve this, it is necessary to project the 360-degree image to 2D plane. However, the existing projection techniques introduce different distortions, which provides poor results and makes inefficient the direct application of 2D saliency prediction models to 360-degree content. Consequently, in this paper, we propose a new framework for effectively applying any 2D saliency prediction method to 360-degree images. The proposed framework particularly includes a novel convolutional neural network based fusion approach that provides more accurate saliency prediction while avoiding the introduction of distortions. The proposed framework has been evaluated with five 2D saliency prediction methods, and the experimental results showed the superiority of our approach compared to the use of weighted sum or pixel-wise maximum fusion methods.

preprint2020arXiv

Versatile video coding and super-resolution for efficient delivery of 8K video with 4K backward-compatibility

In this paper, we propose, through an objective study, to compare and evaluate the performance of different coding approaches allowing the delivery of an 8K video signal with 4K backward-compatibility on broadcast networks. Presented approaches include simulcast of 8K and 4K single-layer signals encoded using High-Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC) standards, spatial scalability using SHVC with 4K base layer (BL) and 8K enhancement-layer (EL), and super-resolution applied on 4K VVC signal after decoding to reach 8K resolution. For up-scaling, we selected the deep-learning-based super-resolution method called Super-Resolution with Feedback Network (SRFBN) and the Lanczos interpolation filter. We show that the deep-learning-based approach achieves visual quality gain over simulcast, especially on bit-rates lower than 30Mb/s with average gain of 0.77dB, 0.015, and 7.97 for PSNR, SSIM, and VMAF, respectively and out-performs the Lanczos filter in average by 29% of BD-rate savings.

Olivier Deforges

What is connected

Connect this record

See the researcher in context

Building this map preview

7 published item(s)

Adversarial Example Detection for DNN Models: A Review and Experimental Comparison

CAESR: Conditional Autoencoder and Super-Resolution for Learned Spatial Scalability

Light Field Image Coding Using VVC standard and View Synthesis based on Dual Discriminator GAN

A Fixation-based 360° Benchmark Dataset for Salient Object Detection

Binary Probability Model for Learning Based Image Compression

Extending 2D Saliency Models for Head Movement Prediction in 360-degree Images using CNN-based Fusion

Versatile video coding and super-resolution for efficient delivery of 8K video with 4K backward-compatibility