Source author record

François Pitié

François Pitié appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision | eess.IV | Multimedia | eess.SP

Catalog footprint

What is connected

11 works

4 topics

4 close collaborators


preprint | 2026 | arXiv

Efficient Dense Matching for Enhanced Gaussian Splatting Using AV1 Motion Vectors

3D Gaussian Splatting (3DGS) has emerged as a prominent framework for real-time, photorealistic scene reconstruction, offering significant speed-ups over Neural Radiance Fields (NeRF). However, the fidelity of 3DGS representations remains heavily dependent on the quality of the initial point cloud. While standard Structure-from-Motion (SfM) pipelines using COLMAP provide adequate initialisation, they often suffer from high computational costs and sparsity in textureless regions, which degrades subsequent reconstruction accuracy and convergence speed. In this work, we introduce an AV1-based feature detection and matching pipeline that significantly reduces SfM processing overhead. By leveraging motion vectors inherent to the AV1 video codec, we bypass computationally expensive exhaustive matching while maintaining geometric robustness. Our pipeline produces substantially denser point clouds, with up to eight times as many points as classical SfM. We demonstrate that this enhanced initialisation directly improves 3DGS performance, yielding a 9-point increase in VMAF and a 63% average reduction in training time required to reach baseline quality. The project page: https://sigmedia.tv/AV1-3DGS.github.io/
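To illustrate the general idea of reusing codec motion vectors as correspondences, the sketch below turns per-block motion vectors into point pairs between a frame and its reference frame. It is not the authors' pipeline: the `BlockMV` record, the `mv_to_matches` function and the sign convention (the vector points from the current block to its position in the reference frame) are all hypothetical, and `units_per_pixel=8` assumes eighth-pixel vector units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockMV:
    """Hypothetical record for one inter-coded block (not a real decoder API)."""
    x: int     # top-left corner of the block in the current frame, in pixels
    y: int
    w: int     # block width and height, in pixels
    h: int
    mv_x: int  # displacement towards the reference frame, in sub-pixel units
    mv_y: int


def mv_to_matches(blocks, units_per_pixel=8):
    """Turn block motion vectors into (current_xy, reference_xy) point pairs.

    units_per_pixel=8 is an assumption (eighth-pixel motion vector units).
    """
    matches = []
    for b in blocks:
        # Use the block centre as the matched point in the current frame.
        cx = b.x + b.w / 2.0
        cy = b.y + b.h / 2.0
        # Displace the centre by the motion vector, converted to pixels.
        rx = cx + b.mv_x / units_per_pixel
        ry = cy + b.mv_y / units_per_pixel
        matches.append(((cx, cy), (rx, ry)))
    return matches


blocks = [BlockMV(x=16, y=32, w=16, h=16, mv_x=12, mv_y=-4)]
print(mv_to_matches(blocks))
```

Each coded block yields one candidate match without any descriptor comparison, which is where the density and the saving over exhaustive matching would come from. A real pipeline would still need to filter unreliable vectors before triangulation.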

preprint | 2022 | arXiv

Frame-type Sensitive RDO Control for Content-Adaptive-encoding

Video transcoding is an increasingly important application in the streaming media industry. It has become important to investigate the optimisation of transcoder parameters for a single clip simply because of the immense number of playbacks for popular clips. In this paper, we explore the use of a canned optimiser to estimate the optimal RD tradeoff achievable for a particular clip. We show that by adjusting the Lagrange multiplier in RD optimisation on keyframes alone we can achieve more than 10× the previous BD-Rate gains possible without affecting quality for any operating point.

preprint | 2022 | arXiv

Near Optimal Per-Clip Lagrangian Multiplier Prediction in HEVC

The majority of internet traffic is video content. This drives the demand for video compression to deliver high quality video at low target bitrates. Optimising the parameters of a video codec for a specific video clip (per-clip optimisation) has been shown to yield significant bitrate savings. In previous work we have shown that per-clip optimisation of the Lagrangian multiplier leads to up to 24% BD-Rate improvement. A key component of these algorithms is modeling the R-D characteristic across the appropriate bitrate range. This is computationally heavy as it usually involves repeated video encodes of the high resolution material at different parameter settings. This work focuses on reducing this computational load by deploying a NN operating on lower bandwidth features. Our system achieves BD-Rate improvement in approximately 90% of a large corpus with comparable results to previous work in direct optimisation.
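BD-Rate, the metric quoted throughout these abstracts, summarises the average bitrate difference between two rate-distortion curves at equal quality. The sketch below is a simplified stand-in: the standard Bjøntegaard calculation fits a cubic polynomial to log-rate as a function of quality, whereas this version uses piecewise-linear interpolation so that it runs with the standard library alone. The function names and sample numbers are illustrative assumptions, not values from the paper.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    x = min(max(x, xs[0]), xs[-1])  # guard against rounding at the ends
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("need at least two points")


def bd_rate(ref, test, samples=100):
    """Approximate BD-Rate in percent for lists of (bitrate, quality) points.

    Negative values mean `test` needs less bitrate than `ref` at equal quality.
    Simplification: linear interpolation instead of the usual cubic fit.
    """
    ref = sorted(ref, key=lambda p: p[1])
    test = sorted(test, key=lambda p: p[1])
    q_ref = [q for _, q in ref]
    q_test = [q for _, q in test]
    lr_ref = [math.log(r) for r, _ in ref]
    lr_test = [math.log(r) for r, _ in test]
    # Only the quality range covered by both curves can be compared.
    lo = max(q_ref[0], q_test[0])
    hi = min(q_ref[-1], q_test[-1])
    if hi <= lo:
        raise ValueError("quality ranges do not overlap")
    diffs = []
    for i in range(samples + 1):
        q = lo + (hi - lo) * i / samples
        diffs.append(_interp(q, q_test, lr_test) - _interp(q, q_ref, lr_ref))
    # Trapezoidal average of the log-rate difference over the shared range.
    mean_diff = (sum(diffs) - 0.5 * (diffs[0] + diffs[-1])) / samples
    return (math.exp(mean_diff) - 1.0) * 100.0


# Made-up operating points: (bitrate in kbps, PSNR in dB).
reference = [(1000, 32.0), (2000, 35.0), (4000, 38.0), (8000, 41.0)]
candidate = [(0.8 * r, q) for r, q in reference]  # 20% fewer bits everywhere
print(round(bd_rate(reference, candidate), 3))
```

Every point on each curve costs one full encode, which is why modelling the R-D characteristic is the expensive step that the abstract sets out to reduce.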

preprint | 2022 | arXiv

Per Clip Lagrangian Multiplier Optimisation for HEVC

The majority of internet traffic is video content. This drives the demand for video compression in order to deliver high quality video at low target bitrates. This paper investigates the impact of adjusting the rate distortion equation on compression performance. A constant of proportionality, k, is used to modify the Lagrange multiplier used in H.265 (HEVC). Direct optimisation methods are deployed to maximise BD-Rate improvement for a particular clip. This leads to up to 21% BD-Rate improvement for an individual clip. Furthermore we use a more realistic corpus of material provided by YouTube. The results show that direct optimisation using BD-rate as the objective function can lead to further gains in bitrate savings that are not available with previous approaches.
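The role of k can be seen in a toy version of rate-distortion optimised mode selection, where the encoder picks the candidate that minimises J = D + k·λ·R. The sketch below is only an illustration under that assumed form of the cost: the `Candidate` record, the candidate names and all numbers are invented, and a real HEVC encoder evaluates far more decisions than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical coding choice with its measured distortion and rate."""
    name: str
    distortion: float  # e.g. sum of squared errors
    rate: float        # bits needed to code this choice


def rd_cost(cand, lam, k=1.0):
    """Lagrangian cost J = D + (k * lambda) * R, with k scaling the multiplier."""
    return cand.distortion + k * lam * cand.rate


def best_candidate(candidates, lam, k=1.0):
    """Return the candidate with the lowest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam, k))


candidates = [
    Candidate("skip", distortion=900.0, rate=2.0),
    Candidate("inter", distortion=400.0, rate=40.0),
    Candidate("intra", distortion=100.0, rate=120.0),
]

lam = 5.0  # illustrative encoder default, not a value from the paper
for k in (0.5, 1.0, 4.0):
    print(k, best_candidate(candidates, lam, k).name)
```

A smaller k makes bits cheaper relative to distortion, so the encoder leans towards higher-quality choices, while a larger k pushes it towards cheaper ones. Per-clip optimisation searches for the k that gives the best BD-Rate for that particular clip.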

preprint | 2022 | arXiv

Per-clip adaptive Lagrangian multiplier optimisation with low-resolution proxies

This work focuses on reducing the computational cost of repeated video encodes by using a lower resolution clip as a proxy. Features extracted from the low resolution clip are used to learn an optimal Lagrange multiplier for rate control on the original resolution clip. In addition to reducing the computational cost and encode time by using lower resolution clips, we also investigate the use of older but faster codecs such as H.264 to create proxies. This work shows that the computational load is reduced by 22 times using 144p proxies. Our tests are based on the YouTube UGC dataset, hence our results are based on a practical instance of the adaptive bitrate encoding problem. Further improvements are possible by optimising the placement and sparsity of operating points required for the rate distortion curves.

preprint | 2022 | arXiv

Per-clip and per-bitrate adaptation of the Lagrangian multiplier in video coding

In the past ten years there have been significant developments in optimization of transcoding parameters on a per-clip rather than per-genre basis. In our recent work we have presented per-clip optimization for the Lagrangian multiplier in rate-controlled compression, which yielded BD-Rate improvements of approximately 2% across a corpus of videos using HEVC. However, in a video streaming application, the focus is on optimizing the rate/distortion tradeoff at a particular bitrate and not on average across a range of performance. We observed in previous work that a particular multiplier might give BD-Rate improvements over a certain range of bitrates, but not the entire range. Using different parameters across the range would improve gains overall. Therefore here we present a framework for choosing the best Lagrangian multiplier on a per-operating point basis across a range of bitrates. In effect, we are trying to find the para-optimal gain across bitrate and distortion for a single clip. In the experiments presented we employ direct optimization techniques to estimate this Lagrangian parameter path for approximately 2,000 video clips. The clips are primarily from the YouTube-UGC dataset. We optimize both for bitrate savings as well as distortion metrics (PSNR, SSIM).

preprint | 2020 | arXiv

$F$, $B$, Alpha Matting

Cutting out an object and estimating its opacity mask, known as image matting, is a key task in many image editing applications. Deep learning approaches have made significant progress by adapting the encoder-decoder architecture of segmentation networks. However, most of the existing networks only predict the alpha matte, and post-processing methods must then be used to recover the original foreground and background colours in the transparent regions. Recently, two methods have shown improved results by also estimating the foreground colours, but at a significant computational and memory cost. In this paper, we propose a low-cost modification to alpha matting networks to also predict the foreground and background colours. We study variations of the training regime and explore a wide range of existing and novel loss functions for the joint prediction. Our method achieves state-of-the-art performance on the Adobe Composition-1k dataset for alpha matte and composite colour quality. It is also the current best performing method on the alphamatting.com online evaluation.
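The reason the foreground colour matters follows from the standard compositing equation, C = αF + (1 − α)B, which holds per pixel and per colour channel. The sketch below applies it to a single pixel with made-up values; the `composite` function is illustrative only and is not the network or loss described in the paper.

```python
def composite(alpha, fg, bg):
    """Compositing equation C = alpha * F + (1 - alpha) * B for one pixel."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


# A semi-transparent pixel: red foreground over a blue background.
alpha = 0.25
foreground = (1.0, 0.0, 0.0)
old_background = (0.0, 0.0, 1.0)
observed = composite(alpha, foreground, old_background)

# Placing the object on a new background needs the true foreground colour.
# Reusing the observed colour instead would leak the old blue background in.
new_background = (0.0, 1.0, 0.0)
correct = composite(alpha, foreground, new_background)
leaky = composite(alpha, observed, new_background)
print(observed, correct, leaky)
```

With alpha alone, an editor only has the observed colour for semi-transparent pixels, so the old background shows through in the new composite. Predicting F and B alongside alpha removes the need for a separate colour recovery step.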

preprint | 2020 | arXiv

An Advert Creation System for 3D Product Placements

Over the past decade, the evolution of video-sharing platforms has attracted a significant amount of investment in contextual advertising. The common contextual advertising platforms utilize the information provided by users to integrate 2D visual ads into videos. The existing platforms face many technical challenges such as ad integration with respect to occluding objects and 3D ad placement. This paper presents a Video Advertisement Placement & Integration (Adverts) framework, which is capable of perceiving the 3D geometry of the scene and camera motion to blend 3D virtual objects in videos and create the illusion of reality. The proposed framework contains several modules such as monocular depth estimation, object segmentation, background-foreground separation, alpha matting and camera tracking. Our experiments conducted using the Adverts framework indicate the significant potential of this system in contextual ad integration, and in pushing the limits of the advertising industry using mixed reality technologies.

preprint | 2020 | arXiv

Background Matting

The current state-of-the-art alpha matting methods mainly rely on the trimap as the secondary and only guidance to estimate alpha. This paper investigates the effects of utilising the background information as well as the trimap in the process of alpha calculation. To achieve this goal, a state-of-the-art method, AlphaGAN, is adopted and modified to process the background information as an extra input channel. Extensive experiments are performed to analyse the effect of the background information in image and video matting, such as training with mildly and heavily distorted backgrounds. Based on the quantitative evaluations performed on the Adobe Composition-1k dataset, the proposed pipeline significantly outperforms the state-of-the-art methods using AlphaMatting benchmark metrics.

preprint | 2020 | arXiv

Getting to 99% Accuracy in Interactive Segmentation

Interactive object cutout tools are the cornerstone of the image editing workflow. Recent deep-learning based interactive segmentation algorithms have made significant progress in handling complex images and rough binary selections can typically be obtained with just a few clicks. Yet, deep learning techniques tend to plateau once this rough selection has been reached. In this work, we interpret this plateau as the inability of current algorithms to sufficiently leverage each user interaction and also as the limitations of current training/testing datasets. We propose a novel interactive architecture and a novel training scheme that are both tailored to better exploit the user workflow. We also show that significant improvements can be further gained by introducing a synthetic training dataset that is specifically designed for complex object boundaries. Comprehensive experiments support our approach, and our network achieves state of the art performance.

preprint | 2016 | arXiv

An Alternative Matting Laplacian

Cutting out an object and estimating its transparency mask is a key task in many applications. We take on the work on closed-form matting by Levin et al., which is used at the core of many matting techniques, and propose an alternative formulation that offers more flexible controls over the matting priors. We also show that this new approach is efficient at upscaling transparency maps from coarse estimates.
