Source author record

Zhengfang Duanmu

Zhengfang Duanmu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

eess.IV, Multimedia, Computer Vision

Catalog footprint

What is connected

4 works

3 topics

4 close collaborators

Preprint, 2021, arXiv

Quantifying Visual Image Quality: A Bayesian View

Image quality assessment (IQA) models aim to establish a quantitative relationship between visual images and their perceptual quality as judged by human observers. IQA modeling plays a special bridging role between vision science and engineering practice, both as a test-bed for vision theories and computational biovision models, and as a powerful tool that could have a profound impact on a broad range of image processing, computer vision, and computer graphics applications, for design, optimization, and evaluation purposes. IQA research has enjoyed accelerated growth in the past two decades. Here we present an overview of IQA methods from a Bayesian perspective, with the goals of unifying a wide spectrum of IQA approaches under a common framework and providing useful references to fundamental concepts accessible to vision scientists and image processing practitioners. We discuss the implications of the successes and limitations of modern IQA methods for biological vision and the prospect for vision science to inform the design of future artificial vision systems.

Preprint, 2020, arXiv

Assessing the Quality-of-Experience of Adaptive Bitrate Video Streaming

The diversity of video delivery pipelines poses a grand challenge to the evaluation of adaptive bitrate (ABR) streaming algorithms and objective quality-of-experience (QoE) models. Here we introduce the largest subject-rated database of its kind to date, namely WaterlooSQoE-IV, consisting of 1350 adaptive streaming videos created from diverse source contents, video encoders, network traces, ABR algorithms, and viewing devices. We collect human opinions for each video with a series of carefully designed subjective experiments. Subsequent data analysis and testing/comparison of ABR algorithms and QoE models using the database lead to a series of novel observations and interesting findings, in terms of the effectiveness of subjective experiment methodologies, the interactions between user experience and source content, viewing device and encoder type, the heterogeneities in the bias and preference of user experiences, the behaviors of ABR algorithms, and the performance of objective QoE models. Most importantly, our results suggest that a better objective QoE model, or a better understanding of human perceptual experience and behavior, is the dominant factor in improving the performance of ABR algorithms, as opposed to advanced optimization frameworks, machine learning strategies or bandwidth predictors, on which the majority of ABR research has focused in the past decade. On the other hand, our performance evaluation of 11 QoE models shows only a moderate correlation between state-of-the-art QoE models and subjective ratings, implying room for improvement in both QoE modeling and ABR algorithms. The database is made publicly available at: https://ece.uwaterloo.ca/~zduanmu/waterloosqoe4/.
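
The abstract reports a correlation between QoE model predictions and subjective ratings but does not name the measure. As a minimal sketch, assuming a Spearman rank correlation (a common choice in quality assessment, not confirmed by the abstract), the comparison could look like the following. The function names and the sample numbers are hypothetical and are not taken from WaterlooSQoE-IV.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson linear correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def srcc(predicted, subjective):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(predicted), ranks(subjective))


# Hypothetical model scores and mean opinion scores for four videos.
model_scores = [0.2, 0.5, 0.4, 0.9]
opinion_scores = [30, 55, 60, 80]
print(srcc(model_scores, opinion_scores))  # prints 0.8
```

A rank-based measure only checks whether the model orders the videos the way viewers do, so it is insensitive to any monotonic rescaling of the model output.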

Preprint, 2020, arXiv

Characterizing Generalized Rate-Distortion Performance of Video Coding: An Eigen Analysis Approach

Rate-distortion (RD) theory is at the heart of lossy data compression. Here we aim to model the generalized RD (GRD) trade-off between the visual quality of a compressed video and its encoding profiles (e.g., bitrate and spatial resolution). We first define the theoretical functional space $\mathcal{W}$ of the GRD function by analyzing its mathematical properties. We show that $\mathcal{W}$ is a convex set in a Hilbert space, inspiring a computational model of the GRD function, and a method of estimating model parameters from sparse measurements. To demonstrate the feasibility of our idea, we collect a large-scale database of real-world GRD functions, which turn out to live in a low-dimensional subspace of $\mathcal{W}$. Combining the GRD reconstruction framework and the learned low-dimensional space, we create a low-parameter eigen GRD method to accurately estimate the GRD function of a source video content from only a few queries. Experimental results on the database show that the learned GRD method significantly outperforms state-of-the-art empirical RD estimation methods both in accuracy and efficiency. Last, we demonstrate the promise of the proposed model in video codec comparison.
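
The idea of estimating a function in a learned low-dimensional subspace from a few queries can be illustrated with a small least-squares sketch. This is not the paper's eigen GRD method: it assumes a mean curve and basis vectors are already given, fits the basis coefficients to a handful of sampled points, and ignores the constraints that define $\mathcal{W}$. The names `solve` and `reconstruct`, the basis, and all numbers are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def reconstruct(mean, basis, sample_idx, sample_val):
    """Fit basis coefficients to sparse samples, then rebuild the full curve.

    mean: mean curve on a grid of N points.
    basis: k basis vectors, each of length N.
    sample_idx, sample_val: grid positions and measured values of the queries.
    """
    k = len(basis)
    residual = [v - mean[i] for i, v in zip(sample_idx, sample_val)]
    cols = [[b[i] for i in sample_idx] for b in basis]
    # Normal equations of the least-squares problem restricted to the samples.
    gram = [[sum(p * q for p, q in zip(cols[r], cols[c])) for c in range(k)]
            for r in range(k)]
    rhs = [sum(p * q for p, q in zip(cols[r], residual)) for r in range(k)]
    coef = solve(gram, rhs)
    return [mean[i] + sum(coef[j] * basis[j][i] for j in range(k))
            for i in range(len(mean))]


# Hypothetical quality curve over five bitrates, measured at three of them.
mean_curve = [50, 60, 70, 80, 85]
basis_vectors = [[1, 1, 1, 1, 1], [-2, -1, 0, 1, 2]]
estimate = reconstruct(mean_curve, basis_vectors, [0, 2, 4], [57, 73, 84])
print([round(v, 6) for v in estimate])
```

In this toy case the measured curve lies exactly in the span of the two basis vectors, so three queries recover all five points. With real data the fit would only be approximate, and the number of queries must be at least the number of basis vectors for the system to be solvable.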

Preprint, 2019, arXiv

Modeling Generalized Rate-Distortion Functions

Many multimedia applications require a precise understanding of the rate-distortion characteristics measured by the function relating visual quality to media attributes, which we term the generalized rate-distortion (GRD) function. In this study, we explore the GRD behavior of compressed digital videos in a three-dimensional space of bitrate, resolution, and viewing device/condition. Our analysis of a large-scale video dataset reveals that empirical parametric models are systematically biased while exhaustive search methods require excessive computation time to depict the GRD surfaces. By exploiting the properties that all GRD functions share, we develop a Robust Axial-Monotonic Clough-Tocher (RAMCT) interpolation method to model the GRD function. This model allows us to accurately reconstruct the complete GRD function of a source video content from a moderate number of measurements. To further reduce the computational cost, we present a novel sampling scheme based on a probabilistic model and an information measure. The proposed sampling method constructs a sequence of quality queries by minimizing the overall informativeness in the remaining samples. Experimental results show that the proposed algorithm significantly outperforms state-of-the-art approaches in accuracy and efficiency. Finally, we demonstrate the use of the proposed model in three applications: rate-distortion curve prediction, per-title encoding profile generation, and video encoder comparison.
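
The method's name suggests that monotonicity along an axis is one of the shared properties being exploited, for example quality not decreasing as bitrate increases at a fixed resolution. The sketch below is not RAMCT. It swaps in a much simpler one-dimensional technique, isotonic regression by pooling adjacent violators, to show how noisy quality measurements can be projected onto a non-decreasing curve. The function name and the sample values are hypothetical.

```python
def isotonic_nondecreasing(values):
    """Least-squares non-decreasing fit via the pool-adjacent-violators algorithm."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge the last two blocks while their means violate the ordering.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Hypothetical quality scores at increasing bitrates, with one noisy dip.
noisy_quality = [1, 3, 2, 4]
print(isotonic_nondecreasing(noisy_quality))  # prints [1.0, 2.5, 2.5, 4.0]
```

The dip between the second and third measurements is replaced by their average, which is the closest non-decreasing sequence in the least-squares sense.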