Abstract:Video frame interpolation (VFI) offers a way to generate intermediate frames between consecutive frames of a video sequence. Although the development of advanced frame interpolation algorithms has received increased attention in recent years, assessing the perceptual quality of interpolated content remains an ongoing area of research. In this paper, we investigate simple ways to process motion fields, with the purposes of using them as video quality metric for evaluating frame interpolation algorithms. We evaluate these quality metrics using the BVI-VFI dataset which contains perceptual scores measured for interpolated sequences. From our investigation we propose a motion metric based on measuring the divergence of motion fields. This metric correlates reasonably with these perceptual scores (PLCC=0.51) and is more computationally efficient (x2.7 speedup) compared to FloLPIPS (a well known motion-based metric). We then use our new proposed metrics to evaluate a range of state of the art frame interpolation metrics and find our metrics tend to favour more perceptual pleasing interpolated frames that may not score highly in terms of PSNR or SSIM.
Abstract:The success of modern Deep Neural Network (DNN) approaches can be attributed to the use of complex optimization criteria beyond standard losses such as mean absolute error (MAE) or mean squared error (MSE). In this work, we propose a novel method of utilising a no-reference sharpness metric Q introduced by Zhu and Milanfar for removing out-of-focus blur from images. We also introduce a novel dataset of real-world out-of-focus images for assessing restoration models. Our fine-tuned method produces images with a 7.5 % increase in perceptual quality (LPIPS) as compared to a standard model trained only on MAE. Furthermore, we observe a 6.7 % increase in Q (reflecting sharper restorations) and 7.25 % increase in PSNR over most state-of-the-art (SOTA) methods.
Abstract:At practical streaming bitrates, traditional video compression pipelines frequently lead to visible artifacts that degrade perceptual quality. This submission couples the effectiveness of a neural post-processor with a different dynamic optimsation strategy for achieving an improved bitrate/quality compromise. The neural post-processor is refined via adversarial training and employs perceptual loss functions. By optimising the post-processor and encoder directly our method demonstrates significant improvement in video fidelity. The neural post-processor achieves substantial VMAF score increases of +6.72 and +1.81 at bitrates of 50 kb/s and 500 kb/s respectively.
Abstract:Image based Deep Feature Quality Metrics (DFQMs) have been shown to better correlate with subjective perceptual scores over traditional metrics. The fundamental focus of these DFQMs is to exploit internal representations from a large scale classification network as the metric feature space. Previously, no attention has been given to the problem of identifying which layers are most perceptually relevant. In this paper we present a new method for selecting perceptually relevant layers from such a network, based on a neuroscience interpretation of layer behaviour. The selected layers are treated as a hyperparameter to the critic network in a W-GAN. The critic uses the output from these layers in the preliminary stages to extract perceptual information. A video enhancement network is trained adversarially with this critic. Our results show that the introduction of these selected features into the critic yields up to 10% (FID) and 15% (KID) performance increase against other critic networks that do not exploit the idea of optimised feature selection.