Abstract: Rate control allocates bits efficiently across frames to meet a target bitrate while maintaining quality. Conventional two-pass rate control (2pRC) in Versatile Video Coding (VVC) relies on analytical rate-QP models that often fail to capture nonlinear spatio-temporal variations, causing quality instability, and it incurs high complexity due to multiple trial encodes. This paper proposes a content-adaptive framework that predicts frame-level bit consumption with a Random Forest regressor, using lightweight features from the Video Complexity Analyzer (VCA) and quantization parameters (QPs) as inputs. On ultra-high-definition sequences encoded with VVenC, the model's predictions correlate strongly with ground truth, yielding R² values of 0.93, 0.88, and 0.77 for I-, P-, and B-frames, respectively. Integrated into a rate-control loop, it achieves coding efficiency comparable to 2pRC while reducing total encoding time by 33.3%. The results show that VCA-driven bit prediction provides a computationally efficient and accurate alternative to conventional rate-QP models.
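To illustrate how a frame-level bit predictor might drive QP selection inside a rate-control loop, the sketch below scans the QP range and returns the lowest QP whose predicted bits fit a frame budget. It is a minimal sketch under stated assumptions, not the paper's method: `toy_bit_predictor`, `pick_qp`, the single `complexity` feature, and the "bits halve every 6 QP steps" rule are all hypothetical stand-ins for the trained Random Forest and the VCA feature set.

```python
from typing import Callable

# Assumed search range; VVC allows QP values 0..63.
QP_MIN, QP_MAX = 0, 63


def toy_bit_predictor(complexity: float, qp: int) -> float:
    """Hypothetical stand-in for a trained regressor (NOT the paper's model).

    Toy assumption: predicted bits scale with one complexity feature
    and halve for every 6 QP steps.
    """
    return 1000.0 * complexity * 2.0 ** (-(qp - 22) / 6.0)


def pick_qp(complexity: float, bit_budget: float,
            predict: Callable[[float, int], float] = toy_bit_predictor) -> int:
    """Return the lowest QP whose predicted bits fit within the budget."""
    for qp in range(QP_MIN, QP_MAX + 1):
        if predict(complexity, qp) <= bit_budget:
            return qp
    # Budget unreachable even at the coarsest quantizer: fall back to QP_MAX.
    return QP_MAX
```

Because the predictor is called instead of the encoder, each candidate QP costs one model evaluation rather than one trial encode, which is the kind of saving the abstract attributes to replacing trial encodes.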
Abstract: Conventional video encoders typically employ a fixed chroma subsampling format, such as YUV420, which may not reflect how chroma detail varies across content types. This can lead to suboptimal chroma quality and inefficient bitrate allocation. We propose an Adaptive Resolution-Chroma Subsampling (ARCS) framework that jointly optimizes spatial resolution and chroma subsampling to balance perceptual quality and decoding efficiency. ARCS selects an optimal (resolution, chroma format) pair for each bitrate by maximizing a composite quality-complexity objective, while enforcing monotonicity constraints to ensure smooth transitions between representations. Experimental results using x265 show that, at the same colorVideoVDP score, ARCS achieves on average 13.48% bitrate savings and a 62.18% reduction in decoding time (used here as a proxy for decoding energy) compared to fixed-format YUV444 encoding. The proposed framework introduces chroma adaptivity as a new control dimension for energy-efficient video streaming.
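The per-bitrate selection described above can be sketched as a greedy pass over an ascending bitrate ladder. Everything specific here is an assumption for illustration: the abstract does not give the form of the composite objective (taken here as `quality - weight * decode_time`) or of the monotonicity constraint (taken here as "resolution never decreases as bitrate increases"), and the names `Candidate` and `select_ladder` are hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    height: int         # spatial resolution in lines, e.g. 1080
    chroma: str         # chroma format label, e.g. "420" or "444"
    quality: float      # quality score at this bitrate (higher is better)
    decode_time: float  # decoding time in seconds (energy proxy)


def select_ladder(per_bitrate: list[list[Candidate]],
                  weight: float = 1.0) -> list[Candidate]:
    """Pick one candidate per bitrate; bitrates must be in ascending order."""
    ladder = []
    min_height = 0
    for candidates in per_bitrate:
        # Assumed monotonicity rule: do not drop below the previous resolution.
        # If no candidate satisfies it, fall back to the unconstrained set.
        allowed = [c for c in candidates if c.height >= min_height] or candidates
        # Assumed composite objective: quality minus weighted decoding time.
        best = max(allowed, key=lambda c: c.quality - weight * c.decode_time)
        ladder.append(best)
        min_height = max(min_height, best.height)
    return ladder
```

A greedy pass is the simplest way to respect the constraint but is not guaranteed to maximize the summed objective over the whole ladder; a dynamic-programming pass over the same candidates would be one way to do that.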
Abstract: Preparing high-quality 360-degree video for HTTP Adaptive Streaming requires encoding each sequence into multiple representations spanning different resolutions and quantization parameters (QPs). For ultra-high-resolution immersive content such as 8K 360-degree video, this process is computationally intensive due to the large number of representations and the high complexity of modern codecs. This paper investigates fast multirate encoding strategies that reduce encoding time by reusing encoder analysis information across QPs and resolutions. We evaluate two cross-resolution information-reuse pipelines that differ in how reference encodes propagate across resolutions: (i) a strict HD -> 4K -> 8K cascade with scaled analysis reuse, and (ii) a resolution-anchored scheme that initializes each resolution with its own highest-bitrate reference before guiding dependent encodes. In addition to evaluating these pipelines on standard equirectangular projection (ERP) content, we apply them to cubemap-projection (CMP) tiling, where each 360-degree frame is partitioned into independently encoded tiles. CMP introduces substantial parallelism while still benefiting from the proposed multirate analysis-reuse strategies. Experimental results using the SJTU 8K 360-degree dataset show that hierarchical analysis reuse significantly accelerates HEVC encoding with minimal rate-distortion impact across both ERP and CMP-tiled content. Encoding-time reductions are roughly 33%-59% for ERP and about 51% on average for CMP, with Bjontegaard Delta Encoding Time (BDET) gains approaching -50% and wall-clock speedups of up to 4.2x.
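One way to picture the difference between the two pipelines is as a map from each (resolution, QP) encode to the encode whose analysis it reuses. The sketch below is an interpretation, not the paper's specification: the abstract does not say how QPs are handled inside the cascade, so this assumes that only the highest-bitrate (lowest-QP) encode of each resolution crosses resolutions and that all other QPs reuse from that encode. The names `plan_references` and `RESOLUTIONS` are hypothetical.

```python
RESOLUTIONS = ["HD", "4K", "8K"]  # ascending order, as in the cascade


def plan_references(qps: list[int], scheme: str) -> dict:
    """Map each (resolution, qp) encode to the encode it reuses analysis from.

    A value of None marks a standalone reference encode.
    """
    if scheme not in ("cascade", "anchored"):
        raise ValueError("scheme must be 'cascade' or 'anchored'")
    qps = sorted(qps)
    anchor_qp = qps[0]  # lowest QP corresponds to the highest bitrate
    plan = {}
    for i, res in enumerate(RESOLUTIONS):
        for qp in qps:
            if qp != anchor_qp:
                # Assumption: dependent encodes reuse from their own
                # resolution's highest-bitrate encode.
                plan[(res, qp)] = (res, anchor_qp)
            elif scheme == "cascade" and i > 0:
                # Cascade: the anchor reuses scaled analysis from the
                # next lower resolution.
                plan[(res, qp)] = (RESOLUTIONS[i - 1], anchor_qp)
            else:
                # Resolution-anchored scheme, or the HD root of the cascade.
                plan[(res, qp)] = None
    return plan
```

Under these assumptions the cascade has a single standalone encode (HD at the lowest QP), whereas the resolution-anchored scheme has one per resolution, trading extra full-analysis encodes for independence between resolutions.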