Pablo Delgado

DiffHDR: Re-Exposing LDR Videos with Video Diffusion Models

Apr 07, 2026

Zhengming Yu, Li Ma, Mingming He, Leo Isikdogan, Yuancheng Xu, Dmitriy Smirnov, Pablo Salamanca, Dao Mi, Pablo Delgado, Ning Yu(+4 more)

Abstract: Most digital videos are stored in 8-bit low dynamic range (LDR) formats, where much of the original high dynamic range (HDR) scene radiance is lost due to saturation and quantization. This loss of highlight and shadow detail precludes mapping accurate luminance to HDR displays and limits meaningful re-exposure in post-production workflows. Although techniques have been proposed to convert LDR images to HDR through dynamic range expansion, they struggle to restore realistic detail in the over- and underexposed regions. To address this, we present DiffHDR, a framework that formulates LDR-to-HDR conversion as a generative radiance inpainting task within the latent space of a video diffusion model. By operating in Log-Gamma color space, DiffHDR leverages spatio-temporal generative priors from a pretrained video diffusion model to synthesize plausible HDR radiance in over- and underexposed regions while recovering the continuous scene radiance of the quantized pixels. Our framework further enables controllable LDR-to-HDR video conversion guided by text prompts or reference images. To address the scarcity of paired HDR video data, we develop a pipeline that synthesizes high-quality HDR video training data from static HDRI maps. Extensive experiments demonstrate that DiffHDR significantly outperforms state-of-the-art approaches in radiance fidelity and temporal stability, producing realistic HDR videos with considerable latitude for re-exposure.

* Project page: https://yzmblog.github.io/projects/DiffHDR/
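The loss through saturation and quantization that the abstract describes can be illustrated with a minimal sketch. It is not the DiffHDR pipeline: the function name `expose_to_ldr`, the plain 2.2 gamma curve, and the sample radiance values are all assumptions chosen for illustration.

```python
def expose_to_ldr(radiance, stops=0.0, gamma=2.2):
    """Map linear scene radiance to an 8-bit code value (illustrative model only)."""
    scaled = radiance * (2.0 ** stops)       # exposure change in photographic stops
    clipped = min(max(scaled, 0.0), 1.0)     # saturation: anything above 1.0 is lost
    return round(255 * clipped ** (1.0 / gamma))  # gamma encode, then quantize to 8 bits

# Two highlights with different (hypothetical) linear radiance values.
highlights = [2.0, 4.0]

# At the original exposure both clip to the same 8-bit code.
ldr_codes = [expose_to_ldr(r) for r in highlights]

# With the HDR radiance available, lowering exposure by two stops separates them again.
reexposed = [expose_to_ldr(r, stops=-2.0) for r in highlights]

print(ldr_codes, reexposed)
```

Once both highlights share one 8-bit code, no per-pixel remapping of the LDR frame can tell them apart, which is why the missing radiance has to be synthesized rather than merely rescaled.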

Investigating the impact of stereo processing -- a study for extending the Open Dataset of Audio Quality (ODAQ)

Dec 16, 2025

Sascha Dick, Christoph Thompson, Chih-Wei Wu, Pablo Delgado, Phillip A. Williams, Matteo Torcoli

Abstract: In this paper, we present an initial study for extending the Open Dataset of Audio Quality (ODAQ) towards the impact of stereo processing. Monaural artifacts from ODAQ were adapted in combination with left-right (LR) and mid-side (MS) stereo processing, across stimuli including solo instruments, typical wide stereo mixes, and hard-panned mixes. Listening tests in different presentation contexts -- with and without direct comparison of MS and LR conditions -- were conducted to collect subjective data beyond monaural artifacts while also scrutinizing the listening test methodology. The ODAQ dataset is extended with new material along with subjective scores from 16 expert listeners. The listening test results show substantial influences of the stimuli's spatial characteristics as well as the presentation context. Notably, several significant disparities between LR and MS only occur when presented in direct comparison. The findings suggest that listeners primarily assess timbral impairments when spatial characteristics are consistent and focus on stereo image only when timbral quality is similar. The rating of an additional mono anchor was overall consistent across different stereo characteristics, averaging at 65 on the MUSHRA scale, further corroborating that listeners prioritize timbral over spatial impressions.

* Presented at the Audio Engineering Society (AES) 159th Convention, October 2025, Paper number 365, see https://aes2.org/publications/elibrary-page/?id=23039
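For readers unfamiliar with the two stereo representations named in the abstract, the following is a minimal sketch of the standard left-right to mid-side conversion and its inverse. The 1/2 scaling is one common convention and an assumption here, since the abstract does not state which scaling the study used; the function names and sample values are hypothetical.

```python
def lr_to_ms(left, right):
    """Convert left/right sample lists to mid/side (assumed 1/2 scaling)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_to_lr(mid, side):
    """Invert lr_to_ms: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# A hard-panned signal (hypothetical samples): all energy in the left channel.
left, right = [0.5, -0.25], [0.0, 0.0]
mid, side = lr_to_ms(left, right)

# A centered (mono) signal has an all-zero side channel.
mono_mid, mono_side = lr_to_ms([0.5, -0.25], [0.5, -0.25])

print(mid, side, mono_side)
```

In this sketch a hard-panned signal puts equal energy into mid and side, while a centered signal leaves the side channel empty. Because of this, an artifact applied to one MS channel lands in different parts of the stereo image depending on how the stimulus is panned.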

Expanding and Analyzing ODAQ -- the Open Dataset of Audio Quality

Apr 01, 2025

Sascha Dick, Christoph Thompson, Chih-Wei Wu, Matteo Torcoli, Pablo Delgado, Phillip A. Williams, Emanuel Habets

Abstract: The Open Dataset of Audio Quality (ODAQ) was recently introduced to address the scarcity of openly available audio datasets with corresponding subjective quality scores. The dataset, released under permissive licenses, comprises audio material processed using six different signal processing methods operating at five quality levels, along with corresponding subjective test results. To expand the dataset, we provided listener training to university students to conduct further subjective tests and obtained results consistent with previous expert listeners. We also showed how different training approaches affect the use of absolute scales and anchors. The expanded dataset now comprises results from three international laboratories providing a total of 42 listeners and 10080 subjective scores. This paper provides the details of the expansion and an in-depth analysis. As part of this analysis, we initiate the use of ODAQ as a benchmark to evaluate objective audio quality metrics in their ability to predict subjective scores.

* Accepted for presentation at the Audio Engineering Society (AES) 157th Convention, October 2024, New York, USA

InTune: Reinforcement Learning-based Data Pipeline Optimization for Deep Recommendation Models

Aug 13, 2023

Kabir Nagrecha, Lingyi Liu, Pablo Delgado, Prasanna Padmanabhan

Abstract: Deep learning-based recommender models (DLRMs) have become an essential component of many modern recommender systems. Several companies are now building large compute clusters reserved only for DLRM training, driving new interest in cost- and time-saving optimizations. The systems challenges faced in this setting are unique; while typical deep learning training jobs are dominated by model execution, the most important factor in DLRM training performance is often online data ingestion. In this paper, we explore the unique characteristics of this data ingestion problem and provide insights into DLRM training pipeline bottlenecks and challenges. We study real-world DLRM data processing pipelines taken from our compute cluster at Netflix to observe the performance impacts of online ingestion and to identify shortfalls in existing pipeline optimizers. We find that current tooling yields sub-optimal performance, causes frequent crashes, or requires impractical cluster re-organization to adopt. Our studies lead us to design and build a new solution for data pipeline optimization, InTune. InTune employs a reinforcement learning (RL) agent to learn how to distribute the CPU resources of a trainer machine across a DLRM data pipeline to more effectively parallelize data loading and improve throughput. Our experiments show that InTune can build an optimized data pipeline configuration within only a few minutes, and can easily be integrated into existing training workflows. By exploiting the responsiveness and adaptability of RL, InTune achieves higher online data ingestion rates than existing optimizers, thus reducing idle times in model execution and increasing efficiency. We apply InTune to our real-world cluster, and find that it increases data ingestion throughput by as much as 2.29X versus state-of-the-art data pipeline optimizers while also improving both CPU & GPU utilization.

* Accepted at RecSys 2023. 11 pages, 2 pages of references. 8 figures with 2 tables

Gradient Boosted Decision Tree Neural Network

Nov 05, 2019

Mohammad Saberian, Pablo Delgado, Yves Raimond

Abstract: In this paper we propose a method to build a neural network that is similar to an ensemble of decision trees. We first illustrate how to convert a learned ensemble of decision trees to a single neural network with one hidden layer and an input transformation. We then relax some properties of this network such as thresholds and activation functions to train an approximately equivalent decision tree ensemble. The final model, Hammock, is surprisingly simple: a fully connected two-layer neural network where the input is quantized and one-hot encoded. Experiments on large and small datasets show this simple method can achieve performance similar to that of Gradient Boosted Decision Trees.
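The architecture described in the last sentences of the abstract (quantized, one-hot encoded input feeding a fully connected two-layer network) might be sketched roughly as follows. This is not the authors' implementation: the bin edges, layer sizes, ReLU activation, and random untrained weights are all assumptions made for illustration, and the helper names are hypothetical.

```python
import bisect
import random

def quantize_one_hot(x, bin_edges):
    """One-hot encode each feature by the bin its value falls into."""
    encoded = []
    for value, edges in zip(x, bin_edges):
        one_hot = [0.0] * (len(edges) + 1)          # n edges define n + 1 bins
        one_hot[bisect.bisect_right(edges, value)] = 1.0
        encoded.extend(one_hot)
    return encoded

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(x, bin_edges, w1, b1, w2, b2):
    """Two-layer network on the encoded input (ReLU is an assumed activation)."""
    hidden = [max(0.0, h) for h in dense(quantize_one_hot(x, bin_edges), w1, b1)]
    return dense(hidden, w2, b2)

# Hypothetical setup: two features, with 3 and 2 bins, so 5 encoded inputs.
bin_edges = [[0.0, 1.0], [10.0]]
rng = random.Random(0)
n_in, n_hidden = 5, 4
w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)]]
b2 = [0.0]

encoded = quantize_one_hot([0.5, 12.0], bin_edges)
output = forward([0.5, 12.0], bin_edges, w1, b1, w2, b2)
print(encoded, output)
```

Binning each feature against fixed edges plays a role similar to the threshold tests at decision-tree splits, which gives some intuition for how such a network could approximate a tree ensemble.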
