Dario Allegra

Department of Mathematics and Computer Science, University of Catania, Italy

End-to-end Audio Deepfake Detection from RAW Waveforms: a RawNet-Based Approach with Cross-Dataset Evaluation

Apr 30, 2025

Andrea Di Pierno, Luca Guarnera, Dario Allegra, Sebastiano Battiato

Abstract: Audio deepfakes represent a growing threat to digital security and trust, leveraging advanced generative models to produce synthetic speech that closely mimics real human voices. Detecting such manipulations is especially challenging under open-world conditions, where spoofing methods encountered during testing may differ from those seen during training. In this work, we propose an end-to-end deep learning framework for audio deepfake detection that operates directly on raw waveforms. Our model, RawNetLite, is a lightweight convolutional-recurrent architecture designed to capture both spectral and temporal features without handcrafted preprocessing. To enhance robustness, we introduce a training strategy that combines data from multiple domains and adopts Focal Loss to emphasize difficult or ambiguous samples. We further demonstrate that incorporating codec-based manipulations and applying waveform-level audio augmentations (e.g., pitch shifting, noise, and time stretching) leads to significant generalization improvements under realistic acoustic conditions. The proposed model achieves over 99.7% F1 and 0.25% EER on in-domain data (FakeOrReal), and up to 83.4% F1 with 16.4% EER on a challenging out-of-distribution test set (AVSpoof2021 + CodecFake). These findings highlight the importance of diverse training data, tailored objective functions and audio augmentations in building resilient and generalizable audio forgery detectors. Code and pretrained models are available at https://iplab.dmi.unict.it/mfs/Deepfakes/PaperRawNet2025/.
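
The abstract above mentions Focal Loss as a way to emphasize difficult samples. As a rough illustration only (not the authors' implementation), the standard binary focal loss can be sketched as below. The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions taken from the commonly cited formulation; the abstract does not state which values the paper uses.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one sample (illustrative sketch, hypothetical helper).

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 0 or 1
    gamma, alpha: assumed defaults, not values taken from the paper
    """
    # Clamp to avoid log(0).
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = binary_focal_loss(0.9, 1)  # confident and correct
    hard = binary_focal_loss(0.1, 1)  # confident and wrong
    print(f"easy sample loss: {easy:.6f}")
    print(f"hard sample loss: {hard:.6f}")
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is one way to see that `gamma` controls how strongly easy samples are down-weighted.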

Animated GIF optimization by adaptive color local table management

Jul 09, 2020

Oliver Giudice, Dario Allegra, Francesco Guarnera, Filippo Stanco, Sebastiano Battiato

Abstract: Thirty years after its introduction, the GIF file format is more popular than ever, serving as a common means of communication among friends and communities on instant messengers and social networks. Despite this popularity, the original compression method used to encode GIF images has not changed. At the same time, popularity means that storage savings become an issue for hosting platforms. In this paper, a parametric optimization technique for animated GIFs is presented. The proposed technique is based on Local Color Table selection and color remapping in order to create optimized animated GIFs while preserving the original format. The technique achieves good results in terms of byte reduction with limited or no loss of perceived color quality. Tests carried out on 1000 GIF files demonstrate the effectiveness of the proposed optimization strategy.

A Multi-Task Learning Approach for Meal Assessment

Jun 27, 2018

Ya Lu, Dario Allegra, Marios Anthimopoulos, Filippo Stanco, Giovanni Maria Farinella, Stavroula Mougiakakou

Abstract: Balanced nutrition, together with a proper diet, plays a key role in the prevention of diet-related chronic diseases. Conventional dietary assessment methods are time-consuming, expensive and prone to errors. New technology-based methods that provide reliable and convenient dietary assessment have emerged during the last decade. Advances in the field of computer vision have permitted the use of meal images to assess nutrient content, usually through three steps: food segmentation, recognition and volume estimation. In this paper, we propose using a single RGB meal image as input to a multi-task learning based Convolutional Neural Network (CNN). The proposed approach achieved outstanding performance, while a comparison with state-of-the-art methods indicated that the proposed approach exhibits a clear advantage in accuracy, along with a massive reduction in processing time.
