Muhammad Uzair

Deepfake Audio Detection Using Self-supervised Fusion Representations

May 05, 2026

Khalid Zaman, Qixuan Huang, Muhammad Uzair, Masashi Unoki

Abstract: This paper describes a submission to the Environment-Aware Speech and Sound Deepfake Detection Challenge (ESDD2) 2026, which addresses component-level deepfake detection using the CompSpoofV2 dataset, where speech and environmental sounds may be independently manipulated. To address this challenge, a dual-branch deepfake detection framework is proposed to jointly model speech and environmental contextual representations from input audio. Two pretrained models, XLS-R for speech and BEATs for environmental sound, are used to extract complementary contextual representations. A Matching Head is introduced to model representation differences through statistical normalization and representation interaction, enabling estimation of the original class. In parallel, multi-head cross-attention enables effective information exchange between speech and environmental components. The refined representations are processed with residual connections and layer normalization, and passed to an AASIST classifier to predict speech-based and environment-based spoofing probabilities. The model outputs original, speech, and environment predictions. On the test set, the proposed system achieves an F1-score of 70.20% and an environmental EER of 16.54%, outperforming the baseline system.
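
The abstract reports an equal error rate (EER) without defining it. As a rough illustration only, the sketch below computes an approximate EER from detection scores: it sweeps thresholds and returns the error rate at the point where the false acceptance rate and false rejection rate are closest. The function name `equal_error_rate`, the convention that label 1 is the positive class with higher scores meaning "more positive", and the simple threshold sweep are all assumptions made here; the challenge's official scoring procedure may differ.

```python
def equal_error_rate(scores, labels):
    """Approximate EER for binary detection scores.

    Assumed convention (not from the paper): label 1 is the positive class,
    label 0 the negative class, and a higher score means "more positive".
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")

    best_gap, eer = None, None
    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is also considered.
    for threshold in sorted(set(scores)) + [max(scores) + 1.0]:
        # False acceptance: a negative scored at or above the threshold.
        far = sum(s >= threshold for s in negatives) / len(negatives)
        # False rejection: a positive scored below the threshold.
        frr = sum(s < threshold for s in positives) / len(positives)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical, not data from the paper).
print(equal_error_rate([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

With few samples the two error curves rarely cross exactly, which is why the sketch averages the two rates at the closest point rather than interpolating.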


A Survey of Hand Crafted and Deep Learning Methods for Image Aesthetic Assessment

Mar 22, 2021

Saira Kanwal, Muhammad Uzair, Habib Ullah

Abstract: Automatic image aesthetics assessment is a computer vision problem that deals with the categorization of images into different aesthetic levels. The categorization is usually done by analyzing an input image and computing some measure of the degree to which the image adheres to the key principles of photography (balance, rhythm, harmony, contrast, unity, look, feel, tone and texture). Owing to its diverse applications in many areas, automatic image aesthetics assessment has gained significant research attention in recent years. This paper presents a literature review of recent techniques for automatic image aesthetics assessment. A large number of traditional hand-crafted and deep-learning-based approaches are reviewed. Key aspects of the problem are discussed, such as why some features or models perform better than others and what their limitations are. A comparison of the quantitative results of different methods is also provided at the end.


Anomalous entities detection using a cascade of deep learning models

Mar 09, 2021

Hamza Riaz, Muhammad Uzair, Habib Ullah

Abstract: Human actions that do not conform to usual behavior are considered anomalous, and such actors are called anomalous entities. Detection of anomalous entities using visual data is a challenging problem in computer vision. This paper presents a new approach to detect anomalous entities in the complex setting of examination halls. The proposed method uses a cascade of deep convolutional neural network models. In the first stage, we apply a pretrained human pose estimation model to video frames to extract key feature points of the body. Patches extracted around each key point are used in the second stage to build a densely connected deep convolutional neural network model for detecting anomalous entities. For the experiments, we collect a video database of students taking an examination in a hall. Our results show that the proposed method can detect anomalous entities and flag unusual behavior with high accuracy.
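
The hand-off between the two stages of the cascade, cropping patches around pose key points, can be sketched in a few lines. This is a minimal illustration under assumptions that do not come from the abstract: the frame is a 2D list of pixel values, key points are `(row, col)` pairs, the patch size is fixed, and windows near the border are shifted inward rather than padded. The helper names `crop_patch` and `patches_for_keypoints` are hypothetical, and neither the pose estimator nor the second-stage network is shown.

```python
def crop_patch(image, center, size):
    """Return a size x size patch around (row, col), kept inside the frame.

    Assumption: windows that would cross the border are shifted inward
    rather than padded; the abstract does not say how borders are handled.
    """
    height, width = len(image), len(image[0])
    half = size // 2
    top = min(max(center[0] - half, 0), max(height - size, 0))
    left = min(max(center[1] - half, 0), max(width - size, 0))
    return [row[left:left + size] for row in image[top:top + size]]


def patches_for_keypoints(image, keypoints, size=3):
    """Crop one patch per key point; these would feed the second-stage model."""
    return [crop_patch(image, keypoint, size) for keypoint in keypoints]


# Toy 5x5 "frame" whose pixel value encodes its position (row * 5 + col).
frame = [[r * 5 + c for c in range(5)] for r in range(5)]
for patch in patches_for_keypoints(frame, [(2, 2), (0, 0)]):
    print(patch)
```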


Representation Learning with Deep Extreme Learning Machines for Efficient Image Set Classification

Apr 01, 2015

Muhammad Uzair, Faisal Shafait, Bernard Ghanem, Ajmal Mian

Abstract: Efficient and accurate joint representation of a collection of images that belong to the same class is a major research challenge for practical image set classification. Existing methods either make prior assumptions about the data structure or perform heavy computations to learn structure from the data itself. In this paper, we propose an efficient image set representation that does not make any prior assumptions about the structure of the underlying data. We learn the non-linear structure of image sets with Deep Extreme Learning Machines (DELM), which are very efficient and generalize well even with a limited number of training samples. Extensive experiments on a broad range of public datasets for image set classification (Honda/UCSD, CMU Mobo, YouTube Celebrities, Celebrity-1000, ETH-80) show that the proposed algorithm consistently outperforms state-of-the-art image set classification methods in terms of both speed and accuracy.
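
For readers unfamiliar with extreme learning machines, the sketch below shows the basic single-hidden-layer idea: the hidden weights are drawn at random and left untrained, and only the output weights are fitted, here by ridge-regularized least squares. It is not the paper's DELM and does not model image sets. The class name `TinyELM`, the tanh activation, the ridge value, and the XOR toy data are all assumptions chosen to keep the example small.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


class TinyELM:
    """Single-hidden-layer ELM sketch (hypothetical helper, not the paper's DELM)."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Hidden weights and biases are random and never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.ridge = ridge
        self.beta = [0.0] * n_hidden

    def hidden(self, x):
        return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, ts):
        h = [self.hidden(x) for x in xs]
        k = len(self.b)
        # Normal equations: (H^T H + ridge * I) beta = H^T t
        gram = [[sum(row[i] * row[j] for row in h) + (self.ridge if i == j else 0.0)
                 for j in range(k)] for i in range(k)]
        rhs = [sum(row[i] * t for row, t in zip(h, ts)) for i in range(k)]
        self.beta = solve_linear(gram, rhs)
        return self

    def predict(self, x):
        return sum(bi * hi for bi, hi in zip(self.beta, self.hidden(x)))


# Toy XOR data (illustrative only).
xs = [[0, 0], [0, 1], [1, 0], [1, 1]]
ts = [0.0, 1.0, 1.0, 0.0]
model = TinyELM(n_inputs=2, n_hidden=10).fit(xs, ts)
print([round(model.predict(x), 3) for x in xs])
```

Because training reduces to one linear solve, fitting is fast compared with iterative gradient-based training, which is the efficiency argument usually made for this family of models.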
