Girish

Bridging Attribution and Open-Set Detection using Graph-Augmented Instance Learning in Synthetic Speech

Jan 11, 2026
Via arXiv

DIVINE: Coordinating Multimodal Disentangled Representations for Oro-Facial Neurological Disorder Assessment

Jan 11, 2026
Via arXiv

Curved Worlds, Clear Boundaries: Generalizing Speech Deepfake Detection using Hyperbolic and Spherical Geometry Spaces

Nov 13, 2025
Via arXiv

Towards Attribution of Generators and Emotional Manipulation in Cross-Lingual Synthetic Speech using Geometric Learning

Nov 13, 2025
Via arXiv

Are Multimodal Foundation Models All That Is Needed for Emofake Detection?

Sep 19, 2025
Via arXiv

Rethinking Cross-Corpus Speech Emotion Recognition Benchmarking: Are Paralinguistic Pre-Trained Representations Sufficient?

Sep 19, 2025
Via arXiv

Towards Neural Audio Codec Source Parsing

Jun 14, 2025
Via arXiv

Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals

Oct 16, 2024
Via arXiv

Representation Loss Minimization with Randomized Selection Strategy for Efficient Environmental Fake Audio Detection

Sep 24, 2024
Via arXiv

Strong Alone, Stronger Together: Synergizing Modality-Binding Foundation Models with Optimal Transport for Non-Verbal Emotion Recognition

Sep 21, 2024
Via arXiv