Ricard Marxer

DYNI

On the Use of Self-Supervised Representation Learning for Speaker Diarization and Separation

Dec 17, 2025

SDialog: A Python Toolkit for End-to-End Agent Building, User Simulation, Dialog Generation, and Evaluation

Dec 12, 2025

Crossing the Species Divide: Transfer Learning from Speech to Animal Sounds

Sep 04, 2025

Depth Jitter: Seeing through the Depth

Aug 08, 2025

Factorized RVQ-GAN For Disentangled Speech Tokenization

Jun 18, 2025

Discrete Audio Tokens: More Than a Survey!

Jun 12, 2025

Aligning Multimodal Representations through an Information Bottleneck

Jun 05, 2025

Text-Speech Language Models with Improved Cross-Modal Transfer by Aligning Abstraction Levels

Mar 08, 2025

TalTech-IRIT-LIS Speaker and Language Diarization Systems for DISPLACE 2024

Jul 17, 2024

Transfer Learning from Whisper for Microscopic Intelligibility Prediction

Apr 02, 2024