Kazi Tamanna Alam

WhisQ: Cross-Modal Representation Learning for Text-to-Music MOS Prediction

Jun 06, 2025
Via arXiv

Whisper Speaker Identification: Leveraging Pre-Trained Multilingual Transformers for Robust Speaker Embeddings

Mar 13, 2025
Via arXiv