Jakaria Islam Emon

WhisQ: Cross-Modal Representation Learning for Text-to-Music MOS Prediction

Jun 06, 2025
Via arXiv

Whisper Speaker Identification: Leveraging Pre-Trained Multilingual Transformers for Robust Speaker Embeddings

Mar 13, 2025
Via arXiv

Detecting the Undetectable: Combining Kolmogorov-Arnold Networks and MLP for AI-Generated Image Detection

Aug 18, 2024
Via arXiv