Bhiksha Raj

Language Technologies Institute, Carnegie Mellon University, Mohammed bin Zayed University of AI

CoLMbo: Speaker Language Model for Descriptive Profiling

Jun 11, 2025

Total-Editing: Head Avatar with Editable Appearance, Motion, and Lighting

May 26, 2025

CAARMA: Class Augmentation with Adversarial Mixup Regularization

Mar 20, 2025

Robust Latent Matters: Boosting Image Generation with Sampling Error

Mar 11, 2025

Mellow: a small audio language model for reasoning

Mar 11, 2025

On the Robust Approximation of ASR Metrics

Feb 18, 2025

Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models

Feb 18, 2025

ADIFF: Explaining audio difference using natural language

Feb 06, 2025

Masked Autoencoders Are Effective Tokenizers for Diffusion Models

Feb 05, 2025

Scalable Benchmarking and Robust Learning for Noise-Free Ego-Motion and 3D Reconstruction from Noisy Video

Jan 24, 2025