Navonil Majumder

JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment

Jul 28, 2025
Via arXiv

Lessons from Training Grounded LLMs with Verifiable Rewards

Jun 18, 2025
Via arXiv

NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks

Apr 28, 2025
Via arXiv

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization

Dec 30, 2024
Via arXiv

Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

Sep 17, 2024
Via arXiv

Reward Steering with Evolutionary Heuristics for Decoding-time Alignment

Jun 25, 2024
Via arXiv

Improving Text-To-Audio Models with Synthetic Captions

Jun 18, 2024
Via arXiv

Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization

Apr 16, 2024
Via arXiv

Stuck in the Quicksand of Numeracy, Far from AGI Summit: Evaluating LLMs' Mathematical Competency through Ontology-guided Perturbations

Jan 17, 2024
Via arXiv

Mustango: Toward Controllable Text-to-Music Generation

Nov 14, 2023
Via arXiv