Sambit Sahu

Leveraging Parameter Space Symmetries for Reasoning Skill Transfer in LLMs

Nov 13, 2025

SPEAR-MM: Selective Parameter Evaluation and Restoration via Model Merging for Efficient Financial LLM Adaptation

Nov 11, 2025

Optimizing Reasoning Efficiency through Prompt Difficulty Prediction

Nov 05, 2025

T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning

May 22, 2025

Critique-Guided Distillation: Improving Supervised Fine-tuning via Better Distillation

May 16, 2025

Continual Pre-training of MoEs: How robust is your router?

Mar 06, 2025

RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization

Oct 05, 2024

Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey

Sep 17, 2024