Rohan Deb

FisherSFT: Data-Efficient Supervised Fine-Tuning of Language Models Using Information Gain

May 20, 2025
Via arXiv

Conservative Contextual Bandits: Beyond Linear Representations

Dec 09, 2024
Via arXiv

Think Before You Duel: Understanding Complexities of Preference Learning under Constrained Resources

Dec 28, 2023
Via arXiv

Contextual Bandits with Online Neural Regression

Dec 12, 2023
Via arXiv

Schedule Based Temporal Difference Algorithms

Nov 23, 2021
Via arXiv

Gradient Temporal Difference with Momentum: Stability and Convergence

Nov 22, 2021
Via arXiv

Does Momentum Help? A Sample Complexity Analysis

Oct 29, 2021
Via arXiv