Dipendra Misra

Dataset Reset Policy Optimization for RLHF

Apr 16, 2024
Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Kianté Brantley, Dipendra Misra, Jason D. Lee, Wen Sun

Via arXiv

Provable Interactive Learning with Hindsight Instruction Feedback

Apr 14, 2024
Dipendra Misra, Aldo Pacchiano, Robert E. Schapire

Via arXiv

Towards Principled Representation Learning from Videos for Reinforcement Learning

Mar 20, 2024
Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford

Via arXiv

Policy Improvement using Language Feedback Models

Feb 25, 2024
Victor Zhong, Dipendra Misra, Xingdi Yuan, Marc-Alexandre Côté

Via arXiv

The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction

Dec 21, 2023
Pratyusha Sharma, Jordan T. Ash, Dipendra Misra

Via arXiv

LLF-Bench: Benchmark for Interactive Learning from Language Feedback

Dec 13, 2023
Ching-An Cheng, Andrey Kolobov, Dipendra Misra, Allen Nie, Adith Swaminathan

Via arXiv

Learning to Generate Better Than Your LLM

Jun 20, 2023
Jonathan D. Chang, Kianté Brantley, Rajkumar Ramamurthy, Dipendra Misra, Wen Sun

Via arXiv