Corby Rosset

Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF

May 31, 2024

MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels

May 13, 2024

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Apr 23, 2024

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

Apr 04, 2024

Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents

Feb 27, 2024

Orca-Math: Unlocking the potential of SLMs in Grade School Math

Feb 16, 2024

Axiomatic Preference Modeling for Longform Question Answering

Dec 02, 2023

Orca 2: Teaching Small Language Models How to Reason

Nov 21, 2023

Overview of the TREC 2023 Product Product Search Track

Nov 15, 2023

Contrastive Post-training Large Language Models on Data Curriculum

Oct 03, 2023