Michael Santacroce

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

Apr 04, 2024
Corby Rosset, Ching-An Cheng, Arindam Mitra, Michael Santacroce, Ahmed Awadallah, Tengyang Xie

Via arXiv

Adapting LLM Agents Through Communication

Oct 10, 2023
Kuan Wang, Yadong Lu, Michael Santacroce, Yeyun Gong, Chao Zhang, Yelong Shen

Via arXiv

Efficient RLHF: Reducing the Memory Usage of PPO

Sep 01, 2023
Michael Santacroce, Yadong Lu, Han Yu, Yuanzhi Li, Yelong Shen

Via arXiv

What Matters In The Structured Pruning of Generative Language Models?

Feb 07, 2023
Michael Santacroce, Zixin Wen, Yelong Shen, Yuanzhi Li

Via arXiv