Yao Fu

Data Engineering for Scaling Language Models to 128K Context

Feb 15, 2024
Yao Fu, Rameswar Panda, Xinyao Niu, Xiang Yue, Hannaneh Hajishirzi, Yoon Kim, Hao Peng

Via arXiv

Critical Data Size of Language Models from a Grokking Perspective

Feb 06, 2024
Xuekai Zhu, Yao Fu, Bowen Zhou, Zhouhan Lin

Via arXiv

OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models

Jan 29, 2024
Fuzhao Xue, Zian Zheng, Yao Fu, Jinjie Ni, Zangwei Zheng, Wangchunshu Zhou, Yang You

Via arXiv

MoE-Infinity: Activation-Aware Expert Offloading for Efficient MoE Serving

Jan 25, 2024
Leyang Xue, Yao Fu, Zhan Lu, Luo Mai, Mahesh Marina

Via arXiv

ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models

Jan 25, 2024
Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai

Via arXiv

FiLM: Fill-in Language Models for Any-Order Generation

Oct 15, 2023
Tianxiao Shen, Hao Peng, Ruoqi Shen, Yao Fu, Zaid Harchaoui, Yejin Choi

Via arXiv

MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning

Oct 03, 2023
Xiang Yue, Xingwei Qu, Ge Zhang, Yao Fu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen

Via arXiv

Go Beyond Imagination: Maximizing Episodic Reachability with World Models

Aug 25, 2023
Yao Fu, Run Peng, Honglak Lee

Via arXiv