Alert button

FFSplit: Split Feed-Forward Network For Optimizing Accuracy-Efficiency Trade-off in Language Model Inference

Jan 08, 2024
Zirui Liu, Qingquan Song, Qiang Charles Xiao, Sathiya Keerthi Selvaraj, Rahul Mazumder, Aman Gupta, Xia Hu

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: