Bingyang Wu

LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism

Apr 15, 2024
Bingyang Wu, Shengyu Liu, Yinmin Zhong, Peng Sun, Xuanzhe Liu, Xin Jin

Via arXiv

A Survey of Resource-efficient LLM and Multimodal Foundation Models

Jan 16, 2024
Mengwei Xu, Wangsong Yin, Dongqi Cai, Rongjie Yi, Daliang Xu, Qipeng Wang, Bingyang Wu, Yihao Zhao, Chen Yang, Shihe Wang, Qiyang Zhang, Zhenyan Lu, Li Zhang, Shangguang Wang, Yuanchun Li, Yunxin Liu, Xin Jin, Xuanzhe Liu

Via arXiv

Fast Distributed Inference Serving for Large Language Models

May 10, 2023
Bingyang Wu, Yinmin Zhong, Zili Zhang, Gang Huang, Xuanzhe Liu, Xin Jin

Via arXiv