Bingyang Wu

StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation

Apr 22, 2025
Via arXiv

LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism

Apr 15, 2024
Via arXiv

A Survey of Resource-efficient LLM and Multimodal Foundation Models

Jan 16, 2024
Via arXiv

Fast Distributed Inference Serving for Large Language Models

May 10, 2023
Via arXiv