Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models

Jan 16, 2024
Shuming Shi, Enbo Zhao, Deng Cai, Leyang Cui, Xinting Huang, Huayang Li
