Chengruidong Zhang

MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention

Jul 02, 2024

Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

May 30, 2024

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Apr 23, 2024

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Feb 21, 2024