Xuanzhe Liu

Empowering 1000 tokens/second on-device LLM prefilling with mllm-NPU

Jul 08, 2024
Via arXiv

RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation

Apr 18, 2024
Via arXiv

LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism

Apr 15, 2024
Via arXiv

Exploring the Impact of In-Browser Deep Learning Inference on Quality of User Experience and Performance

Feb 08, 2024
Via arXiv

A Survey of Resource-efficient LLM and Multimodal Foundation Models

Jan 16, 2024
Via arXiv

LLMCad: Fast and Scalable On-device Large Language Model Inference

Sep 08, 2023
Via arXiv

Dark-Skin Individuals Are at More Risk on the Street: Unmasking Fairness Issues of Autonomous Driving Systems

Aug 05, 2023
Via arXiv

Fast Distributed Inference Serving for Large Language Models

May 10, 2023
Via arXiv

A Comprehensive Benchmark of Deep Learning Libraries on Mobile Devices

Feb 14, 2022
Via arXiv

Emojis Predict Dropouts of Remote Workers: An Empirical Study of Emoji Usage on GitHub

Feb 10, 2021
Via arXiv