Tianle Cai

Incremental Comprehension of Garden-Path Sentences by Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention

May 25, 2024

SnapKV: LLM Knows What You are Looking for Before Generation

Apr 22, 2024

JetMoE: Reaching Llama2 Performance with 0.1M Dollars

Apr 11, 2024

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Mar 07, 2024

Accelerating Greedy Coordinate Gradient via Probe Sampling

Mar 02, 2024

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Feb 28, 2024

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Jan 19, 2024

REST: Retrieval-Based Speculative Decoding

Nov 14, 2023

Scaling In-Context Demonstrations with Structured Attention

Jul 05, 2023

Reward Collapse in Aligning Large Language Models

May 28, 2023