
Tianle Cai

Accelerating Greedy Coordinate Gradient via Probe Sampling

Mar 02, 2024

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Feb 28, 2024

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Jan 19, 2024

REST: Retrieval-Based Speculative Decoding

Nov 14, 2023

Scaling In-Context Demonstrations with Structured Attention

Jul 05, 2023

Reward Collapse in Aligning Large Language Models

May 28, 2023

Large Language Models as Tool Makers

May 26, 2023

What Makes Convolutional Models Great on Long Sequence Modeling?

Oct 17, 2022

Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive Privacy Analysis and Beyond

Jul 19, 2022

Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding

Jun 23, 2021