
Song Han

Massachusetts Institute of Technology

Machine learning's own Industrial Revolution (Nov 04, 2023)

PockEngine: Sparse and Efficient Fine-tuning in a Pocket (Oct 26, 2023)

Efficient Streaming Language Models with Attention Sinks (Sep 29, 2023)
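
The core mechanism of this paper is keeping a few initial "attention sink" tokens plus a sliding window of recent tokens in the KV cache, evicting everything in between. A minimal, self-contained sketch of that eviction policy follows; the class and variable names are illustrative, not the authors' code:

```python
from collections import deque

class SinkWindowKVCache:
    """Toy KV-cache eviction in the spirit of attention sinks:
    always retain the first `n_sink` tokens and the most recent
    `window` tokens; evict everything in between."""

    def __init__(self, n_sink: int = 4, window: int = 1024):
        self.n_sink = n_sink
        self.sink = []                       # KV entries for the first few tokens
        self.recent = deque(maxlen=window)   # sliding window of recent KV entries

    def append(self, kv_entry):
        if len(self.sink) < self.n_sink:
            self.sink.append(kv_entry)
        else:
            self.recent.append(kv_entry)     # deque evicts the oldest automatically

    def context(self):
        # The attention context is sink tokens + recent window, in order.
        return self.sink + list(self.recent)

cache = SinkWindowKVCache(n_sink=4, window=8)
for t in range(20):
    cache.append(f"kv_{t}")
print(cache.context())  # kv_0..kv_3 (sinks) then kv_12..kv_19 (window)
```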

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models (Sep 21, 2023)
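
LongLoRA builds on LoRA-style low-rank adapters (combined with shifted sparse attention and trainable embedding/norm layers). The generic LoRA building block it relies on can be sketched as below; this is the standard low-rank update, not the paper's implementation, and all names are illustrative:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA adapter: the frozen weight W is augmented with a
    trainable low-rank update (alpha / r) * B @ A, so only
    r * (d_in + d_out) parameters are trained instead of d_in * d_out."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)          # freeze the pretrained layer
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(nn.Linear(512, 512), r=8)
y = layer(torch.randn(2, 512))               # (2, 512)
```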

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration (Jun 01, 2023)
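
AWQ's key observation is that the salient weight channels are the ones multiplied by large activations, so it scales those channels up before low-bit weight quantization and folds the inverse scale into the activations. The toy sketch below illustrates that idea on random data; the scaling heuristic and names are simplified assumptions, not the paper's search procedure:

```python
import torch

def rtn_quant(w, n_bits=4):
    """Plain per-output-channel symmetric round-to-nearest quantization."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = w.abs().amax(dim=1, keepdim=True) / qmax
    return (w / scale).round().clamp(-qmax - 1, qmax) * scale

def awq_style_quant(w, act_mag, alpha=0.5, n_bits=4):
    """Toy activation-aware quantization: scale up input channels whose
    activations are large (salient) before quantizing; dividing by s
    afterwards is equivalent to folding 1/s into the activations."""
    s = (act_mag.clamp(min=1e-5) ** alpha)
    s = s / s.mean()                      # keep scales around 1
    return rtn_quant(w * s, n_bits) / s   # quantize scaled weight, undo scale

torch.manual_seed(0)
w = torch.randn(64, 64)                                    # (out, in)
x = torch.randn(256, 64) * torch.linspace(0.1, 5.0, 64)    # outlier input channels
act_mag = x.abs().mean(dim=0)

err_plain = ((x @ rtn_quant(w).T) - (x @ w.T)).pow(2).mean().item()
err_aware = ((x @ awq_style_quant(w, act_mag).T) - (x @ w.T)).pow(2).mean().item()
print(f"RTN MSE: {err_plain:.5f}  activation-aware MSE: {err_aware:.5f}")
```

On toy data like this, with a few dominant activation channels, the activation-aware variant usually yields a lower output error than plain round-to-nearest.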

FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention (May 21, 2023)

SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer (Mar 30, 2023)

Offsite-Tuning: Transfer Learning without Full Model (Feb 09, 2023)

FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer (Jan 20, 2023)

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models (Nov 28, 2022)
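
SmoothQuant migrates activation outliers into the weights with a per-input-channel scale s_j = max|X_j|^alpha / max|W_j|^(1 - alpha), so that X/s and s*W are both easier to quantize to INT8 while the product is unchanged. A minimal sketch of that smoothing step, with illustrative names:

```python
import torch

def smoothquant_scales(x, w, alpha=0.5):
    """Per-input-channel smoothing factors from SmoothQuant:
    s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)."""
    x_max = x.abs().amax(dim=0).clamp(min=1e-5)   # activation range per channel
    w_max = w.abs().amax(dim=1).clamp(min=1e-5)   # weight range per input channel
    return x_max ** alpha / w_max ** (1 - alpha)

torch.manual_seed(0)
x = torch.randn(128, 8); x[:, 3] *= 50            # channel 3 is an outlier
w = torch.randn(8, 16)                            # (in, out)

s = smoothquant_scales(x, w)
x_s, w_s = x / s, w * s[:, None]                  # (X / s) @ (diag(s) W) == X @ W

assert torch.allclose(x @ w, x_s @ w_s, atol=1e-3)
print(x.abs().amax(dim=0))     # outlier dominates the activation ranges
print(x_s.abs().amax(dim=0))   # ranges flattened, easier to quantize
```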