Luis Ceze

University of Washington

Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Nov 07, 2023
Yilong Zhao, Chien-Yu Lin, Kan Zhu, Zihao Ye, Lequn Chen, Size Zheng, Luis Ceze, Arvind Krishnamurthy, Tianqi Chen, Baris Kasikci

Punica: Multi-Tenant LoRA Serving

Oct 28, 2023
Lequn Chen, Zihao Ye, Yongji Wu, Danyang Zhuo, Luis Ceze, Arvind Krishnamurthy

SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning

Jul 11, 2022
Zihao Ye, Ruihang Lai, Junru Shao, Tianqi Chen, Luis Ceze

Characterizing and Taming Resolution in Convolutional Neural Networks

Oct 28, 2021
Eddie Yan, Liang Luo, Luis Ceze

Cloud Collectives: Towards Cloud-aware Collectives for ML Workloads with Rank Reordering

May 28, 2021
Liang Luo, Jacob Nelson, Arvind Krishnamurthy, Luis Ceze

Accelerating SpMM Kernel with Cache-First Edge Sampling for Graph Neural Networks

Apr 23, 2021
Chien-Yu Lin, Liang Luo, Luis Ceze

Automated Backend-Aware Post-Training Quantization

Mar 27, 2021
Ziheng Jiang, Animesh Jain, Andrew Liu, Josh Fromm, Chengqian Ma, Tianqi Chen, Luis Ceze

Learning to Optimize Tensor Programs

Oct 27, 2018
Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

Automating Generation of Low Precision Deep Learning Operators

Oct 25, 2018
Meghan Cowan, Thierry Moreau, Tianqi Chen, Luis Ceze
