Ruihao Gong

ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models

Jun 13, 2024

Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection

May 10, 2024

LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models

May 09, 2024

Fast and Controllable Post-training Sparsity: Learning Optimal Sparsity Allocation with Global Constraint in Minutes

May 09, 2024

2023 Low-Power Computer Vision Challenge (LPCVC) Summary

Mar 11, 2024

ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding

Feb 21, 2024

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Nov 27, 2023

QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models

Oct 12, 2023

Lossy and Lossless Post-training Model Size Compression

Aug 08, 2023

SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency

Jul 01, 2023