Byeongwook Kim

No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization

Feb 28, 2024
June Yong Yang, Byeongwook Kim, Jeongin Bae, Beomseok Kwon, Gunho Park, Eunho Yang, Se Jung Kwon, Dongsoo Lee

DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation

Feb 27, 2024
Sunghyeon Woo, Baeseong Park, Byeongwook Kim, Minjung Jo, Sejung Kwon, Dongsuk Jeon, Dongsoo Lee

Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models

Sep 27, 2023
Jung Hwan Heo, Jeonghoon Kim, Beomseok Kwon, Byeongwook Kim, Se Jung Kwon, Dongsoo Lee

AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models

Oct 08, 2022
Se Jung Kwon, Jeonghoon Kim, Jeongin Bae, Kang Min Yoo, Jin-Hwa Kim, Baeseong Park, Byeongwook Kim, Jung-Woo Ha, Nako Sung, Dongsoo Lee

nuQmm: Quantized MatMul for Efficient Inference of Large-Scale Generative Language Models

Jun 20, 2022
Gunho Park, Baeseong Park, Se Jung Kwon, Byeongwook Kim, Youngjoo Lee, Dongsoo Lee

Modulating Regularization Frequency for Efficient Compression-Aware Model Training

May 05, 2021
Dongsoo Lee, Se Jung Kwon, Byeongwook Kim, Jeongin Yun, Baeseong Park, Yongkweon Jeon

Sequential Encryption of Sparse Neural Networks Toward Optimum Representation of Irregular Sparsity

May 05, 2021
Baeseong Park, Se Jung Kwon, Dongsoo Lee, Daehwan Oh, Byeongwook Kim, Yongkweon Jeon, Yeonju Ro

Q-Rater: Non-Convex Optimization for Post-Training Uniform Quantization

May 05, 2021
Byeongwook Kim, Dongsoo Lee, Yeonju Ro, Yongkweon Jeon, Se Jung Kwon, Baeseong Park, Daehwan Oh

Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation

Oct 13, 2020
Insoo Chung, Byeongwook Kim, Yoonjung Choi, Se Jung Kwon, Yongkweon Jeon, Baeseong Park, Sangha Kim, Dongsoo Lee
