Lu Hou

Huawei Noah's Ark Lab

DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models

May 31, 2024

OAC: Output-adaptive Calibration for Accurate Post-training Quantization

May 23, 2024

Towards Multimodal Video Paragraph Captioning Models Robust to Missing Modality

Mar 28, 2024

Visually Guided Generative Text-Layout Pre-training for Document Intelligence

Mar 27, 2024

MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric

Mar 12, 2024

IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact

Mar 02, 2024

TempCompass: Do Video LLMs Really Understand Videos?

Mar 01, 2024

Extending Context Window of Large Language Models via Semantic Compression

Dec 15, 2023

TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Dec 04, 2023

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

Nov 29, 2023