Haoran Wei

Qwen2 Technical Report

Jul 16, 2024

Focus Anywhere for Fine-grained Multi-page Document Understanding

May 23, 2024

On the Adversarial Robustness of Learning-based Image Compression Against Rate-Distortion Attacks

May 13, 2024

OneChart: Purify the Chart Structural Extraction via One Auxiliary Token

Apr 15, 2024

MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs

Feb 23, 2024

Small Language Model Meets with Reinforced Vision Vocabulary

Jan 23, 2024

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

Dec 11, 2023

Merlin:Empowering Multimodal LLMs with Foresight Minds

Nov 30, 2023

Autoencoder with Group-based Decoder and Multi-task Optimization for Anomalous Sound Detection

Nov 15, 2023

DreamLLM: Synergistic Multimodal Comprehension and Creation

Sep 20, 2023