Chunrui Han

DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning

Jun 11, 2025
Via arXiv

Step1X-Edit: A Practical Framework for General Image Editing

Apr 24, 2025
Via arXiv

Perception-R1: Pioneering Perception Policy with Reinforcement Learning

Apr 10, 2025
Via arXiv

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Sep 03, 2024
Via arXiv

DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

Jun 24, 2024
Via arXiv

Focus Anywhere for Fine-grained Multi-page Document Understanding

May 23, 2024
Via arXiv

OneChart: Purify the Chart Structural Extraction via One Auxiliary Token

Apr 15, 2024
Via arXiv

ShapeLLM: Universal 3D Object Understanding for Embodied Interaction

Mar 06, 2024
Via arXiv

Small Language Model Meets with Reinforced Vision Vocabulary

Jan 23, 2024
Via arXiv

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

Dec 11, 2023
Via arXiv