Yuyi Zhang

Doc-V*: Coarse-to-Fine Interactive Visual Reasoning for Multi-Page Document VQA

Apr 15, 2026
Via arXiv

DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding

Apr 14, 2026
Via arXiv

NTIRE 2026 Challenge on Single Image Reflection Removal in the Wild: Datasets, Results, and Methods

Apr 11, 2026
Via arXiv

PosterVerse: A Full-Workflow Framework for Commercial-Grade Poster Generation with HTML-Based Scalable Typography

Jan 07, 2026
Via arXiv

Do Latent Tokens Think? A Causal and Adversarial Analysis of Chain-of-Continuous-Thought

Dec 25, 2025
Via arXiv

Quantize More, Lose Less: Autoregressive Generation from Residually Quantized Speech Representations

Jul 16, 2025
Via arXiv

MCCD: A Multi-Attribute Chinese Calligraphy Character Dataset Annotated with Script Styles, Dynasties, and Calligraphers

Jul 09, 2025
Via arXiv

MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Categories

Jun 05, 2025
Via arXiv

NTIRE 2025 Challenge on Image Super-Resolution (×4): Methods and Results

Apr 20, 2025
Via arXiv

Predicting the Original Appearance of Damaged Historical Documents

Dec 16, 2024
Via arXiv