Junzhe Zhang

Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning

May 18, 2025

Automatic Reward Shaping from Confounded Offline Data

May 16, 2025

Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning

May 12, 2025

C-FAITH: A Chinese Fine-Grained Benchmark for Automated Hallucination Evaluation

Apr 14, 2025

Causally Aligned Curriculum Learning

Mar 21, 2025

MC-MKE: A Fine-Grained Multimodal Knowledge Editing Benchmark Emphasizing Modality Consistency

Jun 19, 2024

Evaluating and Mitigating Number Hallucinations in Large Vision-Language Models: A Consistency Perspective

Mar 03, 2024

Entity-Aware Multimodal Alignment Framework for News Image Captioning

Feb 29, 2024

DeformToon3D: Deformable 3D Toonification from Neural Radiance Fields

Sep 08, 2023

Variational Relational Point Completion Network for Robust 3D Classification

Apr 18, 2023