Chenhui Li

VizDefender: Unmasking Visualization Tampering through Proactive Localization and Intent Inference

Dec 21, 2025

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

Dec 18, 2025

MPJudge: Towards Perceptual Assessment of Music-Induced Paintings

Nov 10, 2025

IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards

Aug 06, 2025

Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning

Jun 12, 2025

Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR

Apr 16, 2025

OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation

Feb 25, 2025

Open-Vocabulary Octree-Graph for 3D Scene Understanding

Nov 25, 2024

ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model

Nov 04, 2024

PPRSteg: Printing and Photography Robust QR Code Steganography via Attention Flow-Based Model

May 26, 2024