Linquan Wu

Imagine Before You Predict: Interleaved Latent Visual Reasoning for Video Event Prediction

Jun 04, 2026
Via arXiv

LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning

Jan 15, 2026
Via arXiv

HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks

Oct 16, 2024
Via arXiv