Furong Huang

MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Model for Embodied Task Planning

Dec 18, 2025

Lemon: A Unified and Scalable 3D Multimodal Model for Universal Spatial Understanding

Dec 14, 2025

Hold Onto That Thought: Assessing KV Cache Compression On Reasoning

Dec 12, 2025

Towards Mitigating Hallucinations in Large Vision-Language Models by Refining Textual Embeddings

Nov 07, 2025

MIRA: Towards Mitigating Reward Hacking in Inference-Time Alignment of T2I Diffusion Models

Oct 02, 2025

Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning

Jul 22, 2025

Reward Models Can Improve Themselves: Reward-Guided Adversarial Failure Mode Discovery for Robust Reward Modeling

Jul 08, 2025

ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs

Jun 11, 2025

MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning

Jun 05, 2025

Does Thinking More always Help? Understanding Test-Time Scaling in Reasoning Models

Jun 04, 2025