Furong Huang

MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning

Jun 05, 2025
Via arXiv

Does Thinking More always Help? Understanding Test-Time Scaling in Reasoning Models

Jun 04, 2025
Via arXiv

Zero-Shot Vision Encoder Grafting via LLM Surrogates

May 28, 2025
Via arXiv

EnsemW2S: Enhancing Weak-to-Strong Generalization with Large Language Model Ensembles

May 28, 2025
Via arXiv

Effort-aware Fairness: Incorporating a Philosophy-informed, Human-centered Notion of Effort into Algorithmic Fairness Metrics

May 25, 2025
Via arXiv

DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data

May 21, 2025
Via arXiv

FLARE: Robot Learning with Implicit World Modeling

May 21, 2025
Via arXiv

Imagine, Verify, Execute: Memory-Guided Agentic Exploration with Vision-Language Models

May 12, 2025
Via arXiv

AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security

Apr 29, 2025
Via arXiv

SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement

Apr 10, 2025
Via arXiv