Huaibo Huang

MVPBench: A Multi-Video Perception Evaluation Benchmark for Multi-Modal Video Understanding

Mar 24, 2026

Think 360°: Evaluating the Width-centric Reasoning Capability of MLLMs Beyond Depth

Mar 24, 2026

Tuning Real-World Image Restoration at Inference: A Test-Time Scaling Paradigm for Flow Matching Models

Mar 23, 2026

GenVideoLens: Where LVLMs Fall Short in AI-Generated Video Detection?

Mar 19, 2026

Random Wins All: Rethinking Grouping Strategies for Vision Tokens

Feb 28, 2026

UniWeTok: An Unified Binary Tokenizer with Codebook Size $\mathit{2^{128}}$ for Unified Multimodal Large Language Model

Feb 15, 2026

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Feb 15, 2026

Expand and Prune: Maximizing Trajectory Diversity for Effective GRPO in Generative Models

Dec 17, 2025

HAD: Hybrid Architecture Distillation Outperforms Teacher in Genomic Sequence Modeling

May 27, 2025

T^2Agent: A Tool-augmented Multimodal Misinformation Detection Agent with Monte Carlo Tree Search

May 26, 2025