Qibin Hou

Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought

Mar 24, 2026

Mixture of Style Experts for Diverse Image Stylization

Mar 17, 2026

GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics

Feb 13, 2026

Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions

Feb 13, 2026

InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery

Feb 09, 2026

Trust but Verify: Adaptive Conditioning for Reference-Based Diffusion Super-Resolution via Implicit Reference Correlation Modeling

Feb 02, 2026

OmniSegmentor: A Flexible Multi-Modal Learning Framework for Semantic Segmentation

Sep 18, 2025

Revisiting Efficient Semantic Segmentation: Learning Offsets for Better Spatial and Class Feature Alignment

Aug 12, 2025

Depth Anything at Any Condition

Jul 02, 2025

Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning

May 18, 2025