Jiebo Luo

SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning

Mar 24, 2026
Via arXiv

VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking

Mar 20, 2026
Via arXiv

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

Mar 17, 2026
Via arXiv

Tri-Prompting: Video Diffusion with Unified Control over Scene, Subject, and Motion

Mar 16, 2026
Via arXiv

JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation

Feb 22, 2026
Via arXiv

UniSurg: A Video-Native Foundation Model for Universal Understanding of Surgical Videos

Feb 05, 2026
Via arXiv

StreamSense: Streaming Social Task Detection with Selective Vision-Language Model Routing

Jan 30, 2026
Via arXiv

Sphinx: Benchmarking and Modeling for LLM-Driven Pull Request Review

Jan 06, 2026
Via arXiv

A Versatile Multimodal Agent for Multimedia Content Generation

Jan 06, 2026
Via arXiv

LibContinual: A Comprehensive Library towards Realistic Continual Learning

Dec 26, 2025
Via arXiv