Jiebo Luo

VisualActBench: Can VLMs See and Act like a Human?

Dec 10, 2025

UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist

Nov 11, 2025

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

Oct 06, 2025

Assessing Historical Structural Oppression Worldwide via Rule-Guided Prompting of Large Language Models

Sep 18, 2025

PP-Motion: Physical-Perceptual Fidelity Evaluation for Human Motion Generation

Aug 11, 2025

EndoMatcher: Generalizable Endoscopic Image Matcher via Multi-Domain Pre-training for Robot-Assisted Surgery

Aug 07, 2025

Multimodal Causal-Driven Representation Learning for Generalizable Medical Image Segmentation

Aug 07, 2025

The Docking Game: Loop Self-Play for Fast, Dynamic, and Accurate Prediction of Flexible Protein-Ligand Binding

Aug 07, 2025

Diffusion Transformer-to-Mamba Distillation for High-Resolution Image Generation

Jun 23, 2025

Unleashing Hour-Scale Video Training for Long Video-Language Understanding

Jun 05, 2025