Linli Yao

DiffCap-Bench: A Comprehensive, Challenging, Robust Benchmark for Image Difference Captioning

May 06, 2026

Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents

Apr 07, 2026

TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions

Feb 09, 2026

DiaDem: Advancing Dialogue Descriptions in Audiovisual Video Captioning for Multimodal Large Language Models

Jan 27, 2026

Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence

Oct 23, 2025

Mitigating Overthinking through Reasoning Shaping

Oct 10, 2025

RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction

May 28, 2025

RICo: Refined In-Context Contribution for Automatic Instruction-Tuning Data Selection

May 18, 2025

ICon: In-Context Contribution for Automatic Data Selection

May 08, 2025

TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos

Apr 24, 2025