Siyou Li

GlazyBench: A Benchmark for Ceramic Glaze Property Prediction and Image Generation

May 07, 2026
Via arXiv

ViewSAM: Learning View-aware Cross-modal Semantics for Weakly Supervised Cross-view Referring Multi-Object Tracking

May 04, 2026
Via arXiv

NovBench: Evaluating Large Language Models on Academic Paper Novelty Assessment

Apr 13, 2026
Via arXiv

Seeing the Forest and the Trees: Query-Aware Tokenizer for Long-Video Multimodal Language Models

Nov 14, 2025
Via arXiv

CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following

Jun 14, 2025
Via arXiv

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

May 19, 2025
Via arXiv

ViT3D Alignment of LLaMA3: 3D Medical Image Report Generation

Oct 11, 2024
Via arXiv