Zongxia Li

First Frame Is the Place to Go for Video Content Customization

Nov 19, 2025

VisPlay: Self-Evolving Vision-Language Models from Images

Nov 19, 2025

VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning

Oct 01, 2025

Self-Rewarding Vision-Language Model via Reasoning Decomposition

Aug 27, 2025

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Aug 07, 2025

Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation

Jun 18, 2025

VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations for Synthetic Videos

May 02, 2025

Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators

Mar 09, 2025

Large Language Models Struggle to Describe the Haystack without Human Help: Human-in-the-loop Evaluation of LLMs

Feb 20, 2025

Benchmark Evaluations, Applications, and Challenges of Large Vision Language Models: A Survey

Jan 04, 2025