Zongxia Li

A Cookbook of 3D Vision: Data, Learning Paradigms, and Application

Jun 02, 2026
Via arXiv

G-Zero: Self-Play for Open-Ended Generation from Zero Data

May 11, 2026
Via arXiv

Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks

Apr 22, 2026
Via arXiv

Graph of Skills: Dependency-Aware Structural Retrieval for Massive Agent Skills

Apr 07, 2026
Via arXiv

SABER: A Stealthy Agentic Black-Box Attack Framework for Vision-Language-Action Models

Mar 26, 2026
Via arXiv

MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data

Mar 10, 2026
Via arXiv

VisPlay: Self-Evolving Vision-Language Models from Images

Nov 19, 2025
Via arXiv

First Frame Is the Place to Go for Video Content Customization

Nov 19, 2025
Via arXiv

VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning

Oct 01, 2025
Via arXiv

Self-Rewarding Vision-Language Model via Reasoning Decomposition

Aug 27, 2025
Via arXiv