Wentao Zhang

CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation

Jan 15, 2026
Via arXiv

Panning for Gold: Expanding Domain-Specific Knowledge Graphs with General Knowledge

Jan 15, 2026
Via arXiv

GIFT: Unlocking Global Optimality in Post-Training via Finite-Temperature Gibbs Initialization

Jan 14, 2026
Via arXiv

RAGShaper: Eliciting Sophisticated Agentic RAG Skills via Automated Data Synthesis

Jan 13, 2026
Via arXiv

Advancing ESG Intelligence: An Expert-level Agent and Comprehensive Benchmark for Sustainable Finance

Jan 13, 2026
Via arXiv

DocDancer: Towards Agentic Document-Grounded Information Seeking

Jan 08, 2026
Via arXiv

Agri-R1: Empowering Generalizable Agricultural Reasoning in Vision-Language Models with Reinforcement Learning

Jan 08, 2026
Via arXiv

CPJ: Explainable Agricultural Pest Diagnosis via Caption-Prompt-Judge with LLM-Judged Refinement

Dec 31, 2025
Via arXiv

Leash: Adaptive Length Penalty and Reward Shaping for Efficient Large Reasoning Model

Dec 25, 2025
Via arXiv

Generative Giants, Retrieval Weaklings: Why do Multimodal Large Language Models Fail at Multimodal Retrieval?

Dec 22, 2025
Via arXiv