Wenqi Shao

CiQi-Agent: Aligning Vision, Tools and Aesthetics in Multimodal Agent for Cultural Reasoning on Chinese Porcelains

Mar 30, 2026
Via arXiv

Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development

Mar 29, 2026
Via arXiv

RetroAgent: From Solving to Evolving via Retrospective Dual Intrinsic Feedback

Mar 12, 2026
Via arXiv

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

Jan 26, 2026
Via arXiv

More Than One Teacher: Adaptive Multi-Guidance Policy Optimization for Diverse Exploration

Oct 02, 2025
Via arXiv

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Aug 25, 2025
Via arXiv

MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams

Aug 09, 2025
Via arXiv

Cardiac-CLIP: A Vision-Language Foundation Model for 3D Cardiac CT Images

Jul 29, 2025
Via arXiv

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law

Jul 24, 2025
Via arXiv

Flow-Anything: Learning Real-World Optical Flow Estimation from Large-Scale Single-view Images

Jun 09, 2025
Via arXiv