Zhengzhong Tu

Ben

AirV2X: Unified Air-Ground Vehicle-to-Everything Collaboration

Jun 24, 2025
Via arXiv

Demystifying the Visual Quality Paradox in Multimodal Large Language Models

Jun 18, 2025
Via arXiv

SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems

Jun 09, 2025
Via arXiv

MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning

May 30, 2025
Via arXiv

mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation

May 29, 2025
Via arXiv

DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models

May 29, 2025
Via arXiv

CAST: Contrastive Adaptation and Distillation for Semi-Supervised Instance Segmentation

May 29, 2025
Via arXiv

Simulating the Unseen: Crash Prediction Must Learn from What Did Not Happen

May 27, 2025
Via arXiv

VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction

May 26, 2025
Via arXiv

SounDiT: Geo-Contextual Soundscape-to-Landscape Generation

May 19, 2025
Via arXiv