Hang Song

NTIRE 2026 Challenge on Short-form UGC Video Restoration in the Wild with Generative Models: Datasets, Methods and Results

Apr 12, 2026
Via arXiv

Hybrid Physical and Geometrical Optics Method for Modeling Subsurface Imaging Using mmWave FMCW Radar

Apr 11, 2026
Via arXiv

NTIRE 2026 3D Restoration and Reconstruction in Real-world Adverse Conditions: RealX3D Challenge Results

Apr 05, 2026
Via arXiv

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Jan 08, 2026
Via arXiv

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

Dec 18, 2025
Via arXiv

EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing

Dec 12, 2025
Via arXiv

Astra: A Multi-Agent System for GPU Kernel Performance Optimization

Sep 09, 2025
Via arXiv

Mixed-R1: Unified Reward Perspective For Reasoning Capability in Multimodal Large Language Models

May 30, 2025
Via arXiv

Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models

Nov 14, 2024
Via arXiv

UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model

Aug 05, 2024
Via arXiv