Jincheng Gao

When to Lock Attention: Training-Free KV Control in Video Diffusion

Mar 10, 2026
Via arXiv

AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios

Feb 26, 2026
Via arXiv

DTU-Net: A Multi-Scale Dilated Transformer Network for Nonlinear Hyperspectral Unmixing

Mar 06, 2025
Via arXiv

MLLM-SUL: Multimodal Large Language Model for Semantic Scene Understanding and Localization in Traffic Scenarios

Dec 27, 2024
Via arXiv