Zhuofan Xia

Towards Sparse Video Understanding and Reasoning

Feb 14, 2026, via arXiv

Step by Step Network

Nov 18, 2025, via arXiv

Emulating Human-like Adaptive Vision for Efficient and Flexible Machine Visual Perception

Sep 18, 2025, via arXiv

Bridging the Divide: Reconsidering Softmax and Linear Attention

Dec 09, 2024, via arXiv

Training an Open-Vocabulary Monocular 3D Object Detection Model without 3D Data

Nov 23, 2024, via arXiv

Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators

Aug 11, 2024, via arXiv

Demystify Mamba in Vision: A Linear Attention Perspective

May 26, 2024, via arXiv

Agent Attention: On the Integration of Softmax and Linear Attention

Dec 22, 2023, via arXiv

GSVA: Generalized Segmentation via Multimodal Large Language Models

Dec 15, 2023, via arXiv

Generalized Activation via Multivariate Projection

Sep 29, 2023, via arXiv