Junyan Lin

Speak While Watching: Unleashing TRUE Real-Time Video Understanding Capability of Multimodal Large Language Models

Jan 11, 2026

Rethinking Visual Layer Selection in Multimodal LLMs

Apr 30, 2025

Dynamic Cross-Modal Feature Interaction Network for Hyperspectral and LiDAR Data Classification

Mar 10, 2025

Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices

Mar 08, 2025

Semantics Disentanglement and Composition for Versatile Codec toward both Human-eye Perception and Machine Vision Task

Dec 24, 2024

To Preserve or To Compress: An In-Depth Study of Connector Selection in Multimodal Large Language Models

Oct 09, 2024

Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs

Aug 16, 2024

Sparse Focus Network for Multi-Source Remote Sensing Data Classification

Jun 03, 2024

Boosting Spatial-Spectral Masked Auto-Encoder Through Mining Redundant Spectra for HSI-SAR/LiDAR Classification

Jun 03, 2024

SS-MAE: Spatial-Spectral Masked Auto-Encoder for Multi-Source Remote Sensing Image Classification

Nov 08, 2023