Picture for Shen Yan

Shen Yan

Generalizable Multispectral Land Cover Classification via Frequency-Aware Mixture of Low-Rank Token Experts

Add code
May 20, 2025
Viaarxiv icon

Model Merging in Pre-training of Large Language Models

Add code
May 17, 2025
Viaarxiv icon

Seed1.5-VL Technical Report

Add code
May 11, 2025
Viaarxiv icon

Efficient Pretraining Length Scaling

Add code
Apr 21, 2025
Viaarxiv icon

CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning

Add code
Mar 25, 2025
Viaarxiv icon

NTR-Gaussian: Nighttime Dynamic Thermal Reconstruction with 4D Gaussian Splatting Based on Thermodynamics

Add code
Mar 05, 2025
Viaarxiv icon

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

Add code
Feb 13, 2025
Figure 1 for MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
Figure 2 for MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
Figure 3 for MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
Figure 4 for MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
Viaarxiv icon

SparseFocus: Learning-based One-shot Autofocus for Microscopy with Sparse Content

Add code
Feb 10, 2025
Figure 1 for SparseFocus: Learning-based One-shot Autofocus for Microscopy with Sparse Content
Figure 2 for SparseFocus: Learning-based One-shot Autofocus for Microscopy with Sparse Content
Figure 3 for SparseFocus: Learning-based One-shot Autofocus for Microscopy with Sparse Content
Figure 4 for SparseFocus: Learning-based One-shot Autofocus for Microscopy with Sparse Content
Viaarxiv icon

CompCap: Improving Multimodal Large Language Models with Composite Captions

Add code
Dec 06, 2024
Figure 1 for CompCap: Improving Multimodal Large Language Models with Composite Captions
Figure 2 for CompCap: Improving Multimodal Large Language Models with Composite Captions
Figure 3 for CompCap: Improving Multimodal Large Language Models with Composite Captions
Figure 4 for CompCap: Improving Multimodal Large Language Models with Composite Captions
Viaarxiv icon

Autoregressive Models in Vision: A Survey

Add code
Nov 08, 2024
Figure 1 for Autoregressive Models in Vision: A Survey
Figure 2 for Autoregressive Models in Vision: A Survey
Figure 3 for Autoregressive Models in Vision: A Survey
Figure 4 for Autoregressive Models in Vision: A Survey
Viaarxiv icon