Yulin Wang

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Apr 18, 2025
Via arXiv

EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance

Apr 17, 2025
Via arXiv

XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?

Mar 31, 2025
Via arXiv

MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts

Mar 18, 2025
Via arXiv

LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching

Mar 16, 2025
Via arXiv

RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing

Mar 13, 2025
Via arXiv

Uni-AdaFocus: Spatial-temporal Dynamic Computation for Video Recognition

Dec 15, 2024
Via arXiv

ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis

Nov 11, 2024
Via arXiv

DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution

Nov 04, 2024
Via arXiv

Towards General Text-guided Image Synthesis for Customized Multimodal Brain MRI Generation

Sep 25, 2024
Via arXiv