Shen Zhao

A1: A Fully Transparent Open-Source, Adaptive and Efficient Truncated Vision-Language-Action Model

Apr 07, 2026
Via arXiv

ATM-Net: Anatomy-Aware Text-Guided Multi-Modal Fusion for Fine-Grained Lumbar Spine Segmentation

Apr 04, 2025
Via arXiv

AnomalyControl: Learning Cross-modal Semantic Features for Controllable Anomaly Synthesis

Dec 10, 2024
Via arXiv

VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation

Nov 14, 2024
Via arXiv

Whole Heart Perfusion with High-Multiband Simultaneous Multislice Imaging via Linear Phase Modulated Extended Field of View (SMILE)

Sep 06, 2024
Via arXiv

A Population-to-individual Tuning Framework for Adapting Pretrained LM to On-device User Intent Prediction

Aug 19, 2024
Via arXiv

Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification

Jun 05, 2024
Via arXiv

VG4D: Vision-Language Model Goes 4D Video Recognition

Apr 17, 2024
Via arXiv

ModelNet-O: A Large-Scale Synthetic Dataset for Occlusion-Aware Point Cloud Classification

Jan 16, 2024
Via arXiv

Explore Human Parsing Modality for Action Recognition

Jan 04, 2024
Via arXiv