Guankun Wang

TMR-VLA: Vision-Language-Action Model for Magnetic Motion Control of Tri-leg Silicone-based Soft Robot

Feb 28, 2026
Via arXiv

SurgAtt-Tracker: Online Surgical Attention Tracking via Temporal Proposal Reranking and Motion-Aware Refinement

Feb 24, 2026
Via arXiv

MedScope: Incentivizing "Think with Videos" for Clinical Reasoning via Coarse-to-Fine Tool Calling

Feb 11, 2026
Via arXiv

GeoLanG: Geometry-Aware Language-Guided Grasping with Unified RGB-D Multimodal Learning

Feb 04, 2026
Via arXiv

EndoARSS: Adapting Spatially-Aware Foundation Model for Efficient Activity Recognition and Semantic Segmentation in Endoscopic Surgery

Jun 07, 2025
Via arXiv

EndoVLA: Dual-Phase Vision-Language-Action Model for Autonomous Tracking in Endoscopy

May 21, 2025
Via arXiv

Can DeepSeek Reason Like a Surgeon? An Empirical Evaluation for Vision-Language Understanding in Robotic-Assisted Surgery

Apr 02, 2025
Via arXiv

EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery

Jan 20, 2025
Via arXiv

TSUBF-Net: Trans-Spatial UNet-like Network with Bi-direction Fusion for Segmentation of Adenoid Hypertrophy in CT

Dec 01, 2024
Via arXiv

ETSM: Automating Dissection Trajectory Suggestion and Confidence Map-Based Safety Margin Prediction for Robot-assisted Endoscopic Submucosal Dissection

Nov 28, 2024
Via arXiv