Xiaohan Wang

Transfer Learning for Classification under Decision Rule Drift with Application to Optimal Individualized Treatment Rule Estimation

Aug 28, 2025
Via arXiv

Topology Guidance: Controlling the Outputs of Generative Models via Vector Field Topology

May 11, 2025
Via arXiv

Video Action Differencing

Mar 10, 2025
Via arXiv

SurgiSAM2: Fine-tuning a foundational model for surgical video anatomy segmentation and detection

Mar 05, 2025
Via arXiv

Temporal Preference Optimization for Long-Form Video Understanding

Jan 23, 2025
Via arXiv

BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

Jan 14, 2025
Via arXiv

Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation

Jan 06, 2025
Via arXiv

DeepSeek-V3 Technical Report

Dec 27, 2024
Via arXiv

Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration

Dec 17, 2024
Via arXiv

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Dec 13, 2024
Via arXiv