Miso Choi

SpatiO: Adaptive Test-Time Orchestration of Vision-Language Agents for Spatial Reasoning

Apr 23, 2026
Via arXiv

Focus, Don't Prune: Identifying Instruction-Relevant Regions for Information-Rich Image Understanding

Mar 24, 2026
Via arXiv

Transferable Model-agnostic Vision-Language Model Adaptation for Efficient Weak-to-Strong Generalization

Aug 13, 2025
Via arXiv

Open-vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models

Aug 18, 2023
Via arXiv