Jaemin Cho

CAPTURe: Evaluating Spatial Reasoning in Vision Language Models via Occluded Object Counting

Apr 21, 2025
Via arXiv

Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems

Apr 14, 2025
Via arXiv

Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization

Apr 11, 2025
Via arXiv

Polarimetric BSSRDF Acquisition of Dynamic Faces

Dec 29, 2024
Via arXiv

VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

Nov 22, 2024
Via arXiv

M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding

Nov 07, 2024
Via arXiv

DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback

Oct 08, 2024
Via arXiv

DOCCI: Descriptions of Connected and Contrasting Images

Apr 30, 2024
Via arXiv

Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

Apr 15, 2024
Via arXiv

Rethinking Interactive Image Segmentation with Low Latency, High Quality, and Diverse Prompts

Mar 31, 2024
Via arXiv