Kwonjoon Lee

Learning Physical Interaction Skills from Human Demonstrations

Jul 28, 2025

Efficient MAP Estimation of LLM Judgment Performance with Prior Transfer

Apr 17, 2025

GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks

Mar 09, 2025

Can Hallucination Correction Improve Video-Language Alignment?

Feb 20, 2025

Generalized Mission Planning for Heterogeneous Multi-Robot Teams via LLM-constructed Hierarchical Trees

Jan 27, 2025

Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge

Nov 05, 2024

Estimating Ego-Body Pose from Doubly Sparse Egocentric Video Data

Nov 05, 2024

Symbolic Graph Inference for Compound Scene Understanding

Oct 30, 2024

M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models

Jul 19, 2024

Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models

Jul 14, 2024