Jinwoo Choi

Why Can't I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action Recognition

Jan 22, 2026

NOAH: Benchmarking Narrative Prior driven Hallucination and Omission in Video Large Language Models

Nov 09, 2025

Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition

Nov 05, 2025

ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning

Aug 14, 2025

Universal Domain Adaptation for Semantic Segmentation

May 28, 2025

Dynamic Contrastive Skill Learning with State-Transition Based Skill Clustering and Dynamic Length Adjustment

Apr 21, 2025

PCBEAR: Pose Concept Bottleneck for Explainable Action Recognition

Apr 17, 2025

CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition

Mar 30, 2025

MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations

Mar 20, 2025

The Geometry of Optimal Gait Families for Steering Kinematic Locomoting Systems

Feb 24, 2025