Jae Sung Park

Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness

Jul 02, 2024
Via arXiv

Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass

May 29, 2024
Via arXiv

Agent AI: Surveying the Horizons of Multimodal Interaction

Jan 07, 2024
Via arXiv

Localized Symbolic Knowledge Distillation for Visual Commonsense Models

Dec 12, 2023
Via arXiv

ArK: Augmented Reality with Knowledge Interactive Emergent Ability

May 01, 2023
Via arXiv

The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning

Feb 10, 2022
Via arXiv

MERLOT: Multimodal Neural Script Knowledge Models

Jun 10, 2021
Via arXiv

LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes

Jun 02, 2021
Via arXiv

Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs

Oct 15, 2020
Via arXiv

Identity-Aware Multi-Sentence Video Description

Aug 22, 2020
Via arXiv