Shun Inadumi

Disambiguating Reference in Visually Grounded Dialogues through Joint Modeling of Textual and Multimodal Semantic Structures

May 16, 2025
Via arXiv

A Gaze-grounded Visual Question Answering Dataset for Clarifying Ambiguous Japanese Questions

Mar 26, 2024
Via arXiv