MARMOT: A Deep Learning Framework for Constructing Multimodal Representations for Vision-and-Language Tasks

Add code
Sep 23, 2021
Figure 1 for MARMOT: A Deep Learning Framework for Constructing Multimodal Representations for Vision-and-Language Tasks
Figure 2 for MARMOT: A Deep Learning Framework for Constructing Multimodal Representations for Vision-and-Language Tasks
Figure 3 for MARMOT: A Deep Learning Framework for Constructing Multimodal Representations for Vision-and-Language Tasks
Figure 4 for MARMOT: A Deep Learning Framework for Constructing Multimodal Representations for Vision-and-Language Tasks

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: