Dahun Kim

Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities

Nov 13, 2023
AJ Piergiovanni, Isaac Noble, Dahun Kim, Michael S. Ryoo, Victor Gomes, Anelia Angelova

Detection-Oriented Image-Text Pretraining for Open-Vocabulary Detection

Sep 29, 2023
Dahun Kim, Anelia Angelova, Weicheng Kuo

Contrastive Feature Masking Open-Vocabulary Vision Transformer

Sep 02, 2023
Dahun Kim, Anelia Angelova, Weicheng Kuo

Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation

Aug 03, 2023
Minsu Kim, Jeongsoo Choi, Dahun Kim, Yong Man Ro

Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers

May 11, 2023
Dahun Kim, Anelia Angelova, Weicheng Kuo

RECLIP: Resource-efficient CLIP by Training with Small Images

Apr 12, 2023
Runze Li, Dahun Kim, Bir Bhanu, Weicheng Kuo

Neural Image-based Avatars: Generalizable Radiance Fields for Human Avatar Modeling

Apr 10, 2023
Youngjoong Kwon, Dahun Kim, Duygu Ceylan, Henry Fuchs

Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation

Apr 10, 2023
Inkyu Shin, Dahun Kim, Qihang Yu, Jun Xie, Hong-Seok Kim, Bradley Green, In So Kweon, Kuk-Jin Yoon, Liang-Chieh Chen

MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks

Mar 30, 2023
Weicheng Kuo, AJ Piergiovanni, Dahun Kim, Xiyang Luo, Ben Caine, Wei Li, Abhijit Ogale, Luowei Zhou, Andrew Dai, Zhifeng Chen, Claire Cui, Anelia Angelova
