Kumara Kahatapitiya

Understanding Long Videos in One Multimodal Language Model Pass

Mar 25, 2024
Kanchana Ranasinghe, Xiang Li, Kumara Kahatapitiya, Michael S. Ryoo

Language Repository for Long Video Understanding

Mar 21, 2024
Kumara Kahatapitiya, Kanchana Ranasinghe, Jongwoo Park, Michael S. Ryoo

Object-Centric Diffusion for Efficient Video Editing

Jan 11, 2024
Kumara Kahatapitiya, Adil Karjauv, Davide Abati, Fatih Porikli, Yuki M. Asano, Amirhossein Habibian

VicTR: Video-conditioned Text Representations for Activity Recognition

Apr 05, 2023
Kumara Kahatapitiya, Anurag Arnab, Arsha Nagrani, Michael S. Ryoo

Token Turing Machines

Nov 16, 2022
Michael S. Ryoo, Keerthana Gopalakrishnan, Kumara Kahatapitiya, Ted Xiao, Kanishka Rao, Austin Stone, Yao Lu, Julian Ibarz, Anurag Arnab

Grafting Vision Transformers

Oct 28, 2022
Jongwoo Park, Kumara Kahatapitiya, Donghyun Kim, Shivchander Sudalairaj, Quanfu Fan, Michael S. Ryoo

MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection

Dec 07, 2021
Rui Dai, Srijan Das, Kumara Kahatapitiya, Michael S. Ryoo, Francois Bremond

SWAT: Spatial Structure Within and Among Tokens

Nov 26, 2021
Kumara Kahatapitiya, Michael S. Ryoo

Self-supervised Pretraining with Classification Labels for Temporal Activity Detection

Nov 26, 2021
Kumara Kahatapitiya, Zhou Ren, Haoxiang Li, Zhenyu Wu, Michael S. Ryoo

Coarse-Fine Networks for Temporal Activity Detection in Videos

Apr 01, 2021
Kumara Kahatapitiya, Michael S. Ryoo
