Gedas Bertasius

LoCoNet: Long-Short Context Network for Active Speaker Detection

Jan 19, 2023
Xizi Wang, Feng Cheng, Gedas Bertasius, David Crandall

Efficient Movie Scene Detection using State-Space Transformers

Dec 29, 2022
Md Mohaiminul Islam, Mahmudul Hasan, Kishan Shamsundar Athrey, Tony Braskich, Gedas Bertasius

Vision Transformers are Parameter-Efficient Audio-Visual Learners

Dec 15, 2022
Yan-Bo Lin, Yi-Lin Sung, Jie Lei, Mohit Bansal, Gedas Bertasius

VindLU: A Recipe for Effective Video-and-Language Pretraining

Dec 09, 2022
Feng Cheng, Xizi Wang, Jie Lei, David Crandall, Mohit Bansal, Gedas Bertasius

Improving video retrieval using multilingual knowledge transfer

Aug 28, 2022
Avinash Madasu, Estelle Aflalo, Gabriela Ben Melech Stan, Shao-Yen Tseng, Gedas Bertasius, Vasudev Lal

Object State Change Classification in Egocentric Videos using the Divided Space-Time Attention Mechanism

Jul 24, 2022
Md Mohaiminul Islam, Gedas Bertasius

Learning to Retrieve Videos by Asking Questions

May 13, 2022
Avinash Madasu, Junier Oliva, Gedas Bertasius

ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound

Apr 06, 2022
Yan-Bo Lin, Jie Lei, Mohit Bansal, Gedas Bertasius

Long Movie Clip Classification with State-Space Video Models

Apr 04, 2022
Md Mohaiminul Islam, Gedas Bertasius
