Reuben Tan

Koala: Key frame-conditioned long video-LLM

Apr 05, 2024
Via arXiv

Socratis: Are large multimodal models emotionally aware?

Sep 05, 2023
Via arXiv

Multiscale Video Pretraining for Long-Term Activity Forecasting

Jul 24, 2023
Via arXiv

EgoAdapt: A multi-stream evaluation study of adaptation to real-world egocentric user video

Jul 11, 2023
Via arXiv

Language-Guided Audio-Visual Source Separation via Trimodal Consistency

Mar 28, 2023
Via arXiv

NewsStories: Illustrating articles with visual summaries

Aug 14, 2022
Via arXiv

Look at What I'm Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos

Oct 20, 2021
Via arXiv

Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News

Sep 24, 2020
Via arXiv

wMAN: Weakly-supervised Moment Alignment Network for Text-based Video Segment Retrieval

Sep 27, 2019
Via arXiv

Learning Similarity Conditions Without Explicit Supervision

Aug 22, 2019
Via arXiv