
Liangzhe Yuan

VideoPrism: A Foundational Visual Encoder for Video Understanding

Feb 20, 2024
Via arXiv

Distilling Vision-Language Models on Millions of Videos

Jan 11, 2024
Via arXiv

PolyMaX: General Dense Prediction with Mask Transformer

Nov 09, 2023
Via arXiv

Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition

Aug 23, 2023
Via arXiv

VideoGLUE: Video General Understanding Evaluation of Foundation Models

Jul 06, 2023
Via arXiv

Spatiotemporally Discriminative Video-Language Pre-Training with Text Grounding

Mar 28, 2023
Via arXiv

Unified Visual Relationship Detection with Vision and Language Models

Mar 16, 2023
Via arXiv

Surrogate Gap Minimization Improves Sharpness-Aware Training

Mar 19, 2022
Via arXiv

Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision

Dec 09, 2021
Via arXiv

Exploring Temporal Granularity in Self-Supervised Video Representation Learning

Dec 08, 2021
Via arXiv