Liangzhe Yuan

VideoPrism: A Foundational Visual Encoder for Video Understanding

Feb 20, 2024
Long Zhao, Nitesh B. Gundavarapu, Liangzhe Yuan, Hao Zhou, Shen Yan, Jennifer J. Sun, Luke Friedman, Rui Qian, Tobias Weyand, Yue Zhao, Rachel Hornung, Florian Schroff, Ming-Hsuan Yang, David A. Ross, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko, Ting Liu, Boqing Gong

Distilling Vision-Language Models on Millions of Videos

Jan 11, 2024
Yue Zhao, Long Zhao, Xingyi Zhou, Jialin Wu, Chun-Te Chu, Hui Miao, Florian Schroff, Hartwig Adam, Ting Liu, Boqing Gong, Philipp Krähenbühl, Liangzhe Yuan

PolyMaX: General Dense Prediction with Mask Transformer

Nov 09, 2023
Xuan Yang, Liangzhe Yuan, Kimberly Wilber, Astuti Sharma, Xiuye Gu, Siyuan Qiao, Stephanie Debats, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko, Liang-Chieh Chen

Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition

Aug 23, 2023
Qitong Wang, Long Zhao, Liangzhe Yuan, Ting Liu, Xi Peng

VideoGLUE: Video General Understanding Evaluation of Foundation Models

Jul 06, 2023
Liangzhe Yuan, Nitesh Bharadwaj Gundavarapu, Long Zhao, Hao Zhou, Yin Cui, Lu Jiang, Xuan Yang, Menglin Jia, Tobias Weyand, Luke Friedman, Mikhail Sirotenko, Huisheng Wang, Florian Schroff, Hartwig Adam, Ming-Hsuan Yang, Ting Liu, Boqing Gong

Spatiotemporally Discriminative Video-Language Pre-Training with Text Grounding

Mar 28, 2023
Yuanhao Xiong, Long Zhao, Boqing Gong, Ming-Hsuan Yang, Florian Schroff, Ting Liu, Cho-Jui Hsieh, Liangzhe Yuan

Unified Visual Relationship Detection with Vision and Language Models

Mar 16, 2023
Long Zhao, Liangzhe Yuan, Boqing Gong, Yin Cui, Florian Schroff, Ming-Hsuan Yang, Hartwig Adam, Ting Liu

Surrogate Gap Minimization Improves Sharpness-Aware Training

Mar 19, 2022
Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Yin Cui, Hartwig Adam, Nicha Dvornek, Sekhar Tatikonda, James Duncan, Ting Liu

Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision

Dec 09, 2021
Liangzhe Yuan, Rui Qian, Yin Cui, Boqing Gong, Florian Schroff, Ming-Hsuan Yang, Hartwig Adam, Ting Liu

Exploring Temporal Granularity in Self-Supervised Video Representation Learning

Dec 08, 2021
Rui Qian, Yeqing Li, Liangzhe Yuan, Boqing Gong, Ting Liu, Matthew Brown, Serge Belongie, Ming-Hsuan Yang, Hartwig Adam, Yin Cui
