Yifei Huang

Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding

Mar 14, 2024
Guo Chen, Yifei Huang, Jilan Xu, Baoqi Pei, Zhe Chen, Zhiqi Li, Jiahao Wang, Kunchang Li, Tong Lu, Limin Wang

Via arXiv

FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotation

Feb 01, 2024
Takuma Yagi, Misaki Ohashi, Yifei Huang, Ryosuke Furuta, Shungo Adachi, Toutai Mitsuyama, Yoichi Sato

Via arXiv

Retrieval-Augmented Egocentric Video Captioning

Jan 03, 2024
Jilan Xu, Yifei Huang, Junlin Hou, Guo Chen, Yuejie Zhang, Rui Feng, Weidi Xie

Via arXiv

MoVQA: A Benchmark of Versatile Question-Answering for Long-Form Movie Understanding

Dec 08, 2023
Hongjie Zhang, Yi Liu, Lu Dong, Yifei Huang, Zhen-Hua Ling, Yali Wang, Limin Wang, Yu Qiao

Via arXiv

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Nov 30, 2023
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei Huang, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray

Via arXiv

Pretraining Language Models with Text-Attributed Heterogeneous Graphs

Oct 23, 2023
Tao Zou, Le Yu, Yifei Huang, Leilei Sun, Bowen Du

Via arXiv

Proposal-based Temporal Action Localization with Point-level Supervision

Oct 09, 2023
Yuan Yin, Yifei Huang, Ryosuke Furuta, Yoichi Sato

Via arXiv

Memory-and-Anticipation Transformer for Online Action Understanding

Aug 15, 2023
Jiahao Wang, Guo Chen, Yifei Huang, Limin Wang, Tong Lu

Via arXiv

VideoLLM: Modeling Video Sequence with Large Language Models

May 23, 2023
Guo Chen, Yin-Dong Zheng, Jiahao Wang, Jilan Xu, Yifei Huang, Junting Pan, Yi Wang, Yali Wang, Yu Qiao, Tong Lu, Limin Wang

Via arXiv