Yifei Huang

TextCenGen: Attention-Guided Text-Centric Background Adaptation for Text-to-Image Generation

Apr 18, 2024
Tianyi Liang, Jiangqi Liu, Sicheng Song, Shiqi Jiang, Yifei Huang, Changbo Wang, Chenhui Li

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World

Mar 24, 2024
Yifei Huang, Guo Chen, Jilan Xu, Mingfang Zhang, Lijin Yang, Baoqi Pei, Hongjie Zhang, Lu Dong, Yali Wang, Limin Wang, Yu Qiao

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

Mar 22, 2024
Yi Wang, Kunchang Li, Xinhao Li, Jiashuo Yu, Yinan He, Guo Chen, Baoqi Pei, Rongkun Zheng, Jilan Xu, Zun Wang, Yansong Shi, Tianxiang Jiang, Songze Li, Hongjie Zhang, Yifei Huang, Yu Qiao, Yali Wang, Limin Wang

Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding

Mar 14, 2024
Guo Chen, Yifei Huang, Jilan Xu, Baoqi Pei, Zhe Chen, Zhiqi Li, Jiahao Wang, Kunchang Li, Tong Lu, Limin Wang

FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotation

Feb 01, 2024
Takuma Yagi, Misaki Ohashi, Yifei Huang, Ryosuke Furuta, Shungo Adachi, Toutai Mitsuyama, Yoichi Sato

Retrieval-Augmented Egocentric Video Captioning

Jan 03, 2024
Jilan Xu, Yifei Huang, Junlin Hou, Guo Chen, Yuejie Zhang, Rui Feng, Weidi Xie

MoVQA: A Benchmark of Versatile Question-Answering for Long-Form Movie Understanding

Dec 08, 2023
Hongjie Zhang, Yi Liu, Lu Dong, Yifei Huang, Zhen-Hua Ling, Yali Wang, Limin Wang, Yu Qiao

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Nov 30, 2023
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei Huang, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray

Pretraining Language Models with Text-Attributed Heterogeneous Graphs

Oct 23, 2023
Tao Zou, Le Yu, Yifei Huang, Leilei Sun, Bowen Du
