Alert button
Picture for Yixiao Ge

Yixiao Ge

Alert button

YOLO-World: Real-Time Open-Vocabulary Object Detection

Feb 02, 2024
Tianheng Cheng, Lin Song, Yixiao Ge, Wenyu Liu, Xinggang Wang, Ying Shan

Viaarxiv icon

Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities

Jan 25, 2024
Yiyuan Zhang, Xiaohan Ding, Kaixiong Gong, Yixiao Ge, Ying Shan, Xiangyu Yue

Viaarxiv icon

Supervised Fine-tuning in turn Improves Visual Foundation Models

Jan 18, 2024
Xiaohu Jiang, Yixiao Ge, Yuying Ge, Chun Yuan, Ying Shan

Viaarxiv icon

Towards A Better Metric for Text-to-Video Generation

Jan 15, 2024
Jay Zhangjie Wu, Guian Fang, Haoning Wu, Xintao Wang, Yixiao Ge, Xiaodong Cun, David Junhao Zhang, Jia-Wei Liu, Yuchao Gu, Rui Zhao, Weisi Lin, Wynne Hsu, Ying Shan, Mike Zheng Shou

Viaarxiv icon

LLaMA Pro: Progressive LLaMA with Block Expansion

Jan 04, 2024
Chengyue Wu, Yukang Gan, Yixiao Ge, Zeyu Lu, Jiahao Wang, Ye Feng, Ping Luo, Ying Shan

Viaarxiv icon

Cached Transformers: Improving Transformers with Differentiable Memory Cache

Dec 20, 2023
Zhaoyang Zhang, Wenqi Shao, Yixiao Ge, Xiaogang Wang, Jinwei Gu, Ping Luo

Viaarxiv icon

VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation

Dec 14, 2023
Jinguo Zhu, Xiaohan Ding, Yixiao Ge, Yuying Ge, Sijie Zhao, Hengshuang Zhao, Xiaohua Wang, Ying Shan

Viaarxiv icon

SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models

Dec 11, 2023
Yuzhou Huang, Liangbin Xie, Xintao Wang, Ziyang Yuan, Xiaodong Cun, Yixiao Ge, Jiantao Zhou, Chao Dong, Rui Huang, Ruimao Zhang, Ying Shan

Viaarxiv icon

EgoPlan-Bench: Benchmarking Egocentric Embodied Planning with Multimodal Large Language Models

Dec 11, 2023
Yi Chen, Yuying Ge, Yixiao Ge, Mingyu Ding, Bohao Li, Rui Wang, Ruifeng Xu, Ying Shan, Xihui Liu

Figure 1 for EgoPlan-Bench: Benchmarking Egocentric Embodied Planning with Multimodal Large Language Models
Figure 2 for EgoPlan-Bench: Benchmarking Egocentric Embodied Planning with Multimodal Large Language Models
Figure 3 for EgoPlan-Bench: Benchmarking Egocentric Embodied Planning with Multimodal Large Language Models
Figure 4 for EgoPlan-Bench: Benchmarking Egocentric Embodied Planning with Multimodal Large Language Models
Viaarxiv icon