Ping Luo

Cached Transformers: Improving Transformers with Differentiable Memory Cache

Dec 20, 2023
Zhaoyang Zhang, Wenqi Shao, Yixiao Ge, Xiaogang Wang, Jinwei Gu, Ping Luo

Via arXiv

SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution

Dec 18, 2023
Zhixuan Liang, Yao Mu, Hengbo Ma, Masayoshi Tomizuka, Mingyu Ding, Ping Luo

Via arXiv

You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception

Dec 09, 2023
Sheng Jin, Shuhuai Li, Tong Li, Wentao Liu, Chen Qian, Ping Luo

Via arXiv

GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation

Dec 07, 2023
Shoufa Chen, Mengmeng Xu, Jiawei Ren, Yuren Cong, Sen He, Yanping Xie, Animesh Sinha, Ping Luo, Tao Xiang, Juan-Manuel Perez-Rua

Via arXiv

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation

Dec 06, 2023
Zhouxia Wang, Ziyang Yuan, Xintao Wang, Tianshui Chen, Menghan Xia, Ping Luo, Ying Shan

Via arXiv

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark

Dec 03, 2023
Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Yi Liu, Zun Wang, Jilan Xu, Guo Chen, Ping Luo, Limin Wang, Yu Qiao

Via arXiv

MLLMs-Augmented Visual-Language Representation Learning

Dec 01, 2023
Yanqing Liu, Kai Wang, Wenqi Shao, Ping Luo, Yu Qiao, Mike Zheng Shou, Kaipeng Zhang, Yang You

Via arXiv

Advancing Vision Transformers with Group-Mix Attention

Nov 26, 2023
Chongjian Ge, Xiaohan Ding, Zhan Tong, Li Yuan, Jiangliu Wang, Yibing Song, Ping Luo

Via arXiv