Yehao Li

HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs

Mar 18, 2024
Ting Yao, Yehao Li, Yingwei Pan, Tao Mei

Control3D: Towards Controllable Text-to-3D Generation

Nov 09, 2023
Yang Chen, Yingwei Pan, Yehao Li, Ting Yao, Tao Mei

Semantic-Conditional Diffusion Networks for Image Captioning

Dec 06, 2022
Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Jianlin Feng, Hongyang Chao, Tao Mei

SPE-Net: Boosting Point Cloud Analysis via Rotation Robustness Enhancement

Nov 15, 2022
Zhaofan Qiu, Yehao Li, Yu Wang, Yingwei Pan, Ting Yao, Tao Mei

Dual Vision Transformer

Jul 12, 2022
Ting Yao, Yehao Li, Yingwei Pan, Yu Wang, Xiao-Ping Zhang, Tao Mei

Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning

Jul 11, 2022
Ting Yao, Yingwei Pan, Yehao Li, Chong-Wah Ngo, Tao Mei

Comprehending and Ordering Semantics for Image Captioning

Jun 14, 2022
Yehao Li, Yingwei Pan, Ting Yao, Tao Mei

Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and Heuristic Rule-based Methods for Object Manipulation

Jun 13, 2022
Yingwei Pan, Yehao Li, Yiheng Zhang, Qi Cai, Fuchen Long, Zhaofan Qiu, Ting Yao, Tao Mei

Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training

Jan 11, 2022
Yehao Li, Jiahao Fan, Yingwei Pan, Ting Yao, Weiyao Lin, Tao Mei

CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising

Dec 14, 2021
Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Hongyang Chao, Tao Mei
