Ming Yan

Improved Visual Fine-tuning with Natural Language Supervision

Apr 04, 2023
Junyang Wang, Yuanhong Xu, Juhua Hu, Ming Yan, Jitao Sang, Qi Qian

CIMI4D: A Large Multimodal Climbing Motion Dataset under Human-scene Interactions

Mar 31, 2023
Ming Yan, Xin Wang, Yudi Dai, Siqi Shen, Chenglu Wen, Lan Xu, Yuexin Ma, Cheng Wang

Correspondence-Free Domain Alignment for Unsupervised Cross-Domain Image Retrieval

Feb 13, 2023
Xu Wang, Dezhong Peng, Ming Yan, Peng Hu

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video

Feb 01, 2023
Haiyang Xu, Qinghao Ye, Ming Yan, Yaya Shi, Jiabo Ye, Yuanhong Xu, Chenliang Li, Bin Bi, Qi Qian, Wei Wang, Guohai Xu, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou

Learning Trajectory-Word Alignments for Video-Language Tasks

Jan 06, 2023
Xu Yang, Zhangzikang Li, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Ming Yan, Yu Zhang, Fei Huang, Songfang Huang

HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training

Dec 30, 2022
Qinghao Ye, Guohai Xu, Ming Yan, Haiyang Xu, Qi Qian, Ji Zhang, Fei Huang

FedRolex: Model-Heterogeneous Federated Learning with Rolling Sub-Model Extraction

Dec 03, 2022
Samiul Alam, Luyang Liu, Ming Yan, Mi Zhang

Zero-shot Image Captioning by Anchor-augmented Vision-Language Space Alignment

Nov 14, 2022
Junyang Wang, Yi Zhang, Ming Yan, Ji Zhang, Jitao Sang

Eye-tracking based classification of Mandarin Chinese readers with and without dyslexia using neural sequence models

Oct 18, 2022
Patrick Haller, Andreas Säuberli, Sarah Elisabeth Kiener, Jinger Pan, Ming Yan, Lena Jäger

Communication-Efficient Topologies for Decentralized Learning with $O(1)$ Consensus Rate

Oct 14, 2022
Zhuoqing Song, Weijian Li, Kexin Jin, Lei Shi, Ming Yan, Wotao Yin, Kun Yuan