Zehuan Yuan

Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-commerce

Apr 06, 2023

Multi-Level Contrastive Learning for Dense Prediction Task

Apr 04, 2023

Universal Instance Perception as Object Discovery and Retrieval

Mar 12, 2023

Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling

Jan 10, 2023

QueryPose: Sparse Multi-Person Pose Regression via Spatial-Aware Part-Level Query

Dec 15, 2022

Learning Object-Language Alignments for Open-Vocabulary Object Detection

Nov 27, 2022

Self-supervised Video Representation Learning with Motion-Aware Masked Autoencoders

Oct 09, 2022

MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning

Oct 09, 2022

ManiCLIP: Multi-Attribute Face Manipulation from Text

Oct 02, 2022

Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding

Sep 27, 2022