Jianlong Fu

MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

Dec 19, 2022
Via arXiv

Weakly-supervised Pre-training for 3D Human Pose Estimation via Perspective Knowledge

Nov 22, 2022
Via arXiv

Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning

Oct 12, 2022
Via arXiv

Fine-Grained Image Style Transfer with Visual Transformers

Oct 11, 2022
Via arXiv

CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment

Sep 23, 2022
Via arXiv

AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation

Sep 08, 2022
Via arXiv

4D LUT: Learnable Context-Aware 4D Lookup Table for Image Enhancement

Sep 05, 2022
Via arXiv

Language-Guided Face Animation by Recurrent StyleGAN-based Generator

Aug 11, 2022
Via arXiv

Exploring Anchor-based Detection for Ego4D Natural Language Query

Aug 10, 2022
Via arXiv

GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language Pre-training

Aug 08, 2022
Via arXiv