Jiji Tang

Character-Adapter: Prompt-Guided Region Control for High-Fidelity Character Customization

Jun 24, 2024
Via arXiv

Let Storytelling Tell Vivid Stories: An Expressive and Fluent Multimodal Storyteller

Mar 12, 2024
Via arXiv

Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks

Jan 23, 2024
Via arXiv

Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation

Aug 06, 2023
Via arXiv

Structure-CLIP: Enhance Multi-modal Language Representations with Structure Knowledge

May 06, 2023
Via arXiv

ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph

Jun 30, 2020
Via arXiv