Xiaodan Liang

TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation

Apr 29, 2024

ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving

Apr 25, 2024

DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection

Apr 14, 2024

MLP Can Be A Good Transformer Learner

Apr 08, 2024

LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model

Mar 18, 2024

DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation

Mar 13, 2024

Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation

Mar 13, 2024

NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning

Mar 12, 2024

Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning

Mar 09, 2024

DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions

Mar 02, 2024