Picture for Kaixin Cai

Kaixin Cai

MagicSeg: Open-World Segmentation Pretraining via Counterfactural Diffusion-Based Auto-Generation

Add code
Mar 20, 2026
Viaarxiv icon

Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?

Add code
Mar 08, 2025
Viaarxiv icon

TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation

Add code
Apr 29, 2024
Figure 1 for TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation
Figure 2 for TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation
Figure 3 for TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation
Figure 4 for TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation
Viaarxiv icon

LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model

Add code
Mar 18, 2024
Figure 1 for LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
Figure 2 for LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
Figure 3 for LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
Figure 4 for LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
Viaarxiv icon

MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation

Add code
Aug 09, 2023
Figure 1 for MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation
Figure 2 for MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation
Figure 3 for MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation
Figure 4 for MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation
Viaarxiv icon