Ji Xie

Seeing Sound, Hearing Sight: Uncovering Modality Bias and Conflict of AI models in Sound Localization

May 16, 2025
Via arXiv

In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer

Apr 29, 2025
Via arXiv

3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering

Jan 09, 2025
Via arXiv

3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation

Oct 16, 2024
Via arXiv