Xianbiao Qi

DNT: a Deeply Normalized Transformer that can be trained by Momentum SGD

Jul 23, 2025

MiniMax-Remover: Taming Bad Noise Helps Video Object Removal

May 30, 2025

Taming Transformer Without Using Learning Rate Warmup

May 28, 2025

Exploring a Principled Framework for Deep Subspace Clustering

Mar 21, 2025

Neural Normalized Cut: A Differential and Generalizable Approach for Spectral Clustering

Mar 12, 2025

Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model

Feb 24, 2025

Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists

Feb 10, 2025

Elucidating the design space of language models for image generation

Oct 21, 2024

BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities

Oct 18, 2024

CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility

Mar 18, 2024