Qingsong Xie

Dynamic-I2V: Exploring Image-to-Video Generation Models via Multimodal LLM

May 26, 2025

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment

May 22, 2025

Improved Visual-Spatial Reasoning via R1-Zero-Like Training

Apr 01, 2025

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Apr 01, 2025

H2VU-Benchmark: A Comprehensive Benchmark for Hierarchical Holistic Video Understanding

Mar 31, 2025

Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens

Mar 12, 2025

PainterNet: Adaptive Image Inpainting with Actual-Token Attention and Diverse Mask Control

Dec 02, 2024

Fine-tuning Diffusion Models for Enhancing Face Quality in Text-to-image Generation

Jun 24, 2024

MLCM: Multistep Consistency Distillation of Latent Diffusion Model

Jun 12, 2024

Instruct-ReID++: Towards Universal Purpose Instruction-Guided Person Re-identification

May 28, 2024