Jiaya Jia

Modular Customization of Diffusion Models via Blockwise-Parameterized Low-Rank Adaptation

Mar 11, 2025 (via arXiv)

Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement

Mar 09, 2025 (via arXiv)

Effective LLM Knowledge Learning via Model Generalization

Mar 05, 2025 (via arXiv)

GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning

Mar 04, 2025 (via arXiv)

Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers

Jan 07, 2025 (via arXiv)

Generative Video Propagation

Dec 27, 2024 (via arXiv)

DreamOmni: Unified Image Generation and Editing

Dec 22, 2024 (via arXiv)

Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition

Dec 12, 2024 (via arXiv)

VisionZip: Longer is Better but Not Necessary in Vision Language Models

Dec 05, 2024 (via arXiv)

ControlNeXt: Powerful and Efficient Control for Image and Video Generation

Aug 15, 2024 (via arXiv)