Muzhi Zhu

LLaDA2.1: Speeding Up Text Diffusion via Token Editing

Feb 09, 2026

Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality

Dec 08, 2025

Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO

May 27, 2025

Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

May 26, 2025

SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

Mar 11, 2025

DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks

Feb 25, 2025

A Simple Image Segmentation Framework via In-Context Examples

Oct 07, 2024

Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation

Oct 03, 2024

Generative Active Learning for Long-tailed Instance Segmentation

Jun 04, 2024

DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data

May 16, 2024