Hongyuan Zhang

Safeguarding Text-to-Image Generative Models Against Unauthorized Knowledge Distillation

May 21, 2026

GRPO-TTA: Test-Time Visual Tuning for Vision-Language Models via GRPO-Driven Reinforcement Learning

May 05, 2026

Batch Loss Score for Dynamic Data Pruning

Apr 06, 2026

VPWEM: Non-Markovian Visuomotor Policy with Working and Episodic Memory

Mar 05, 2026

Crab$^{+}$: A Scalable and Unified Audio-Visual Scene Understanding Model with Explicit Cooperation

Mar 04, 2026

AHAP: Reconstructing Arbitrary Humans from Arbitrary Perspectives with Geometric Priors

Feb 27, 2026

ERNIE 5.0 Technical Report

Feb 04, 2026

MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning

Dec 29, 2025

ViewMask-1-to-3: Multi-View Consistent Image Generation via Multimodal Diffusion Models

Dec 16, 2025

GRPO-RM: Fine-Tuning Representation Models via GRPO-Driven Reinforcement Learning

Nov 19, 2025