Harry Yang

Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control

Aug 12, 2025
Via arXiv

Enhancing Vector Quantization with Distributional Matching: A Theoretical and Empirical Study

Jun 18, 2025
Via arXiv

When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

Jun 05, 2025
Via arXiv

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

Apr 04, 2025
Via arXiv

VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention

Mar 20, 2025
Via arXiv

Temporal Regularization Makes Your Video Generator Stronger

Mar 19, 2025
Via arXiv

Niagara: Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View

Mar 16, 2025
Via arXiv

VideoMerge: Towards Training-free Long Video Generation

Mar 13, 2025
Via arXiv

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization

Mar 11, 2025
Via arXiv

VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer

Feb 09, 2025
Via arXiv