Xinchen Zhang

NoteIt: A System Converting Instructional Videos to Interactable Notes Through Multimodal Video Understanding

Aug 20, 2025, via arXiv

PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning

Jun 17, 2025, via arXiv

MMaDA: Multimodal Large Diffusion Language Models

May 21, 2025, via arXiv

Seed1.5-VL Technical Report

May 11, 2025, via arXiv

HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

Feb 17, 2025, via arXiv

Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening

Feb 17, 2025, via arXiv

Continual Learning with Strategic Selection and Forgetting for Network Intrusion Detection

Dec 20, 2024, via arXiv

IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Oct 09, 2024, via arXiv

The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations

Jul 19, 2024, via arXiv

RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models

Feb 20, 2024, via arXiv