Ziyong Feng

UniVerse: Unleashing the Scene Prior of Video Diffusion Models for Robust Radiance Field Reconstruction

Oct 02, 2025

Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval

Sep 11, 2025

Region-based Cluster Discrimination for Visual Representation Learning

Jul 26, 2025

Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs

Apr 24, 2025

Decoupled Global-Local Alignment for Improving Compositional Understanding

Apr 23, 2025

MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space

Mar 19, 2025

Mocap-2-to-3: Lifting 2D Diffusion-Based Pretrained Models for 3D Motion Capture

Mar 05, 2025

RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm

Feb 18, 2025

ORID: Organ-Regional Information Driven Framework for Radiology Report Generation

Nov 20, 2024

Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension

Oct 18, 2024