Ziyun Zeng

MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models

May 26, 2025
Via arXiv

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Apr 22, 2025
Via arXiv

OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting

Mar 12, 2025
Via arXiv

MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models

Oct 13, 2024
Via arXiv

PromptFix: You Prompt and We Fix the Photo

May 27, 2024
Via arXiv

GMMFormer: Gaussian-Mixture-Model based Transformer for Efficient Partially Relevant Video Retrieval

Oct 08, 2023
Via arXiv

Making LLaMA SEE and Draw with SEED Tokenizer

Oct 02, 2023
Via arXiv

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation

Aug 28, 2023
Via arXiv

MISSRec: Pre-training and Transferring Multi-modal Interest-aware Sequence Representation for Recommendation

Aug 22, 2023
Via arXiv

Planting a SEED of Vision in Large Language Model

Jul 16, 2023
Via arXiv