Ruizhi Qiao

Multimodal Label Relevance Ranking via Reinforcement Learning

Jul 18, 2024
Via arXiv

Unified and Dynamic Graph for Temporal Character Grouping in Long Videos

Aug 29, 2023
Via arXiv

Coarse-to-Fine: Learning Compact Discriminative Representation for Single-Stage Image Retrieval

Aug 08, 2023
Via arXiv

D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation

Aug 08, 2023
Via arXiv

Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies

Mar 26, 2023
Via arXiv

See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval

Aug 26, 2022
Via arXiv

VLMAE: Vision-Language Masked Autoencoder

Aug 19, 2022
Via arXiv

Exploiting Feature Diversity for Make-up Temporal Video Grounding

Aug 12, 2022
Via arXiv

Open-Vocabulary Multi-Label Classification via Multi-modal Knowledge Transfer

Jul 05, 2022
Via arXiv

Scene Consistency Representation Learning for Video Scene Segmentation

May 11, 2022
Via arXiv