Shaogang Gong

CoS: Chain-of-Shot Prompting for Long Video Understanding

Feb 10, 2025

INT: Instance-Specific Negative Mining for Task-Generic Promptable Segmentation

Jan 30, 2025

InvSeg: Test-Time Prompt Inversion for Semantic Segmentation

Oct 15, 2024

Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation

Aug 27, 2024

Few-Shot Image Generation by Conditional Relaxing Diffusion Inversion

Jul 09, 2024

SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding

Jul 06, 2024

MLLM as Video Narrator: Mitigating Modality Imbalance in Video Moment Retrieval

Jun 25, 2024

Hybrid-Learning Video Moment Retrieval across Multi-Domain Labels

Jun 03, 2024

Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge Transfer

May 29, 2024

Generative Video Diffusion for Unseen Cross-Domain Video Moment Retrieval

Jan 29, 2024