Shiming Xiang

Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation

Oct 11, 2024

Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis

Sep 10, 2024

AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation

Aug 03, 2024

AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization

Jul 11, 2024

SegICL: A Universal In-context Learning Framework for Enhanced Segmentation in Medical Imaging

Apr 02, 2024

Weak Distribution Detectors Lead to Stronger Generalizability of Vision-Language Prompt Tuning

Mar 31, 2024

Reusable Architecture Growth for Continual Stereo Matching

Mar 30, 2024

Enhancing Visual Continual Learning with Language-Guided Supervision

Mar 24, 2024

Defying Imbalanced Forgetting in Class Incremental Learning

Mar 22, 2024

Compositional Kronecker Context Optimization for Vision-Language Models

Mar 18, 2024