Shutao Li

Fellow, IEEE

Modeling the Label Distributions for Weakly-Supervised Semantic Segmentation

Mar 20, 2024

GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering

Feb 04, 2024

Hyperspectral Image Fusion via Logarithmic Low-rank Tensor Ring Decomposition

Oct 16, 2023

VPUFormer: Visual Prompt Unified Transformer for Interactive Image Segmentation

Jun 11, 2023

AdaptiveClick: Clicks-aware Transformer with Adaptive Focal Loss for Interactive Image Segmentation

May 07, 2023

LOGO-Former: Local-Global Spatio-Temporal Transformer for Dynamic Facial Expression Recognition

May 05, 2023

Learning to Locate Visual Answer in Video Corpus Using Question

Oct 11, 2022

Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation

Jul 05, 2022

Spatio-Temporal Transformer for Dynamic Facial Expression Recognition in the Wild

May 10, 2022

LingYi: Medical Conversational Question Answering System based on Multi-modal Knowledge Graphs

Apr 20, 2022