Jingjia Huang

Mug-STAN: Adapting Image-Language Pretrained Models for General Video Understanding

Nov 25, 2023

Associating Spatially-Consistent Grouping with Text-supervised Semantic Segmentation

Apr 03, 2023

Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring

Jan 26, 2023

Temporal Perceiving Video-Language Pre-training

Jan 18, 2023

Class Prototype-based Cleaner for Label Noise Learning

Dec 21, 2022

Knowledge Guided Bidirectional Attention Network for Human-Object Interaction Detection

Jul 16, 2022

Clover: Towards A Unified Video-Language Alignment and Fusion Model

Jul 16, 2022

ARMIN: Towards a More Efficient and Light-weight Recurrent Memory Network

Jun 28, 2019

A Self-Adaptive Proposal Model for Temporal Action Detection based on Reinforcement Learning

Jun 22, 2017