Yi Yang

The Hong Kong University of Science and Technology, Hong Kong SAR, China

Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding

Jan 22, 2023
Via arXiv

Temporal Perceiving Video-Language Pre-training

Jan 18, 2023
Via arXiv

DR-WLC: Dimensionality Reduction cognition for object detection and pose estimation by Watching, Learning and Checking

Jan 17, 2023
Via arXiv

Further Improving Weakly-supervised Object Localization via Causal Knowledge Distillation

Jan 03, 2023
Via arXiv

Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models

Dec 31, 2022
Via arXiv

StepNet: Spatial-temporal Part-aware Network for Sign Language Recognition

Dec 25, 2022
Via arXiv

MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering

Dec 19, 2022
Via arXiv

One is All: Bridging the Gap Between Neural Radiance Fields Architectures with Progressive Volume Distillation

Nov 30, 2022
Via arXiv

A Light-weight, Effective and Efficient Model for Label Aggregation in Crowdsourcing

Nov 19, 2022
Via arXiv

Stereo Image Rain Removal via Dual-View Mutual Attention

Nov 18, 2022
Via arXiv