Yi Yang

Lana: A Language-Capable Navigator for Instruction Following and Generation

Mar 15, 2023
Xiaohan Wang, Wenguan Wang, Jiayi Shao, Yi Yang

Via arXiv

DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training

Mar 06, 2023
Wei Li, Linchao Zhu, Longyin Wen, Yi Yang

Via arXiv

Soft Prompt Guided Joint Learning for Cross-Domain Sentiment Analysis

Mar 01, 2023
Jingli Shi, Weihua Li, Quan Bai, Yi Yang, Jianhua Jiang

Via arXiv

Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding

Jan 22, 2023
Juncheng Li, Siliang Tang, Linchao Zhu, Wenqiao Zhang, Yi Yang, Tat-Seng Chua, Fei Wu, Yueting Zhuang

Via arXiv

Temporal Perceiving Video-Language Pre-training

Jan 18, 2023
Fan Ma, Xiaojie Jin, Heng Wang, Jingjia Huang, Linchao Zhu, Jiashi Feng, Yi Yang

Via arXiv

DR-WLC: Dimensionality Reduction cognition for object detection and pose estimation by Watching, Learning and Checking

Jan 17, 2023
Yu Gao, Xi Xu, Tianji Jiang, Siyuan Chen, Yi Yang, Yufeng Yue, Mengyin Fu

Via arXiv

Further Improving Weakly-supervised Object Localization via Causal Knowledge Distillation

Jan 03, 2023
Feifei Shao, Yawei Luo, Shengjian Wu, Qiyi Li, Fei Gao, Yi Yang, Jun Xiao

Via arXiv

Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models

Dec 31, 2022
Wenhao Wu, Xiaohan Wang, Haipeng Luo, Jingdong Wang, Yi Yang, Wanli Ouyang

Via arXiv

StepNet: Spatial-temporal Part-aware Network for Sign Language Recognition

Dec 25, 2022
Xiaolong Shen, Zhedong Zheng, Yi Yang

Via arXiv

MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering

Dec 19, 2022
Difei Gao, Luowei Zhou, Lei Ji, Linchao Zhu, Yi Yang, Mike Zheng Shou

Via arXiv