Yuexian Zou

SpatioTemporal Focus for Skeleton-based Action Recognition

Mar 31, 2022

Unsupervised Pre-training for Temporal Action Localization Tasks

Mar 25, 2022

Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model

Jan 06, 2022

Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI

Dec 30, 2021

Detect what you want: Target Sound Detection

Dec 19, 2021

CLIP Meets Video Captioners: Attribute-Aware Representation Learning Promotes Accurate Captioning

Nov 30, 2021

Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information

Oct 12, 2021

A Mutual learning framework for Few-shot Sound Event Detection

Oct 09, 2021

Towards Joint Intent Detection and Slot Filling via Higher-order Attention

Sep 22, 2021

On Pursuit of Designing Multi-modal Transformer for Video Grounding

Sep 13, 2021