Xilin Chen

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Mar 31, 2020

Mutual Information Maximization for Effective Lip Reading

Mar 13, 2020

Deformation Flow Based Two-Stream Network for Lip Reading

Mar 13, 2020

Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading

Mar 09, 2020

Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition

Mar 09, 2020

UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation

Feb 15, 2020

Emotion Recognition for In-the-wild Videos

Feb 13, 2020

$M^3$T: Multi-Modal Continuous Valence-Arousal Estimation in the Wild

Feb 07, 2020

Deep Heterogeneous Hashing for Face Video Retrieval

Nov 04, 2019

FCSR-GAN: Joint Face Completion and Super-resolution via Multi-task Learning

Nov 04, 2019