Samuel Thomas

Improving End-to-End Models for Set Prediction in Spoken Language Understanding

Jan 28, 2022
Hong-Kwang J. Kuo, Zoltán Tüske, Samuel Thomas, Brian Kingsbury, George Saon

Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval

Dec 08, 2021
Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Hilde Kuehne

Routing with Self-Attention for Multimodal Capsule Networks

Dec 01, 2021
Kevin Duarte, Brian Chen, Nina Shvetsova, Andrew Rouditchenko, Samuel Thomas, Alexander Liu, David Harwath, James Glass, Hilde Kuehne, Mubarak Shah

Cascaded Multilingual Audio-Visual Learning from Videos

Nov 08, 2021
Andrew Rouditchenko, Angie Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogerio Feris, Brian Kingsbury, Michael Picheny, James Glass

Integrating Dialog History into End-to-End Spoken Language Understanding Systems

Aug 18, 2021
Jatin Ganhotra, Samuel Thomas, Hong-Kwang J. Kuo, Sachindra Joshi, George Saon, Zoltán Tüske, Brian Kingsbury

Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos

May 05, 2021
Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie Boggust, Rameswar Panda, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Michael Picheny, Shih-Fu Chang

RNN Transducer Models For Spoken Language Understanding

Apr 08, 2021
Samuel Thomas, Hong-Kwang J. Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory

Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs

Apr 07, 2021
Sujeong Cha, Wangrui Hou, Hyun Jung, My Phung, Michael Picheny, Hong-Kwang Kuo, Samuel Thomas, Edmilson Morais

End-to-end spoken language understanding using transformer networks and self-supervised pre-trained features

Nov 16, 2020
Edmilson Morais, Hong-Kwang J. Kuo, Samuel Thomas, Zoltán Tüske, Brian Kingsbury
