Mostofa Patwary

End-to-End Training of Neural Retrievers for Open-Domain Question Answering

Jan 02, 2021
Devendra Singh Sachan, Mostofa Patwary, Mohammad Shoeybi, Neel Kant, Wei Ping, William L Hamilton, Bryan Catanzaro

Via arXiv

Local Knowledge Powered Conversational Agents

Oct 20, 2020
Sashank Santhanam, Wei Ping, Raul Puri, Mohammad Shoeybi, Mostofa Patwary, Bryan Catanzaro

Via arXiv

BioMegatron: Larger Biomedical Domain Language Model

Oct 14, 2020
Hoo-Chang Shin, Yang Zhang, Evelina Bakhturina, Raul Puri, Mostofa Patwary, Mohammad Shoeybi, Raghav Mani

Via arXiv

MEGATRON-CNTRL: Controllable Story Generation with External Knowledge Using Large-Scale Language Models

Oct 02, 2020
Peng Xu, Mostofa Patwary, Mohammad Shoeybi, Raul Puri, Pascale Fung, Anima Anandkumar, Bryan Catanzaro

Via arXiv

Large Scale Multi-Actor Generative Dialog Modeling

May 13, 2020
Alex Boyd, Raul Puri, Mohammad Shoeybi, Mostofa Patwary, Bryan Catanzaro

Via arXiv

Training Question Answering Models From Synthetic Data

Feb 22, 2020
Raul Puri, Ryan Spring, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro

Via arXiv

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism

Oct 05, 2019
Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, Bryan Catanzaro

Via arXiv

DisCo: Physics-Based Unsupervised Discovery of Coherent Structures in Spatiotemporal Systems

Sep 25, 2019
Adam Rupe, Nalini Kumar, Vladislav Epifanov, Karthik Kashinath, Oleksandr Pavlyk, Frank Schlimbach, Mostofa Patwary, Sergey Maidanov, Victor Lee, Prabhat, James P. Crutchfield

Via arXiv