Oleksii Kuchaiev

Tied-Lora: Enhancing parameter efficiency of LoRA with weight tying

Nov 16, 2023
Adithya Renduchintala, Tugrul Konuk, Oleksii Kuchaiev

HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM

Nov 16, 2023
Zhilin Wang, Yi Dong, Jiaqi Zeng, Virginia Adams, Makesh Narsimhan Sreedhar, Daniel Egert, Olivier Delalleau, Jane Polak Scowcroft, Neel Kant, Aidan Swope, Oleksii Kuchaiev

SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF

Oct 09, 2023
Yi Dong, Zhilin Wang, Makesh Narsimhan Sreedhar, Xianchao Wu, Oleksii Kuchaiev

Leveraging Synthetic Targets for Machine Translation

May 07, 2023
Sarthak Mittal, Oleksii Hrinchuk, Oleksii Kuchaiev

Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study

Apr 13, 2023
Boxin Wang, Wei Ping, Peng Xu, Lawrence McAfee, Zihan Liu, Mohammad Shoeybi, Yi Dong, Oleksii Kuchaiev, Bo Li, Chaowei Xiao, Anima Anandkumar, Bryan Catanzaro

Finding the Right Recipe for Low Resource Domain Adaptation in Neural Machine Translation

Jun 02, 2022
Virginia Adams, Sandeep Subramanian, Mike Chrzanowski, Oleksii Hrinchuk, Oleksii Kuchaiev

NVIDIA NeMo Neural Machine Translation Systems for English-German and English-Russian News and Biomedical Tasks at WMT21

Nov 16, 2021
Sandeep Subramanian, Oleksii Hrinchuk, Virginia Adams, Oleksii Kuchaiev

SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

Apr 06, 2021
Patrick K. O'Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko

NeMo: a toolkit for building AI applications using Neural Modules

Sep 14, 2019
Oleksii Kuchaiev, Jason Li, Huyen Nguyen, Oleksii Hrinchuk, Ryan Leary, Boris Ginsburg, Samuel Kriman, Stanislav Beliaev, Vitaly Lavrukhin, Jack Cook, Patrice Castonguay, Mariya Popova, Jocelyn Huang, Jonathan M. Cohen

Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks

May 27, 2019
Boris Ginsburg, Patrice Castonguay, Oleksii Hrinchuk, Oleksii Kuchaiev, Vitaly Lavrukhin, Ryan Leary, Jason Li, Huyen Nguyen, Jonathan M. Cohen
