
Behrooz Ghorbani

Order Matters in the Presence of Dataset Imbalance for Multilingual Learning

Dec 11, 2023
Dami Choi, Derrick Xin, Hamid Dadkhahi, Justin Gilmer, Ankush Garg, Orhan Firat, Chih-Kuan Yeh, Andrew M. Dai, Behrooz Ghorbani

Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation

May 18, 2023
Markus Freitag, Behrooz Ghorbani, Patrick Fernandes

Scaling Laws for Multilingual Neural Machine Translation

Feb 19, 2023
Patrick Fernandes, Behrooz Ghorbani, Xavier Garcia, Markus Freitag, Orhan Firat

Binarized Neural Machine Translation

Feb 09, 2023
Yichi Zhang, Ankush Garg, Yuan Cao, Łukasz Lew, Behrooz Ghorbani, Zhiru Zhang, Orhan Firat

Do Current Multi-Task Optimization Methods in Deep Learning Even Help?

Sep 23, 2022
Derrick Xin, Behrooz Ghorbani, Ankush Garg, Orhan Firat, Justin Gilmer

Adaptive Gradient Methods at the Edge of Stability

Jul 29, 2022
Jeremy M. Cohen, Behrooz Ghorbani, Shankar Krishnan, Naman Agarwal, Sourabh Medapati, Michal Badura, Daniel Suo, David Cardoze, Zachary Nado, George E. Dahl, Justin Gilmer

Examining Scaling and Transfer of Language Model Architectures for Machine Translation

Feb 16, 2022
Biao Zhang, Behrooz Ghorbani, Ankur Bapna, Yong Cheng, Xavier Garcia, Jonathan Shen, Orhan Firat

Data Scaling Laws in NMT: The Effect of Noise and Architecture

Feb 04, 2022
Yamini Bansal, Behrooz Ghorbani, Ankush Garg, Biao Zhang, Maxim Krikun, Colin Cherry, Behnam Neyshabur, Orhan Firat

A Loss Curvature Perspective on Training Instability in Deep Learning

Oct 08, 2021
Justin Gilmer, Behrooz Ghorbani, Ankush Garg, Sneha Kudugunta, Behnam Neyshabur, David Cardoze, George Dahl, Zachary Nado, Orhan Firat
