Dara Bahri

Charformer: Fast Character Transformers via Gradient-based Subword Tokenization

Jun 23, 2021
Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler

Churn Reduction via Distillation

Jun 04, 2021
Heinrich Jiang, Harikrishna Narasimhan, Dara Bahri, Andrew Cotter, Afshin Rostamizadeh

Are Pre-trained Convolutions Better than Pre-trained Transformers?

May 07, 2021
Yi Tay, Mostafa Dehghani, Jai Gupta, Dara Bahri, Vamsi Aribandi, Zhen Qin, Donald Metzler

Rethinking Search: Making Experts out of Dilettantes

May 05, 2021
Donald Metzler, Yi Tay, Dara Bahri, Marc Najork

OmniNet: Omnidirectional Representations from Transformers

Mar 01, 2021
Yi Tay, Mostafa Dehghani, Vamsi Aribandi, Jai Gupta, Philip Pham, Zhen Qin, Dara Bahri, Da-Cheng Juan, Donald Metzler

Locally Adaptive Label Smoothing for Predictive Churn

Feb 09, 2021
Dara Bahri, Heinrich Jiang

Label Smoothed Embedding Hypothesis for Out-of-Distribution Detection

Feb 09, 2021
Dara Bahri, Heinrich Jiang, Yi Tay, Donald Metzler

StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling

Dec 15, 2020
Yikang Shen, Yi Tay, Che Zheng, Dara Bahri, Donald Metzler, Aaron Courville

Long Range Arena: A Benchmark for Efficient Transformers

Nov 08, 2020
Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler
