Davis Yoshida

MAP's not dead yet: Uncovering true language model modes by conditioning away degeneracy
Nov 15, 2023
Davis Yoshida, Kartik Goyal, Kevin Gimpel

NF4 Isn't Information Theoretically Optimal (and that's Good)
Jun 14, 2023
Davis Yoshida

Reconsidering the Past: Optimizing Hidden States in Language Models
Dec 16, 2021
Davis Yoshida, Kevin Gimpel

Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size
Aug 16, 2020
Davis Yoshida, Allyson Ettinger, Kevin Gimpel
