Sho Takase

Spike No More: Stabilizing the Pre-training of Large Language Models

Dec 28, 2023
Sho Takase, Shun Kiyono, Sosuke Kobayashi, Jun Suzuki

Exploring Effectiveness of GPT-3 in Grammatical Error Correction: A Study on Performance and Controllability in Prompt-Based Methods

May 29, 2023
Mengsay Loem, Masahiro Kaneko, Sho Takase, Naoaki Okazaki

Nearest Neighbor Non-autoregressive Text Generation

Aug 26, 2022
Ayana Niwa, Sho Takase, Naoaki Okazaki

Are Neighbors Enough? Multi-Head Neural n-gram can be Alternative to Self-attention

Jul 27, 2022
Mengsay Loem, Sho Takase, Masahiro Kaneko, Naoaki Okazaki

On Layer Normalizations and Residual Connections in Transformers

Jun 01, 2022
Sho Takase, Shun Kiyono, Sosuke Kobayashi, Jun Suzuki

Single Model Ensemble for Subword Regularized Models in Low-Resource Machine Translation

Mar 25, 2022
Sho Takase, Tatsuya Hiraoka, Naoaki Okazaki

Interpretability for Language Learners Using Example-Based Grammatical Error Correction

Mar 14, 2022
Masahiro Kaneko, Sho Takase, Ayana Niwa, Naoaki Okazaki

ExtraPhrase: Efficient Data Augmentation for Abstractive Summarization

Jan 14, 2022
Mengsay Loem, Sho Takase, Masahiro Kaneko, Naoaki Okazaki

Joint Optimization of Tokenization and Downstream Model

May 26, 2021
Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki, Naoaki Okazaki

Lessons on Parameter Sharing across Layers in Transformers

Apr 13, 2021
Sho Takase, Shun Kiyono
