Maxim Krikun

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024
Via arXiv

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

PaLM 2 Technical Report

May 17, 2023
Via arXiv

The unreasonable effectiveness of few-shot learning for machine translation

Feb 02, 2023
Via arXiv

Building Machine Translation Systems for the Next Thousand Languages

May 16, 2022
Via arXiv

LaMDA: Language Models for Dialog Applications

Feb 10, 2022
Via arXiv

Data Scaling Laws in NMT: The Effect of Noise and Architecture

Feb 04, 2022
Via arXiv

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

Dec 13, 2021
Via arXiv

Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference

Sep 24, 2021
Via arXiv

Scaling Laws for Neural Machine Translation

Sep 16, 2021
Via arXiv