Behnam Neyshabur

Shammie

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024
Via arXiv

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Dec 22, 2023
Via arXiv

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

Convexifying Transformers: Improving optimization and understanding of transformer networks

Nov 20, 2022
Via arXiv

Layer-Stack Temperature Scaling

Nov 18, 2022
Via arXiv

REPAIR: REnormalizing Permuted Activations for Interpolation Repair

Nov 15, 2022
Via arXiv

Teaching Algorithmic Reasoning via In-context Learning

Nov 15, 2022
Via arXiv

Revisiting Neural Scaling Laws in Language and Vision

Sep 13, 2022
Via arXiv

Exploring Length Generalization in Large Language Models

Jul 11, 2022
Via arXiv

Long Range Language Modeling via Gated State Spaces

Jul 02, 2022
Via arXiv