Max Vladymyrov

UC Merced

Contextually Guided Transformers via Low-Rank Adaptation

Jun 06, 2025

How new data permeates LLM knowledge and how to dilute it

Apr 13, 2025

Long Context In-Context Compression by Getting to the Gist of Gisting

Apr 11, 2025

Learning and Unlearning of Fabricated Knowledge in Language Models

Oct 29, 2024

Narrowing the Focus: Learned Optimizers for Pretrained Models

Aug 21, 2024

Linear Transformers are Versatile In-Context Learners

Feb 21, 2024

Uncovering mesa-optimization algorithms in Transformers

Sep 11, 2023

Continual Few-Shot Learning Using HyperTransformers

Jan 12, 2023

Training trajectories, mini-batch losses and the curious role of the learning rate

Jan 05, 2023

Transformers learn in-context by gradient descent

Dec 15, 2022