Brian K Chen

Memory-Efficient LLM Training by Various-Grained Low-Rank Projection of Gradients

May 03, 2025
Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers

Jun 06, 2024