Ali Ghodsi

Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)

Sep 16, 2023

SortedNet, a Place for Every Network and Every Network in its Place: Towards a Generalized Solution for Training Many-in-One Neural Networks

Sep 01, 2023

Recurrent Neural Networks and Long Short-Term Memory Networks: Tutorial and Survey

Apr 22, 2023

Improved knowledge distillation by utilizing backward pass knowledge in neural networks

Jan 27, 2023

Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging

Dec 16, 2022

Continuation KD: Improved Knowledge Distillation through the Lens of Continuation Optimization

Dec 12, 2022

DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation

Oct 14, 2022

Towards Understanding Label Regularization for Fine-tuning Pre-trained Language Models

May 25, 2022

Theoretical Connection between Locally Linear Embedding, Factor Analysis, and Probabilistic PCA

Mar 25, 2022

When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation

Mar 17, 2022