
Subhabrata Mukherjee

Orca: Progressive Learning from Complex Explanation Traces of GPT-4

Jun 05, 2023

GRILL: Grounded Vision-language Pre-training via Aligning Text and Image Regions

May 24, 2023

A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training

May 03, 2023

Accelerating Dataset Distillation via Model Augmentation

Dec 12, 2022

AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning

Nov 02, 2022

AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers

Oct 14, 2022

Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints

Oct 06, 2022

ADMoE: Anomaly Detection with Mixture-of-Experts from Noisy Labels

Aug 24, 2022

AdaMix: Mixture-of-Adapter for Parameter-efficient Tuning of Large Language Models

May 24, 2022

Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners

Apr 16, 2022