Giang Do

Sparse Mixture of Experts as Unified Competitive Learning

Mar 29, 2025
Via arXiv

S2MoE: Robust Sparse Mixture of Experts via Stochastic Learning

Mar 29, 2025
Via arXiv

On the effectiveness of discrete representations in sparse mixture of experts

Nov 28, 2024
Via arXiv

SimSMoE: Solving Representational Collapse via Similarity Measure

Jun 22, 2024
Via arXiv

CompeteSMoE -- Effective Training of Sparse Mixture of Experts via Competition

Feb 04, 2024
Via arXiv

HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts

Dec 12, 2023
Via arXiv