Huy Nguyen

Mixture of Experts Meets Prompt-Based Continual Learning
May 23, 2024

Statistical Advantages of Perturbing Cosine Router in Sparse Mixture of Experts
May 23, 2024

Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts
May 22, 2024

On Parameter Estimation in Deviated Gaussian Mixture of Experts
Feb 07, 2024

FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion
Feb 05, 2024

On Least Squares Estimation in Softmax Gating Mixture of Experts
Feb 05, 2024

CompeteSMoE -- Effective Training of Sparse Mixture of Experts via Competition
Feb 04, 2024

Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?
Jan 25, 2024

AG-ReID.v2: Bridging Aerial and Ground Views for Person Re-identification
Jan 05, 2024

A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts
Oct 22, 2023