Rachel S. Y. Teo

Almost Asymptotically Optimal Active Clustering Through Pairwise Observations

Feb 05, 2026

The Blessing and Curse of Dimensionality in Safety Alignment

Jul 27, 2025

MoLEx: Mixture of Layer Experts for Finetuning with Sparse Upcycling

Mar 14, 2025

CAMEx: Curvature-aware Merging of Experts

Feb 26, 2025

Tight Clusters Make Specialized Experts

Feb 21, 2025

MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts

Oct 18, 2024

Unveiling the Hidden Structure of Self-Attention via Kernel Principal Component Analysis

Jun 19, 2024