We present a set-based machine learning framework that infers posterior distributions of galaxy cluster masses from projected galaxy dynamics. Our model combines Deep Sets and conditional normalizing flows to incorporate both positional and velocity information of member galaxies to predict residual corrections to the $M$-$\sigma$ relation for improved interpretability. Trained on the Uchuu-UniverseMachine simulation, our approach significantly reduces scatter and provides well-calibrated uncertainties across the full mass range compared to traditional dynamical estimates.