We provide the optimal receive combining strategy for federated learning in multiple-input multiple-output (MIMO) systems. Our proposed algorithm allows each client to sparsify its gradients individually, which greatly improves performance in scenarios with heterogeneous (non-i.i.d.) training data. The proposed method outperforms the benchmark by a wide margin.