In many machine learning applications, the training data can contain highly sensitive personal information. Training large-scale deep models that are guaranteed not to leak sensitive information while not compromising their accuracy has been a significant challenge. In this work, we study the multi-class classification setting where the labels are considered sensitive and ought to be protected. We propose a new algorithm for training deep neural networks with label differential privacy, and run evaluations on several datasets. For Fashion MNIST and CIFAR-10, we demonstrate that our algorithm achieves significantly higher accuracy than the state-of-the-art, and in some regimes comes close to the non-private baselines. We also provide non-trivial training results for the the challenging CIFAR-100 dataset. We complement our algorithm with theoretical findings showing that in the setting of convex empirical risk minimization, the sample complexity of training with label differential privacy is dimension-independent, which is in contrast to vanilla differential privacy.