Mainstream approaches for unsupervised domain adaptation (UDA) learn domain-invariant representations to bridge domain gap. More recently, self-training has been gaining momentum in UDA. Originated from semi-supervised learning, self-training uses unlabeled data efficiently by training on pseudo-labels. However, as corroborated in this work, under distributional shift in UDA, the pseudo-labels can be unreliable in terms of their large discrepancy from the ground truth labels. Thereby, we propose Cycle Self-Training (CST), a principled self-training algorithm that enforces pseudo-labels to generalize across domains. In the forward step, CST generates target pseudo-labels with a source-trained classifier. In the reverse step, CST trains a target classifier using target pseudo-labels, and then updates the shared representations to make the target classifier perform well on the source data. We introduce the Tsallis entropy, a novel regularization to improve the quality of target pseudo-labels. On quadratic neural networks, we prove that CST recovers target ground truth, while both invariant feature learning and vanilla self-training fail. Empirical results indicate that CST significantly improves over prior state-of-the-arts in standard UDA benchmarks across visual recognition and sentiment analysis tasks.