This paper studies how well generative adversarial networks (GANs) learn probability distributions from finite samples. Our main results estimate the convergence rates of GANs under a collection of integral probability metrics defined through H\"older classes, which includes the Wasserstein distance as a special case. We also show that, when the network architectures are chosen properly, GANs are able to adaptively learn data distributions that have low-dimensional structure or H\"older densities. In particular, for distributions concentrated around a low-dimensional set, we prove that the learning rates of GANs depend not on the high ambient dimension but on the lower intrinsic dimension. Our analysis is based on a new oracle inequality that decomposes the estimation error into the generator and discriminator approximation errors and the statistical error, which may be of independent interest.