Unsupervised domain adaptation (UDA) has achieved unprecedented success in improving the cross-domain robustness of object detection models. However, existing UDA methods largely ignore the instantaneous data distribution during model learning, which could deteriorate the feature representation given large domain shift. In this work, we propose a Self-Guided Adaptation (SGA) model, target at aligning feature representation and transferring object detection models across domains while considering the instantaneous alignment difficulty. The core of SGA is to calculate "hardness" factors for sample pairs indicating domain distance in a kernel space. With the hardness factor, the proposed SGA adaptively indicates the importance of samples and assigns them different constrains. Indicated by hardness factors, Self-Guided Progressive Sampling (SPS) is implemented in an "easy-to-hard" way during model adaptation. Using multi-stage convolutional features, SGA is further aggregated to fully align hierarchical representations of detection models. Extensive experiments on commonly used benchmarks show that SGA improves the state-of-the-art methods with significant margins, while demonstrating the effectiveness on large domain shift.