Object detection is one of the major problems in computer vision, and has been extensively studied. Most of the existing detection works rely on labor-intensive supervision, such as ground truth bounding boxes of objects or at least image-level annotations. On the contrary, we propose an object detection method that does not require any form of human annotation on target tasks, by exploiting freely available web images. In order to facilitate effective knowledge transfer from web images, we introduce a multi-instance multi-label domain adaption learning framework with two key innovations. First of all, we propose an instance-level adversarial domain adaptation network with attention on foreground objects to transfer the object appearances from web domain to target domain. Second, to preserve the class-specific semantic structure of transferred object features, we propose a simultaneous transfer mechanism to transfer the supervision across domains through pseudo strong label generation. With our end-to-end framework that simultaneously learns a weakly supervised detector and transfers knowledge across domains, we achieved significant improvements over baseline methods on the benchmark datasets.