Quantitative analysis of microscope videos often requires instance segmentation and tracking of cellular and subcellular objects. Traditional method is composed of two stages: (1) instance object segmentation of each frame, and (2) associate objects frame by frame. Recently, pixel embedding based deep learning approaches provide single stage holistic solutions to tackle instance segmentation and tracking simultaneously. However, the deep learning methods require consistent annotations not only spatially (for segmentation), but also temporally (for tracking). In computer vision, annotated training data with consistent segmentation and tracking is resource intensive, which can be more severe in microscopy imaging owing to (1) dense objects (e.g., overlapping or touching), and (2) high dynamics (e.g., irregular motion and mitosis). To alleviate the lack of such annotations in dynamics scenes, adversarial simulations have provided successful solutions in computer vision, such as using simulated environments (e.g., computer games) to train real-world self-driving systems. In this paper, we proposed an annotation-free synthetic instance segmentation and tracking (ASIST) method with adversarial simulation and single-stage pixel-embedding based learning. The contribution is three-fold: (1) the proposed method aggregates adversarial simulations and single-stage pixel-embedding based deep learning; (2) the method is assessed with both cellular (i.e., HeLa cells) and subcellular (i.e., microvilli) objects; and (3) to the best of our knowledge, this is the first study to explore annotation-free instance segmentation and tracking study for microscope videos. From the results, our ASIST method achieved promising results compared with fully supervised approaches.