Analyzing urban trajectories has become an important topic in data mining. How can we model human mobility, consisting of stays and travels, from raw trajectory data? How can we infer such a mobility model from a single trajectory? How can we further generalize the mobility inference to accommodate real-world trajectory data that is sparsely sampled over time? In this paper, based on formal and rigorous definitions of stay/travel mobility, we propose a single trajectory inference algorithm that exploits a generic long-tailed sparsity pattern in large-scale trajectory data. The algorithm guarantees 100\% precision in stay/travel inference, with a provable lower bound on the recall. Furthermore, we introduce an encoder-decoder learning architecture that admits multiple trajectories as inputs. The architecture is optimized for the mobility inference problem through customized embedding and learning mechanisms. Evaluations on three trajectory data sets covering 40 million urban users validate the performance guarantees of the proposed inference algorithm and demonstrate the superiority of our deep learning model over well-known sequence learning methods. On extremely sparse trajectories, the deep learning model achieves a 2$\times$ improvement in overall accuracy over the single trajectory inference algorithm, through its proven scalability and generalizability to large-scale, diverse training data.