The advent of the Internet era has led to explosive growth in Electronic Health Records (EHR) over the past decades. EHR data can be regarded as a collection of clinical events, including laboratory results, medication records, physiological indicators, etc., which can be used for clinical outcome prediction tasks to support the construction of intelligent health systems. Learning patient representations from these clinical events for clinical outcome prediction is an important but challenging step. Most related studies transform the EHR data of a patient into a sequence of clinical events in temporal order and then use sequential models to learn patient representations for outcome prediction. However, clinical event sequences contain thousands of event types and complex temporal dependencies. We further observe that clinical events occurring within a short period are not constrained by any temporal order, whereas events over a long term are influenced by temporal dependencies. This multi-scale temporal property makes it difficult for traditional sequential models to capture both the short-term co-occurrence and the long-term temporal dependencies in clinical event sequences. In response to these challenges, this paper proposes a Multi-level Representation Model (MRM). MRM first uses a sparse attention mechanism to model the short-term co-occurrence, then uses interval-based event pooling to remove redundant information and reduce sequence length, and finally predicts clinical outcomes with a Long Short-Term Memory (LSTM) network. Experiments on real-world datasets indicate that the proposed model substantially improves the performance of clinical outcome prediction tasks using EHR data.
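To make the interval-based event pooling step more concrete, the following is a minimal sketch under stated assumptions, not the model's actual implementation. The abstract does not specify the pooling operator or how intervals are defined, so this sketch assumes fixed-width time intervals and element-wise max-pooling over event embeddings. The function name `interval_max_pool` and the `(timestamp, embedding)` event layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Hypothetical event layout: (timestamp in hours, event embedding vector).
Event = Tuple[float, Sequence[float]]


def interval_max_pool(events: List[Event], interval: float) -> List[List[float]]:
    """Group events into fixed-width time intervals and max-pool each group.

    Assumption: events inside one interval are treated as an unordered set,
    so an order-invariant operator (element-wise max) summarizes them.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")

    # Bucket embeddings by the index of the interval their timestamp falls in.
    buckets: Dict[int, List[Sequence[float]]] = defaultdict(list)
    for timestamp, vector in events:
        buckets[int(timestamp // interval)].append(vector)

    # Emit one pooled vector per non-empty interval, in temporal order.
    pooled: List[List[float]] = []
    for key in sorted(buckets):
        pooled.append([max(dims) for dims in zip(*buckets[key])])
    return pooled


# Three events collapse into two pooled steps with a 1-hour interval.
example_events = [(0.5, [1.0, 0.0]), (0.9, [0.0, 2.0]), (5.0, [3.0, 3.0])]
example_pooled = interval_max_pool(example_events, interval=1.0)
```

In this sketch the pooled sequence is shorter than the raw event sequence and preserves temporal order across intervals, which is the property a downstream sequential model such as an LSTM would rely on.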