Short-term road traffic prediction (STTP) is one of the most important modules in Intelligent Transportation Systems (ITS). However, network-level STTP still remains challenging due to the difficulties both in modeling the diverse traffic patterns and tacking high-dimensional time series with low latency. Therefore, a framework combining with a deep clustering (DeepCluster) module is developed for STTP at largescale networks in this paper. The DeepCluster module is proposed to supervise the representation learning in a visualized way from the large unlabeled dataset. More specifically, to fully exploit the traffic periodicity, the raw series is first split into a number of sub-series for triplets generation. The convolutional neural networks (CNNs) with triplet loss are utilized to extract the features of shape by transferring the series into visual images. The shape-based representations are then used for road segments clustering. Thereafter, motivated by the fact that the road segments in a group have similar patterns, a model sharing strategy is further proposed to build recurrent NNs (RNNs)-based predictions through a group-based model (GM), instead of individual-based model (IM) in which one model are built for one road exclusively. Our framework can not only significantly reduce the number of models and cost, but also increase the number of training data and the diversity of samples. In the end, we evaluate the proposed framework over the network of Liuli Bridge in Beijing. Experimental results show that the DeepCluster can effectively cluster the road segments and GM can achieve comparable performance against the IM with less number of models.