Accurately tracking and predicting behaviors of surrounding objects are key prerequisites for intelligent systems such as autonomous vehicles to achieve safe and high-quality decision making and motion planning. However, there still remain challenges for multi-target tracking due to object number fluctuation and occlusion. To overcome these challenges, we propose a constrained mixture sequential Monte Carlo (CMSMC) method in which a mixture representation is incorporated in the estimated posterior distribution to maintain multi-modality. Multiple targets can be tracked simultaneously within a unified framework without explicit data association between observations and tracking targets. The framework can incorporate an arbitrary prediction model as the implicit proposal distribution of the CMSMC method. An example in this paper is a learning-based model for hierarchical time-series prediction, which consists of a behavior recognition module and a state evolution module. Both modules in the proposed model are generic and flexible so as to be applied to a class of time-series prediction problems where behaviors can be separated into different levels. Finally, the proposed framework is applied to a numerical case study as well as a task of on-road vehicle tracking, behavior recognition, and prediction in highway scenarios. Instead of only focusing on forecasting trajectory of a single entity, we jointly predict continuous motions for interactive entities simultaneously. The proposed approaches are evaluated from multiple aspects, which demonstrate great potential for intelligent vehicular systems and traffic surveillance systems.