In the post-deep-learning era, the Transformer architecture has demonstrated powerful performance across large pre-trained models and a wide range of downstream tasks. However, its enormous computational demands have deterred many researchers. To reduce the complexity of attention models, numerous efforts have been devoted to designing more efficient methods. Among them, the State Space Model (SSM), a possible replacement for the self-attention-based Transformer, has attracted increasing attention in recent years. In this paper, we give the first comprehensive review of these works, along with experimental comparisons and analyses that demonstrate the features and advantages of SSMs. Specifically, we first describe the underlying principles in detail to help readers quickly grasp the key ideas of SSM. We then review existing SSMs and their applications, including natural language processing, computer vision, graphs, multi-modal and multi-media data, point clouds/event streams, time series, and other domains. In addition, we provide statistical comparisons and analyses of these models, which we hope will help readers understand the effectiveness of different structures on various tasks. Finally, we propose promising research directions to promote the development of both the theory and the applications of SSM. More related works will be continuously updated at the following GitHub repository: https://github.com/Event-AHU/Mamba_State_Space_Model_Paper_List.
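As background for the key idea the survey covers, the following is a minimal NumPy sketch of the discretized linear state-space recurrence that underlies SSM-based sequence models: h_t = Ā h_{t-1} + B̄ x_t and y_t = C h_t, with Ā and B̄ obtained by zero-order-hold discretization as in S4/Mamba-style formulations. The matrices, step size, and toy usage are illustrative assumptions, not code from any surveyed paper.

```python
import numpy as np
from scipy.linalg import expm

def ssm_scan(A, B, C, x, dt=1.0):
    """Linear recurrence y_t = C h_t, h_t = A_bar h_{t-1} + B_bar x_t.

    A: (n, n) state matrix, B: (n, 1) input matrix, C: (1, n) output
    matrix, x: (T,) scalar input sequence. Uses zero-order-hold (ZOH)
    discretization, the standard scheme in S4-style SSMs.
    """
    n = A.shape[0]
    A_bar = expm(dt * A)                                  # (n, n) discrete state matrix
    B_bar = np.linalg.solve(A, (A_bar - np.eye(n)) @ B)   # (n, 1) discrete input matrix
    h = np.zeros((n, 1))                                  # initial state
    y = np.empty(len(x))
    for t, x_t in enumerate(x):                           # O(T) sequential scan
        h = A_bar @ h + B_bar * x_t                       # state update
        y[t] = (C @ h).item()                             # readout
    return y

# Toy usage: a 4-state SSM with a stable (negative-real) diagonal state matrix.
rng = np.random.default_rng(0)
A = -np.diag(np.arange(1.0, 5.0))
B = rng.standard_normal((4, 1))
C = rng.standard_normal((1, 4))
print(ssm_scan(A, B, C, rng.standard_normal(32), dt=0.1)[:5])
```

In practice, SSM papers replace this sequential loop with a parallel scan or a convolutional view of the same recurrence to achieve sub-quadratic training cost; the loop above is kept only for clarity.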
The main task of personalized recommendation is to capture users' interests from their historical behaviors. Most recent advances in recommender systems focus on modeling users' preferences accurately with deep learning based approaches. Users' interests have two important properties: they are dynamic and evolve over time, and they exist at different resolutions, or more precisely temporal ranges, such as long-term and short-term preferences. Existing approaches either use Recurrent Neural Networks (RNNs) to model the drift in users' interests without considering different temporal ranges, or design two separate networks to model long-term and short-term preferences independently. This paper presents a multi-resolution interest fusion model (MRIF) that takes both properties of users' interests into consideration. The proposed model captures the dynamic changes in users' interests at different temporal ranges and provides an effective way to combine a group of multi-resolution user interests to make predictions. Experiments show that our method consistently outperforms state-of-the-art recommendation methods.
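The abstract does not spell out MRIF's architecture, so the following is only a hypothetical sketch of the general idea it describes: encode the behavior sequence with a recurrent network to track dynamic interests, pool the hidden states over windows of different lengths to obtain interest vectors at several temporal resolutions, and fuse them with attention against a candidate item before scoring. All module and parameter choices here (GRU encoder, window sizes, scaled dot-product attention) are assumptions for illustration, not the authors' exact model.

```python
import torch
import torch.nn as nn

class MultiResolutionFusion(nn.Module):
    """Hypothetical sketch of multi-resolution interest fusion.

    A GRU tracks dynamic interests; average-pooling its hidden states
    over trailing windows of different lengths yields short- to
    long-term interest vectors, which are attention-fused against the
    candidate item embedding to produce a relevance score.
    """
    def __init__(self, num_items, dim=64, windows=(5, 20, 80)):
        super().__init__()
        self.emb = nn.Embedding(num_items, dim)
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.windows = windows  # assumed temporal resolutions

    def forward(self, hist, item):
        # hist: (B, T) ids of historical items; item: (B,) candidate ids
        h, _ = self.gru(self.emb(hist))                 # (B, T, dim) dynamic interests
        interests = torch.stack(
            [h[:, -w:].mean(dim=1) for w in self.windows], dim=1
        )                                               # (B, R, dim); if T < w, all steps are used
        q = self.emb(item).unsqueeze(1)                 # (B, 1, dim) candidate as query
        att = torch.softmax(
            (q * interests).sum(-1) / interests.size(-1) ** 0.5, dim=-1
        )                                               # (B, R) per-resolution weights
        fused = (att.unsqueeze(-1) * interests).sum(1)  # (B, dim) fused interest
        return (fused * self.emb(item)).sum(-1)         # (B,) relevance scores

# Toy usage with random ids: batch of 8 users, 50-step histories.
model = MultiResolutionFusion(num_items=1000)
scores = model(torch.randint(0, 1000, (8, 50)), torch.randint(0, 1000, (8,)))
```

Letting the candidate item attend over the resolutions is one plausible way to realize the fusion the abstract mentions, since different items may depend more on short-term or long-term preferences; other fusion choices (gating, concatenation) would fit the same description.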