Abstract:Recent breakthroughs in generative AI have transformed recommender systems through end-to-end generation. OneRec reformulates recommendation as an autoregressive generation task, achieving high Model FLOPs Utilization. While OneRec-V1 has shown significant empirical success in real-world deployment, two critical challenges hinder its scalability and performance: (1) inefficient computational allocation where 97.66% of resources are consumed by sequence encoding rather than generation, and (2) limitations in reinforcement learning relying solely on reward models. To address these challenges, we propose OneRec-V2, featuring: (1) Lazy Decoder-Only Architecture: Eliminates encoder bottlenecks, reducing total computation by 94% and training resources by 90%, enabling successful scaling to 8B parameters. (2) Preference Alignment with Real-World User Interactions: Incorporates Duration-Aware Reward Shaping and Adaptive Ratio Clipping to better align with user preferences using real-world feedback. Extensive A/B tests on Kuaishou demonstrate OneRec-V2's effectiveness, improving App Stay Time by 0.467%/0.741% while balancing multi-objective recommendations. This work advances generative recommendation scalability and alignment with real-world feedback, representing a step forward in the development of end-to-end recommender systems.




Abstract:Recommender systems have been widely used in various large-scale user-oriented platforms for many years. However, compared to the rapid developments in the AI community, recommendation systems have not achieved a breakthrough in recent years. For instance, they still rely on a multi-stage cascaded architecture rather than an end-to-end approach, leading to computational fragmentation and optimization inconsistencies, and hindering the effective application of key breakthrough technologies from the AI community in recommendation scenarios. To address these issues, we propose OneRec, which reshapes the recommendation system through an end-to-end generative approach and achieves promising results. Firstly, we have enhanced the computational FLOPs of the current recommendation model by 10 $\times$ and have identified the scaling laws for recommendations within certain boundaries. Secondly, reinforcement learning techniques, previously difficult to apply for optimizing recommendations, show significant potential in this framework. Lastly, through infrastructure optimizations, we have achieved 23.7% and 28.8% Model FLOPs Utilization (MFU) on flagship GPUs during training and inference, respectively, aligning closely with the LLM community. This architecture significantly reduces communication and storage overhead, resulting in operating expense that is only 10.6% of traditional recommendation pipelines. Deployed in Kuaishou/Kuaishou Lite APP, it handles 25% of total queries per second, enhancing overall App Stay Time by 0.54% and 1.24%, respectively. Additionally, we have observed significant increases in metrics such as 7-day Lifetime, which is a crucial indicator of recommendation experience. We also provide practical lessons and insights derived from developing, optimizing, and maintaining a production-scale recommendation system with significant real-world impact.




Abstract:The development of noninvasive brain imaging such as resting-state functional magnetic resonance imaging (rs-fMRI) and its combination with AI algorithm provides a promising solution for the early diagnosis of Autism spectrum disorder (ASD). However, the performance of the current ASD classification based on rs-fMRI still needs to be improved. This paper introduces a classification framework to aid ASD diagnosis based on rs-fMRI. In the framework, we proposed a novel filter feature selection method based on the difference between step distribution curves (DSDC) to select remarkable functional connectivities (FCs) and utilized a multilayer perceptron (MLP) which was pretrained by a simplified Variational Autoencoder (VAE) for classification. We also designed a pipeline consisting of a normalization procedure and a modified hyperbolic tangent (tanh) activation function to replace the original tanh function, further improving the model accuracy. Our model was evaluated by 10 times 10-fold cross-validation and achieved an average accuracy of 78.12%, outperforming the state-of-the-art methods reported on the same dataset. Given the importance of sensitivity and specificity in disease diagnosis, two constraints were designed in our model which can improve the model's sensitivity and specificity by up to 9.32% and 10.21%, respectively. The added constraints allow our model to handle different application scenarios and can be used broadly.