Abstract:Generative Engine Marketing (GEM) is an emerging ecosystem for monetizing generative engines, such as LLM-based chatbots, by seamlessly integrating relevant advertisements into their responses. At the core of GEM lies the generation and evaluation of ad-injected responses. However, existing benchmarks are not specifically designed for this purpose, which limits future research. To address this gap, we propose GEM-Bench, the first comprehensive benchmark for ad-injected response generation in GEM. GEM-Bench includes three curated datasets covering both chatbot and search scenarios, a metric ontology that captures multiple dimensions of user satisfaction and engagement, and several baseline solutions implemented within an extensible multi-agent framework. Our preliminary results indicate that, while simple prompt-based methods achieve reasonable engagement such as click-through rate, they often reduce user satisfaction. In contrast, approaches that insert ads based on pre-generated ad-free responses help mitigate this issue but introduce additional overhead. These findings highlight the need for future research on designing more effective and efficient solutions for generating ad-injected responses in GEM.
Abstract:Video generation models have achieved remarkable progress in the past year. The quality of AI video continues to improve, but at the cost of larger model size, increased data quantity, and greater demand for training compute. In this report, we present Open-Sora 2.0, a commercial-level video generation model trained for only $200k. With this model, we demonstrate that the cost of training a top-performing video generation model is highly controllable. We detail all techniques that contribute to this efficiency breakthrough, including data curation, model architecture, training strategy, and system optimization. According to human evaluation results and VBench scores, Open-Sora 2.0 is comparable to global leading video generation models including the open-source HunyuanVideo and the closed-source Runway Gen-3 Alpha. By making Open-Sora 2.0 fully open-source, we aim to democratize access to advanced video generation technology, fostering broader innovation and creativity in content creation. All resources are publicly available at: https://github.com/hpcaitech/Open-Sora.
Abstract:Integrating large language models (LLMs) and knowledge graphs (KGs) holds great promise for revolutionizing intelligent education, but challenges remain in achieving personalization, interactivity, and explainability. We propose FOKE, a Forest Of Knowledge and Education framework that synergizes foundation models, knowledge graphs, and prompt engineering to address these challenges. FOKE introduces three key innovations: (1) a hierarchical knowledge forest for structured domain knowledge representation; (2) a multi-dimensional user profiling mechanism for comprehensive learner modeling; and (3) an interactive prompt engineering scheme for generating precise and tailored learning guidance. We showcase FOKE's application in programming education, homework assessment, and learning path planning, demonstrating its effectiveness and practicality. Additionally, we implement Scholar Hero, a real-world instantiation of FOKE. Our research highlights the potential of integrating foundation models, knowledge graphs, and prompt engineering to revolutionize intelligent education practices, ultimately benefiting learners worldwide. FOKE provides a principled and unified approach to harnessing cutting-edge AI technologies for personalized, interactive, and explainable educational services, paving the way for further research and development in this critical direction.