Weinan Zhang

Multi-LLM-Agent Systems: Techniques and Business Perspectives

Nov 21, 2024

Yingxuan Yang, Qiuying Peng, Jun Wang, Weinan Zhang

Abstract: In the era of (multi-modal) large language models, most operational processes can be reformulated and reproduced using LLM agents. The LLM agents can perceive, control, and get feedback from the environment so as to accomplish the given tasks in an autonomous manner. Besides the environment-interaction property, the LLM agents can call various external tools to ease the task completion process. The tools can be regarded as predefined operational processes with private or real-time knowledge that does not exist in the parameters of LLMs. As a natural trend of development, the tools being called are becoming autonomous agents, so the full intelligent system turns out to be a multi-LLM-agent system (MLAS). This paper discusses the technical and business landscapes of MLAS. Compared to the previous single-LLM-agent system, an MLAS has the advantages of i) higher potential for task-solving performance, ii) higher flexibility for system changes, iii) preservation of proprietary data for each participating entity, and iv) feasibility of monetization for each entity. To support the ecosystem of MLAS, we provide a preliminary version of such an MLAS protocol considering technical requirements, data privacy, and business incentives. As such, MLAS would be a practical solution to achieve artificial collective intelligence in the near future.

Unstructured Text Enhanced Open-domain Dialogue System: A Systematic Survey

Nov 14, 2024

Longxuan Ma, Mingda Li, Weinan Zhang, Jiapeng Li, Ting Liu

Abstract: Incorporating external knowledge into dialogue generation has been proven to benefit the performance of an open-domain Dialogue System (DS), for example by generating informative or stylized responses and controlling conversation topics. In this article, we study the open-domain DS that uses unstructured text as external knowledge sources (Unstructured Text Enhanced Dialogue System, UTEDS). The existence of unstructured text entails distinctions between UTEDS and traditional data-driven DS, and we aim to analyze these differences. We first give the definitions of the UTEDS-related concepts, then summarize the recently released datasets and models. We categorize UTEDS into Retrieval and Generative models and introduce them from the perspective of model components. The retrieval models consist of Fusion, Matching, and Ranking modules, while the generative models comprise Dialogue and Knowledge Encoding, Knowledge Selection, and Response Generation modules. We further summarize the evaluation methods utilized in UTEDS and analyze the current models' performance. Finally, we discuss the future development trends of UTEDS, hoping to inspire new research in this field.

* ACM Transactions on Information Systems 40(1): 9:1-9:44 (2022)
* 45 pages, 3 Figures, 11 Tables

Prove Your Point!: Bringing Proof-Enhancement Principles to Argumentative Essay Generation

Oct 30, 2024

Ruiyu Xiao, Lei Wu, Yuhang Gou, Weinan Zhang, Ting Liu

Abstract: Argumentative essay generation (AEG) aims to generate complete texts on specific controversial topics or debates. Although current AEG methods can generate individual opinions, they often overlook the high-level connections between these opinions. This often leads to generated results that are mired in logical confusion and unable to prove their own arguments effectively. The generated essay may present evidence that contradicts the claims, or it may fail to assemble the claims into a logical flow. In this paper, we present a unified two-stage framework: Proof-Enhancement and Self-Annotation (PESA) for AEG with a focus on logical enhancement. Specifically, we first construct pseudo-labels for logical information, claims and grounds, using a large language model. We then propose a tree planning approach that introduces proof principles and ensures logical consistency. Extensive experimental results show that, benefiting from proof principle guidance, PESA generates argumentative essays with better logical validity and persuasiveness than strong baseline models.

* EMNLP 2024

Beyond Positive History: Re-ranking with List-level Hybrid Feedback

Oct 28, 2024

Muyan Weng, Yunjia Xi, Weiwen Liu, Bo Chen, Jianghao Lin, Ruiming Tang, Weinan Zhang, Yong Yu

Abstract: As the last stage of recommender systems, re-ranking generates a re-ordered list that aligns with the user's preference. However, previous works generally focus on item-level positive feedback as history (e.g., only clicked items) and ignore that users provide positive or negative feedback on items in the entire list. This list-level hybrid feedback can reveal users' holistic preferences and reflect users' comparison behavior patterns manifesting within a list. Such patterns could predict user behaviors on candidate lists, thus aiding better re-ranking. Despite these appealing benefits, extracting and integrating preferences and behavior patterns from list-level hybrid feedback into re-ranking multiple items remains challenging. To this end, we propose Re-ranking with List-level Hybrid Feedback (dubbed RELIFE). It captures users' preferences and behavior patterns with three modules: a Disentangled Interest Miner to disentangle the user's preferences into interests and disinterests, a Sequential Preference Mixer to learn users' entangled preferences considering the context of feedback, and a Comparison-aware Pattern Extractor to capture users' behavior patterns within each list. Moreover, for better integration of patterns, contrastive learning is adopted to align the behavior patterns of candidate and historical lists. Extensive experiments show that RELIFE significantly outperforms SOTA re-ranking baselines.

Learning ID-free Item Representation with Token Crossing for Multimodal Recommendation

Oct 25, 2024

Kangning Zhang, Jiarui Jin, Yingjie Qin, Ruilong Su, Jianghao Lin, Yong Yu, Weinan Zhang

Abstract: Current multimodal recommendation models have extensively explored the effective utilization of multimodal information; however, their reliance on ID embeddings remains a performance bottleneck. Even with the assistance of multimodal information, optimizing ID embeddings remains challenging for ID-based multimodal recommenders when interaction data is sparse. Furthermore, the unique nature of item-specific ID embeddings hinders information exchange among related items, and the space requirement of ID embeddings grows with the number of items. Based on these limitations, we propose an ID-free MultimOdal TOken Representation scheme named MOTOR that represents each item using learnable multimodal tokens and connects them through shared tokens. Specifically, we first employ product quantization to discretize each item's multimodal features (e.g., images, text) into discrete token IDs. We then interpret the token embeddings corresponding to these token IDs as implicit item features, introducing a new Token Cross Network to capture the implicit interaction patterns among these tokens. The resulting representations can replace the original ID embeddings and transform the original ID-based multimodal recommender into an ID-free system, without introducing any additional loss design. MOTOR reduces the overall space requirements of these models and facilitates information interaction among related items, while also significantly enhancing the model's recommendation capability. Extensive experiments on nine mainstream models demonstrate the significant performance improvement achieved by MOTOR, highlighting its effectiveness in enhancing multimodal recommendation systems.

* 11 pages, 6 figures
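
The abstract above mentions product quantization as the step that turns each item's multimodal features into discrete token IDs. The following is a minimal sketch of that general technique, not the MOTOR implementation: the function names (`kmeans`, `pq_fit`, `pq_encode`), the codebook sizes, and the random vectors standing in for image or text features are all assumptions made for illustration. Each feature vector is split into equal sub-vectors, a small k-means codebook is learned per sub-vector, and the index of the nearest centroid becomes that sub-vector's token ID.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every centroid: shape (n, k)
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def pq_fit(features, num_subspaces, codebook_size, seed=0):
    """Learn one codebook per subspace (feature dim must divide evenly)."""
    chunks = np.split(features, num_subspaces, axis=1)
    return [kmeans(c, codebook_size, seed=seed + i) for i, c in enumerate(chunks)]

def pq_encode(features, codebooks):
    """Map each item to one token ID per subspace: shape (n, num_subspaces)."""
    chunks = np.split(features, len(codebooks), axis=1)
    codes = []
    for chunk, codebook in zip(chunks, codebooks):
        dists = ((chunk[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        codes.append(dists.argmin(axis=1))
    return np.stack(codes, axis=1)

# Hypothetical data: 200 items with 16-dimensional features.
rng = np.random.default_rng(42)
features = rng.normal(size=(200, 16))
codebooks = pq_fit(features, num_subspaces=4, codebook_size=8)
token_ids = pq_encode(features, codebooks)
print(token_ids.shape)  # (200, 4): four token IDs per item
```

Because items with similar sub-vectors fall into the same centroid, they end up sharing token IDs, which is the property the abstract relies on when it describes connecting items through shared tokens.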

Building Dialogue Understanding Models for Low-resource Language Indonesian from Scratch

Oct 24, 2024

Donglin Di, Weinan Zhang, Yue Zhang, Fanglin Wang

Abstract: Using off-the-shelf resources from resource-rich languages to transfer knowledge to low-resource languages has attracted much attention recently. However, the requirements for reaching reliable performance, such as the scale of annotated data needed or an effective framework, lack clear guidance. To investigate the first question, we empirically investigate the cost-effectiveness of several methods for training intent classification and slot-filling models for Indonesian (ID) from scratch by utilizing English data. To address the second challenge, we propose a Bi-Confidence-Frequency Cross-Lingual transfer framework (BiCF), composed of "BiCF Mixing", "Latent Space Refinement" and "Joint Decoder", to tackle the lack of low-resource language dialogue data. Extensive experiments demonstrate that our framework performs reliably and cost-efficiently on different scales of manually annotated Indonesian data. We release a large-scale fine-labeled dialogue dataset (ID-WOZ) and ID-BERT of Indonesian for further research.

Unleashing the Potential of Multi-Channel Fusion in Retrieval for Personalized Recommendations

Oct 21, 2024

Junjie Huang, Jiarui Qin, Jianghao Lin, Ziming Feng, Yong Yu, Weinan Zhang

Abstract: Recommender systems (RS) are pivotal in managing information overload in modern digital services. A key challenge in RS is efficiently processing vast item pools to deliver highly personalized recommendations under strict latency constraints. Multi-stage cascade ranking addresses this by employing computationally efficient retrieval methods to cover diverse user interests, followed by more precise ranking models to refine the results. In the retrieval stage, multi-channel retrieval is often used to generate distinct item subsets from different candidate generators, leveraging the complementary strengths of these methods to maximize coverage. However, forwarding all retrieved items overwhelms downstream rankers, necessitating truncation. Despite advancements in individual retrieval methods, multi-channel fusion, the process of efficiently merging multi-channel retrieval results, remains underexplored. We are the first to identify and systematically investigate multi-channel fusion in the retrieval stage. Current industry practices often rely on heuristic approaches and manual designs, which often lead to suboptimal performance. Moreover, traditional gradient-based methods like SGD are unsuitable for this task due to the non-differentiable nature of the selection process. In this paper, we explore advanced channel fusion strategies by assigning systematically optimized weights to each channel. We utilize black-box optimization techniques, including the Cross Entropy Method and Bayesian Optimization for global weight optimization, alongside policy gradient-based approaches for personalized merging. Our methods enhance both personalization and flexibility, achieving significant performance improvements across multiple datasets and yielding substantial gains in real-world deployments, offering a scalable solution for optimizing multi-channel fusion in retrieval.

* 12 pages, 8 figures
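
The abstract above describes optimizing per-channel weights with black-box methods such as the Cross Entropy Method, because top-k selection is not differentiable. The sketch below illustrates that idea on synthetic data only. It is not the paper's implementation: the weighted-sum fusion rule, the recall@k objective, the function names (`fuse_top_k`, `recall_at_k`, `cem_optimize`), and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np

def fuse_top_k(weights, channel_scores, k):
    """Weighted sum of per-channel scores, then truncate to the top-k items."""
    fused = weights @ channel_scores  # shape (num_items,)
    return np.argsort(-fused)[:k]

def recall_at_k(weights, channel_scores, relevant, k):
    """Fraction of relevant items that survive truncation."""
    top = fuse_top_k(weights, channel_scores, k)
    return len(set(top.tolist()) & relevant) / len(relevant)

def cem_optimize(objective, dim, iters=15, pop=40, elite_frac=0.25, seed=0):
    """Cross Entropy Method: sample, keep the elite, refit the Gaussian."""
    rng = np.random.default_rng(seed)
    mean, std = np.ones(dim), np.ones(dim)
    best_w, best_val = mean.copy(), objective(mean)  # start from uniform weights
    n_elite = max(1, int(pop * elite_frac))
    for _ in range(iters):
        # abs() keeps the sampled channel weights non-negative
        samples = np.abs(rng.normal(mean, std, size=(pop, dim)))
        values = np.array([objective(w) for w in samples])
        elite = samples[np.argsort(-values)[:n_elite]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3
        if values.max() > best_val:
            best_val = float(values.max())
            best_w = samples[values.argmax()].copy()
    return best_w, best_val

# Synthetic setup: 3 channels, 100 items, the first 10 items are relevant.
rng = np.random.default_rng(1)
num_channels, num_items, k = 3, 100, 10
relevant = set(range(10))
channel_scores = rng.random((num_channels, num_items))
channel_scores[0, :10] += 1.0  # only channel 0 carries a relevance signal

uniform_recall = recall_at_k(np.ones(num_channels), channel_scores, relevant, k)
best_w, best_recall = cem_optimize(
    lambda w: recall_at_k(w, channel_scores, relevant, k), dim=num_channels
)
print("uniform:", uniform_recall, "optimized:", best_recall, "weights:", best_w)
```

The objective is only ever evaluated, never differentiated, which is why this family of methods fits the truncation step. The loop tracks the best sample seen so far, so the returned recall is never lower than that of the uniform starting weights.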

ELF-Gym: Evaluating Large Language Models Generated Features for Tabular Prediction

Oct 13, 2024

Yanlin Zhang, Ning Li, Quan Gan, Weinan Zhang, David Wipf, Minjie Wang

Abstract: Crafting effective features is a crucial yet labor-intensive and domain-specific task within machine learning pipelines. Fortunately, recent advancements in Large Language Models (LLMs) have shown promise in automating various data science tasks, including feature engineering. But despite this potential, evaluations thus far are primarily based on the end performance of a complete ML pipeline, providing limited insight into precisely how LLMs behave relative to human experts in feature engineering. To address this gap, we propose ELF-Gym, a framework for Evaluating LLM-generated Features. We curated a new dataset from historical Kaggle competitions, including 251 "golden" features used by top-performing teams. ELF-Gym then quantitatively evaluates LLM-generated features by measuring their impact on downstream model performance as well as their alignment with expert-crafted features through semantic and functional similarity assessments. This approach provides a more comprehensive evaluation of disparities between LLMs and human experts, while offering valuable insights into specific areas where LLMs may have room for improvement. For example, using ELF-Gym we empirically demonstrate that, in the best-case scenario, LLMs can semantically capture approximately 56% of the golden features, but at the more demanding implementation level this overlap drops to 13%. Moreover, in other cases LLMs may fail completely, particularly on datasets that require complex features, indicating broad potential pathways for improvement.

Agentic Information Retrieval

Oct 13, 2024

Weinan Zhang, Junwei Liao, Ning Li, Kounianhua Du

Abstract: What will information entry look like in the next generation of digital products? Since the 1970s, user access to relevant information has relied on domain-specific architectures of information retrieval (IR). Over the past two decades, the advent of modern IR systems, including web search engines and personalized recommender systems, has greatly improved the efficiency of retrieving relevant information from vast data corpora. However, the core paradigm of these IR systems remains largely unchanged, relying on filtering a predefined set of candidate items. Since 2022, breakthroughs in large language models (LLMs) have begun transforming how information is accessed, establishing a new technical paradigm. In this position paper, we introduce Agentic Information Retrieval (Agentic IR), a novel IR paradigm shaped by the capabilities of LLM agents. Agentic IR expands the scope of accessible tasks and leverages a suite of new techniques to redefine information retrieval. We discuss three types of cutting-edge applications of agentic IR and the challenges faced. We propose that agentic IR holds promise for generating innovative applications, potentially becoming a central information entry point in future digital ecosystems.

* 11 pages, position paper

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Oct 12, 2024

Jun Wang, Meng Fang, Ziyu Wan, Muning Wen, Jiachen Zhu, Anjie Liu, Ziqin Gong, Yan Song, Lei Chen, Lionel M. Ni (+3 more)

Abstract: In this technical report, we introduce OpenR, an open-source framework designed to integrate key components for enhancing the reasoning capabilities of large language models (LLMs). OpenR unifies data acquisition, reinforcement learning training (both online and offline), and non-autoregressive decoding into a cohesive software platform. Our goal is to establish an open-source platform and community to accelerate the development of LLM reasoning. Inspired by the success of OpenAI's o1 model, which demonstrated improved reasoning abilities through step-by-step reasoning and reinforcement learning, OpenR integrates test-time compute, reinforcement learning, and process supervision to improve reasoning in LLMs. Our work is the first to provide an open-source framework that explores the core techniques of OpenAI's o1 model with reinforcement learning, achieving advanced reasoning capabilities beyond traditional autoregressive methods. We demonstrate the efficacy of OpenR by evaluating it on the MATH dataset, utilising publicly available data and search methods. Our initial experiments confirm substantial gains, with relative improvements in reasoning and performance driven by test-time computation and reinforcement learning through process reward models. The OpenR framework, including code, models, and datasets, is accessible at https://openreasoner.github.io.
