Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yuan Qi

Fudan University

Thought-Like-Pro: Enhancing Reasoning of Large Language Models through Self-Driven Prolog-based Chain-of-Though

Jul 18, 2024

Xiaoyu Tan, Yongxin Deng, Xihe Qiu, Weidi Xu, Chao Qu, Wei Chu, Yinghui Xu, Yuan Qi

Figure 1 for Thought-Like-Pro: Enhancing Reasoning of Large Language Models through Self-Driven Prolog-based Chain-of-Though

Figure 2 for Thought-Like-Pro: Enhancing Reasoning of Large Language Models through Self-Driven Prolog-based Chain-of-Though

Abstract:Large language models (LLMs) have shown exceptional performance as general-purpose assistants, excelling across a variety of reasoning tasks. This achievement represents a significant step toward achieving artificial general intelligence (AGI). Despite these advancements, the effectiveness of LLMs often hinges on the specific prompting strategies employed, and there remains a lack of a robust framework to facilitate learning and generalization across diverse reasoning tasks. To address these challenges, we introduce a novel learning framework, THOUGHT-LIKE-PRO In this framework, we utilize imitation learning to imitate the Chain-of-Thought (CoT) process which is verified and translated from reasoning trajectories generated by a symbolic Prolog logic engine. This framework proceeds in a self-driven manner, that enables LLMs to formulate rules and statements from given instructions and leverage the symbolic Prolog engine to derive results. Subsequently, LLMs convert Prolog-derived successive reasoning trajectories into natural language CoT for imitation learning. Our empirical findings indicate that our proposed approach substantially enhances the reasoning abilities of LLMs and demonstrates robust generalization across out-of-distribution reasoning tasks.

Via

Access Paper or Ask Questions

Struct-X: Enhancing Large Language Models Reasoning with Structured Data

Jul 17, 2024

Xiaoyu Tan, Haoyu Wang, Xihe Qiu, Yuan Cheng, Yinghui Xu, Wei Chu, Yuan Qi

Abstract:Structured data, rich in logical and relational information, has the potential to enhance the reasoning abilities of large language models (LLMs). Still, its integration poses a challenge due to the risk of overwhelming LLMs with excessive tokens and irrelevant context information. To address this, we propose Struct-X, a novel framework that operates through five key phases: ``read-model-fill-reflect-reason'' efficiently enabling LLMs to utilize structured data. It begins by encoding structured data into a topological space using graph embeddings, followed by filling in missing entity information with knowledge retrieval modules, and filtering out irrelevant tokens via a self-supervised module. The final phase involves constructing a topological network with selected tokens to further reduce the total token length for more effective LLM inference. Additionally, Struct-X includes an Auxiliary Module trained to generate prompts, aiding LLMs in analyzing structured data. Extensive experiments on benchmarks, including the knowledge graph question-answer task and the long document reading comprehension task, show that Struct-X notably improves LLM reasoning, demonstrating the effectiveness of structured data augmentation in improving LLM inference with complex input context.

Via

Access Paper or Ask Questions

Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs

May 31, 2024

Langzhang Liang, Sunwoo Kim, Kijung Shin, Zenglin Xu, Shirui Pan, Yuan Qi

Figure 1 for Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs

Figure 2 for Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs

Figure 3 for Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs

Figure 4 for Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs

Abstract:Graph Neural Networks (GNNs) have gained significant attention as a powerful modeling and inference method, especially for homophilic graph-structured data. To empower GNNs in heterophilic graphs, where adjacent nodes exhibit dissimilar labels or features, Signed Message Passing (SMP) has been widely adopted. However, there is a lack of theoretical and empirical analysis regarding the limitations of SMP. In this work, we unveil some potential pitfalls of SMP and their remedies. We first identify two limitations of SMP: undesirable representation update for multi-hop neighbors and vulnerability against oversmoothing issues. To overcome these challenges, we propose a novel message passing function called Multiset to Multiset GNN(M2M-GNN). Our theoretical analyses and extensive experiments demonstrate that M2M-GNN effectively alleviates the aforementioned limitations of SMP, yielding superior performance in comparison

* Published as a conference paper at ICML 2024

Via

Access Paper or Ask Questions

AI2Apps: A Visual IDE for Building LLM-based AI Agent Applications

Apr 07, 2024

Xin Pang, Zhucong Li, Jiaxiang Chen, Yuan Cheng, Yinghui Xu, Yuan Qi

Figure 1 for AI2Apps: A Visual IDE for Building LLM-based AI Agent Applications

Figure 2 for AI2Apps: A Visual IDE for Building LLM-based AI Agent Applications

Figure 3 for AI2Apps: A Visual IDE for Building LLM-based AI Agent Applications

Figure 4 for AI2Apps: A Visual IDE for Building LLM-based AI Agent Applications

Abstract:We introduce AI2Apps, a Visual Integrated Development Environment (Visual IDE) with full-cycle capabilities that accelerates developers to build deployable LLM-based AI agent Applications. This Visual IDE prioritizes both the Integrity of its development tools and the Visuality of its components, ensuring a smooth and efficient building experience.On one hand, AI2Apps integrates a comprehensive development toolkit ranging from a prototyping canvas and AI-assisted code editor to agent debugger, management system, and deployment tools all within a web-based graphical user interface. On the other hand, AI2Apps visualizes reusable front-end and back-end code as intuitive drag-and-drop components. Furthermore, a plugin system named AI2Apps Extension (AAE) is designed for Extensibility, showcasing how a new plugin with 20 components enables web agent to mimic human-like browsing behavior. Our case study demonstrates substantial efficiency improvements, with AI2Apps reducing token consumption and API calls when debugging a specific sophisticated multimodal agent by approximately 90% and 80%, respectively. The AI2Apps, including an online demo, open-source code, and a screencast video, is now publicly accessible.

Via

Access Paper or Ask Questions

PDETime: Rethinking Long-Term Multivariate Time Series Forecasting from the perspective of partial differential equations

Feb 25, 2024

Shiyi Qi, Zenglin Xu, Yiduo Li, Liangjian Wen, Qingsong Wen, Qifan Wang, Yuan Qi

Figure 1 for PDETime: Rethinking Long-Term Multivariate Time Series Forecasting from the perspective of partial differential equations

Figure 2 for PDETime: Rethinking Long-Term Multivariate Time Series Forecasting from the perspective of partial differential equations

Figure 3 for PDETime: Rethinking Long-Term Multivariate Time Series Forecasting from the perspective of partial differential equations

Figure 4 for PDETime: Rethinking Long-Term Multivariate Time Series Forecasting from the perspective of partial differential equations

Abstract:Recent advancements in deep learning have led to the development of various models for long-term multivariate time-series forecasting (LMTF), many of which have shown promising results. Generally, the focus has been on historical-value-based models, which rely on past observations to predict future series. Notably, a new trend has emerged with time-index-based models, offering a more nuanced understanding of the continuous dynamics underlying time series. Unlike these two types of models that aggregate the information of spatial domains or temporal domains, in this paper, we consider multivariate time series as spatiotemporal data regularly sampled from a continuous dynamical system, which can be represented by partial differential equations (PDEs), with the spatial domain being fixed. Building on this perspective, we present PDETime, a novel LMTF model inspired by the principles of Neural PDE solvers, following the encoding-integration-decoding operations. Our extensive experimentation across seven diverse real-world LMTF datasets reveals that PDETime not only adapts effectively to the intrinsic spatiotemporal nature of the data but also sets new benchmarks, achieving state-of-the-art results

Via

Access Paper or Ask Questions

FuXi-S2S: An accurate machine learning model for global subseasonal forecasts

Dec 15, 2023

Lei Chen, Xiaohui Zhong, Jie Wu, Deliang Chen, Shangping Xie, Qingchen Chao, Chensen Lin, Zixin Hu, Bo Lu, Hao Li(+1 more)

Figure 1 for FuXi-S2S: An accurate machine learning model for global subseasonal forecasts

Figure 2 for FuXi-S2S: An accurate machine learning model for global subseasonal forecasts

Figure 3 for FuXi-S2S: An accurate machine learning model for global subseasonal forecasts

Figure 4 for FuXi-S2S: An accurate machine learning model for global subseasonal forecasts

Abstract:Skillful subseasonal forecasts beyond 2 weeks are crucial for a wide range of applications across various sectors of society. Recently, state-of-the-art machine learning based weather forecasting models have made significant advancements, outperforming the high-resolution forecast (HRES) from the European Centre for Medium-Range Weather Forecasts (ECMWF). However, the full potential of machine learning models in subseasonal forecasts has yet to be fully explored. In this study, we introduce FuXi Subseasonal-to-Seasonal (FuXi-S2S), a machine learning based subseasonal forecasting model that provides global daily mean forecasts up to 42 days, covering 5 upper-air atmospheric variables at 13 pressure levels and 11 surface variables. FuXi-S2S integrates an enhanced FuXi base model with a perturbation module for flow-dependent perturbations in hidden features, and incorporates Perlin noise to perturb initial conditions. The model is developed using 72 years of daily statistics from ECMWF ERA5 reanalysis data. When compared to the ECMWF Subseasonal-to-Seasonal (S2S) reforecasts, the FuXi-S2S forecasts demonstrate superior deterministic and ensemble forecasts for total precipitation (TP), outgoing longwave radiation (OLR), and geopotential at 500 hPa (Z500). Although it shows slightly inferior performance in predicting 2-meter temperature (T2M), it has clear advantages over land area. Regarding the extreme forecasts, FuXi-S2S outperforms ECMWF S2S globally for TP. Furthermore, FuXi-S2S forecasts surpass the ECMWF S2S reforecasts in predicting the Madden Julian Oscillation (MJO), a key source of subseasonal predictability. They extend the skillful prediction of MJO from 30 days to 36 days.

Via

Access Paper or Ask Questions

SemiSAM: Exploring SAM for Enhancing Semi-Supervised Medical Image Segmentation with Extremely Limited Annotations

Dec 11, 2023

Yichi Zhang, Yuan Cheng, Yuan Qi

Figure 1 for SemiSAM: Exploring SAM for Enhancing Semi-Supervised Medical Image Segmentation with Extremely Limited Annotations

Figure 2 for SemiSAM: Exploring SAM for Enhancing Semi-Supervised Medical Image Segmentation with Extremely Limited Annotations

Figure 3 for SemiSAM: Exploring SAM for Enhancing Semi-Supervised Medical Image Segmentation with Extremely Limited Annotations

Abstract:Semi-supervised learning has attracted much attention due to its less dependence on acquiring abundant annotations from experts compared to fully supervised methods, which is especially important for medical image segmentation which typically requires intensive pixel/voxel-wise labeling by domain experts. Although semi-supervised methods can improve the performance by utilizing unlabeled data, there are still gaps between fully supervised methods under extremely limited annotation scenarios. In this paper, we propose a simple yet efficient strategy to explore the usage of the Segment Anything Model (SAM) for enhancing semi-supervised medical image segmentation. Concretely, the segmentation model trained with domain knowledge provides information for localization and generating input prompts to the SAM. Then the generated pseudo-labels of SAM are utilized as additional supervision to assist in the learning procedure of the semi-supervised framework. Experimental results demonstrate that SAM's assistance significantly enhances the performance of existing semi-supervised frameworks, especially when only one or a few labeled images are available.

Via

Access Paper or Ask Questions

PILLOW: Enhancing Efficient Instruction Fine-tuning via Prompt Matching

Dec 09, 2023

Zhenting Qi, Xiaoyu Tan, Shaojie Shi, Chao Qu, Yinghui Xu, Yuan Qi

Abstract:Instruction fine-tuning has conventionally been employed to adapt Large Language Models (LLMs) to a variety of tasks. Nonetheless, this technique often necessitates substantial computational resources, making it impractical for deployment by individuals or small-scale entities. Recently, Low-Rank Adaptation (LoRA) has become a promising alternative, offering high capabilities on par with full tuning with reduced resource overhead. However, attaining satisfactory performance through the fine-tuning of LoRA is a non-trivial challenge. In this paper, we propose PILLOW, which aims to improve LoRA's performance by a discrimination-based prompting method, leveraging LLMs' In-Context Learning ability. PILLOW incorporates a matching network that selects prompts from a user-defined prompt pool, concatenates the selected prompts with the user instruction as input, and performs inference using the LoRA-fine-tuned LLMs. Trained with Reinforcement Learning, PILLOW exhibits commensurate performance on various evaluation metrics compared with typical instruction fine-tuning methods, utilizing only consumer-grade GPU resources and exhibiting a large reduction in computational costs.

* Accepted by EMNLP 2023 (Industry Track), Oral Presentation

Via

Access Paper or Ask Questions

Segment Anything Model with Uncertainty Rectification for Auto-Prompting Medical Image Segmentation

Nov 17, 2023

Yichi Zhang, Shiyao Hu, Chen Jiang, Yuan Cheng, Yuan Qi

Figure 1 for Segment Anything Model with Uncertainty Rectification for Auto-Prompting Medical Image Segmentation

Figure 2 for Segment Anything Model with Uncertainty Rectification for Auto-Prompting Medical Image Segmentation

Figure 3 for Segment Anything Model with Uncertainty Rectification for Auto-Prompting Medical Image Segmentation

Figure 4 for Segment Anything Model with Uncertainty Rectification for Auto-Prompting Medical Image Segmentation

Abstract:The introduction of the Segment Anything Model (SAM) has marked a significant advancement in prompt-driven image segmentation. However, SAM's application to medical image segmentation requires manual prompting of target structures to obtain acceptable performance, which is still labor-intensive. Despite attempts of auto-prompting to turn SAM into a fully automatic manner, it still exhibits subpar performance and lacks of reliability in the field of medical imaging. In this paper, we propose UR-SAM, an uncertainty rectified SAM framework to enhance the robustness and reliability for auto-prompting medical image segmentation. Our method incorporates a prompt augmentation module to estimate the distribution of predictions and generate uncertainty maps, and an uncertainty-based rectification module to further enhance the performance of SAM. Extensive experiments on two public 3D medical datasets covering the segmentation of 35 organs demonstrate that without supplementary training or fine-tuning, our method further improves the segmentation performance with up to 10.7 % and 13.8 % in dice similarity coefficient, demonstrating efficiency and broad capabilities for medical image segmentation without manual prompting.

Via

Access Paper or Ask Questions

InfMLLM: A Unified Framework for Visual-Language Tasks

Nov 12, 2023

Qiang Zhou, Zhibin Wang, Wei Chu, Yinghui Xu, Hao Li, Yuan Qi

Figure 1 for InfMLLM: A Unified Framework for Visual-Language Tasks

Figure 2 for InfMLLM: A Unified Framework for Visual-Language Tasks

Figure 3 for InfMLLM: A Unified Framework for Visual-Language Tasks

Figure 4 for InfMLLM: A Unified Framework for Visual-Language Tasks

Abstract:Large language models (LLMs) have proven their remarkable versatility in handling a comprehensive range of language-centric applications. To expand LLMs' capabilities to a broader spectrum of modal inputs, multimodal large language models (MLLMs) have attracted growing interest. This work delves into enabling LLMs to tackle more vision-language-related tasks, particularly image captioning, visual question answering (VQA,) and visual grounding. To this end, we implemented a three-stage training scheme: starting with lightweight alignment pretraining, then moderate-weight multitask hybrid training, and finally, LLM fine-tuning to improve instruction following capability. Throughout the training process, the requirements on GPU memory gradually increase. To effectively manage the number of visual embeddings passed to the LLM while preserving their positional information, we introduce a straightforward visual adapter module dubbed pool-adapter. Our experiments demonstrate that preserving the positional information of visual embeddings through the pool-adapter is particularly beneficial for tasks like visual grounding. We name our proposed approach InfMLLM and have evaluated it extensively on various benchmark datasets. Our results demonstrate that InfMLLM achieves either state-of-the-art (SOTA) performance or performance comparable to recent MLLMs. The code and model will be made open-source at: \url{https://github.com/mightyzau/InfMLLM}.

* 8

Via

Access Paper or Ask Questions