Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yaqing Wang

Understanding the Uncertainty of LLM Explanations: A Perspective Based on Reasoning Topology

Feb 24, 2025

Longchao Da, Xiaoou Liu, Jiaxin Dai, Lu Cheng, Yaqing Wang, Hua Wei

Abstract:Understanding the uncertainty in large language model (LLM) explanations is important for evaluating their faithfulness and reasoning consistency, and thus provides insights into the reliability of LLM's output regarding a question. In this work, we propose a novel framework that quantifies uncertainty in LLM explanations through a reasoning topology perspective. By designing a structural elicitation strategy, we guide the LLMs to frame the explanations of an answer into a graph topology. This process decomposes the explanations into the knowledge related sub-questions and topology-based reasoning structures, which allows us to quantify uncertainty not only at the semantic level but also from the reasoning path. It further brings convenience to assess knowledge redundancy and provide interpretable insights into the reasoning process. Our method offers a systematic way to interpret the LLM reasoning, analyze limitations, and provide guidance for enhancing robustness and faithfulness. This work pioneers the use of graph-structured uncertainty measurement in LLM explanations and demonstrates the potential of topology-based quantification.

* 15 pages, 6 figures

Via

Access Paper or Ask Questions

ConML: A Universal Meta-Learning Framework with Task-Level Contrastive Learning

Oct 08, 2024

Shiguang Wu, Yaqing Wang, Yatao Bian, Quanming Yao

Figure 1 for ConML: A Universal Meta-Learning Framework with Task-Level Contrastive Learning

Figure 2 for ConML: A Universal Meta-Learning Framework with Task-Level Contrastive Learning

Figure 3 for ConML: A Universal Meta-Learning Framework with Task-Level Contrastive Learning

Figure 4 for ConML: A Universal Meta-Learning Framework with Task-Level Contrastive Learning

Abstract:Meta-learning enables learning systems to adapt quickly to new tasks, similar to humans. To emulate this human-like rapid learning and enhance alignment and discrimination abilities, we propose ConML, a universal meta-learning framework that can be applied to various meta-learning algorithms without relying on specific model architectures nor target models. The core of ConML is task-level contrastive learning, which extends contrastive learning from the representation space in unsupervised learning to the model space in meta-learning. By leveraging task identity as an additional supervision signal during meta-training, we contrast the outputs of the meta-learner in the model space, minimizing inner-task distance (between models trained on different subsets of the same task) and maximizing inter-task distance (between models from different tasks). We demonstrate that ConML integrates seamlessly with optimization-based, metric-based, and amortization-based meta-learning algorithms, as well as in-context learning, resulting in performance improvements across diverse few-shot learning tasks.

Via

Access Paper or Ask Questions

GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning

Oct 03, 2024

Jiale Fu, Yaqing Wang, Simeng Han, Jiaming Fan, Chen Si, Xu Yang

Figure 1 for GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning

Figure 2 for GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning

Figure 3 for GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning

Figure 4 for GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning

Abstract:In-context learning (ICL) enables large language models (LLMs) to generalize to new tasks by incorporating a few in-context examples (ICEs) directly in the input, without updating parameters. However, the effectiveness of ICL heavily relies on the selection of ICEs, and conventional text-based embedding methods are often inadequate for tasks that require multi-step reasoning, such as mathematical and logical problem solving. This is due to the bias introduced by shallow semantic similarities that fail to capture the deeper reasoning structures required for these tasks. We present GraphIC, a novel approach that leverages graph-based representations of reasoning processes, coupled with Bayesian Networks (BNs) to select ICEs. Graph structures inherently filter out shallow semantics while preserving the core reasoning structure. Importantly, BNs capture the dependency of a node's attributes on its parent nodes, closely mirroring the hierarchical nature of human cognition-where each thought is shaped by preceding ones. This makes BNs particularly well-suited for multi-step reasoning tasks, aligning the process more closely with human-like reasoning. Extensive experiments across three types of reasoning tasks (mathematical reasoning, code generation, and logical reasoning) demonstrate that GraphIC outperforms both training-free and training-based models in selecting ICEs, excelling in terms of both effectiveness and efficiency. We show that GraphIC enhances ICL's performance and interoperability, significantly advancing ICE selection for multi-step reasoning tasks.

Via

Access Paper or Ask Questions

Development and Testing of a Vine Robot for Urban Search and Rescue in Confined Rubble Environments

Sep 16, 2024

Zheyu Zhou, Yaqing Wang, Elliot W. Hawkes, Chen Li

Figure 1 for Development and Testing of a Vine Robot for Urban Search and Rescue in Confined Rubble Environments

Figure 2 for Development and Testing of a Vine Robot for Urban Search and Rescue in Confined Rubble Environments

Figure 3 for Development and Testing of a Vine Robot for Urban Search and Rescue in Confined Rubble Environments

Figure 4 for Development and Testing of a Vine Robot for Urban Search and Rescue in Confined Rubble Environments

Abstract:The request for fast response and safe operation after natural and man-made disasters in urban environments has spurred the development of robotic systems designed to assist in search and rescue operations within complex rubble sites. Traditional Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs) face significant limitations in such confined and obstructed environments. This paper introduces a novel vine robot designed to navigate dense rubble, drawing inspiration from natural growth mechanisms found in plants. Unlike conventional robots, vine robots are soft robots that can grow by everting their material, allowing them to navigate through narrow spaces and obstacles. The prototype presented in this study incorporates pneumatic muscles for steering and oscillation, an equation-based robot length control plus feedback pressure regulating system for extending and retracting the robot body. We conducted a series of controlled experiments in an artificial rubble testbed to assess the robot performance under varying environmental conditions and robot parameters, including volume ratio, environmental weight, oscillation, and steering. The results show that the vine robot can achieve significant penetration depths in cluttered environments with mixed obstacle sizes and weights, and can maintain repeated trajectories, demonstrating potential for mapping and navigating complex underground paths. Our findings highlight the suitability of the vine robot for urban search and rescue missions, with further research planned to enhance its robustness and deployability in real-world scenarios.

* 6 pages, 6 figures

Via

Access Paper or Ask Questions

FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction

Jul 28, 2024

Feijie Wu, Xingchen Wang, Yaqing Wang, Tianci Liu, Lu Su, Jing Gao

Figure 1 for FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction

Figure 2 for FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction

Figure 3 for FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction

Figure 4 for FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction

Abstract:In federated learning (FL), accommodating clients' varied computational capacities poses a challenge, often limiting the participation of those with constrained resources in global model training. To address this issue, the concept of model heterogeneity through submodel extraction has emerged, offering a tailored solution that aligns the model's complexity with each client's computational capacity. In this work, we propose Federated Importance-Aware Submodel Extraction (FIARSE), a novel approach that dynamically adjusts submodels based on the importance of model parameters, thereby overcoming the limitations of previous static and dynamic submodel extraction methods. Compared to existing works, the proposed method offers a theoretical foundation for the submodel extraction and eliminates the need for additional information beyond the model parameters themselves to determine parameter importance, significantly reducing the overhead on clients. Extensive experiments are conducted on various datasets to showcase superior performance of the proposed FIARSE.

Via

Access Paper or Ask Questions

Warming Up Cold-Start CTR Prediction by Learning Item-Specific Feature Interactions

Jul 14, 2024

Yaqing Wang, Hongming Piao, Daxiang Dong, Quanming Yao, Jingbo Zhou

Abstract:In recommendation systems, new items are continuously introduced, initially lacking interaction records but gradually accumulating them over time. Accurately predicting the click-through rate (CTR) for these items is crucial for enhancing both revenue and user experience. While existing methods focus on enhancing item ID embeddings for new items within general CTR models, they tend to adopt a global feature interaction approach, often overshadowing new items with sparse data by those with abundant interactions. Addressing this, our work introduces EmerG, a novel approach that warms up cold-start CTR prediction by learning item-specific feature interaction patterns. EmerG utilizes hypernetworks to generate an item-specific feature graph based on item characteristics, which is then processed by a Graph Neural Network (GNN). This GNN is specially tailored to provably capture feature interactions at any order through a customized message passing mechanism. We further design a meta learning strategy that optimizes parameters of hypernetworks and GNN across various item CTR prediction tasks, while only adjusting a minimal set of item-specific parameters within each task. This strategy effectively reduces the risk of overfitting when dealing with limited data. Extensive experiments on benchmark datasets validate that EmerG consistently performs the best given no, a few and sufficient instances of new items.

* KDD 2024

Via

Access Paper or Ask Questions

Knowledge-Aware Parsimony Learning: A Perspective from Relational Graphs

Jun 29, 2024

Quanming Yao, Yongqi Zhang, Yaqing Wang, Nan Yin, James Kwok, Qiang Yang

Abstract:The scaling law, a strategy that involves the brute-force scaling of the training dataset and learnable parameters, has become a prevalent approach for developing stronger learning models. In this paper, we examine its rationale in terms of learning from relational graphs. We demonstrate that directly adhering to such a scaling law does not necessarily yield stronger models due to architectural incompatibility and representation bottlenecks. To tackle this challenge, we propose a novel framework for learning from relational graphs via knowledge-aware parsimony learning. Our method draws inspiration from the duality between data and knowledge inherent in these graphs. Specifically, we first extract knowledge (like symbolic logic and physical laws) during the learning process, and then apply combinatorial generalization to the task at hand. This extracted knowledge serves as the ``building blocks'' for achieving parsimony learning. By applying this philosophy to architecture, parameters, and inference, we can effectively achieve versatile, sample-efficient, and interpretable learning. Experimental results show that our proposed framework surpasses methods that strictly follow the traditional scaling-up roadmap. This highlights the importance of incorporating knowledge in the development of next-generation learning technologies.

Via

Access Paper or Ask Questions

Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models

Jun 20, 2024

Yuan Zhong, Xiaochen Wang, Jiaqi Wang, Xiaokun Zhang, Yaqing Wang, Mengdi Huai, Cao Xiao, Fenglong Ma

Figure 1 for Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models

Figure 2 for Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models

Figure 3 for Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models

Figure 4 for Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models

Abstract:Synthesizing electronic health records (EHR) data has become a preferred strategy to address data scarcity, improve data quality, and model fairness in healthcare. However, existing approaches for EHR data generation predominantly rely on state-of-the-art generative techniques like generative adversarial networks, variational autoencoders, and language models. These methods typically replicate input visits, resulting in inadequate modeling of temporal dependencies between visits and overlooking the generation of time information, a crucial element in EHR data. Moreover, their ability to learn visit representations is limited due to simple linear mapping functions, thus compromising generation quality. To address these limitations, we propose a novel EHR data generation model called EHRPD. It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation. To enhance generation quality and diversity, we introduce a novel time-aware visit embedding module and a pioneering predictive denoising diffusion probabilistic model (PDDPM). Additionally, we devise a predictive U-Net (PU-Net) to optimize P-DDPM.We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives. The experimental results demonstrate the efficacy and utility of the proposed EHRPD in addressing the aforementioned limitations and advancing EHR data generation.

Via

Access Paper or Ask Questions

CoRelation: Boosting Automatic ICD Coding Through Contextualized Code Relation Learning

Feb 24, 2024

Junyu Luo, Xiaochen Wang, Jiaqi Wang, Aofei Chang, Yaqing Wang, Fenglong Ma

Figure 1 for CoRelation: Boosting Automatic ICD Coding Through Contextualized Code Relation Learning

Figure 2 for CoRelation: Boosting Automatic ICD Coding Through Contextualized Code Relation Learning

Figure 3 for CoRelation: Boosting Automatic ICD Coding Through Contextualized Code Relation Learning

Figure 4 for CoRelation: Boosting Automatic ICD Coding Through Contextualized Code Relation Learning

Abstract:Automatic International Classification of Diseases (ICD) coding plays a crucial role in the extraction of relevant information from clinical notes for proper recording and billing. One of the most important directions for boosting the performance of automatic ICD coding is modeling ICD code relations. However, current methods insufficiently model the intricate relationships among ICD codes and often overlook the importance of context in clinical notes. In this paper, we propose a novel approach, a contextualized and flexible framework, to enhance the learning of ICD code representations. Our approach, unlike existing methods, employs a dependent learning paradigm that considers the context of clinical notes in modeling all possible code relations. We evaluate our approach on six public ICD coding datasets and the experimental results demonstrate the effectiveness of our approach compared to state-of-the-art baselines.

* LREC-Coling 2024

Via

Access Paper or Ask Questions

Force sensing to reconstruct potential energy landscapes for cluttered large obstacle traversal

Jan 23, 2024

Yaqing Wang, Ling Xu, Chen Li

Abstract:Visual sensing of environmental geometry allows robots to use artificial potential fields to avoid sparse obstacles. Yet robots must further traverse cluttered large obstacles for applications like search and rescue through rubble and planetary exploration across Martain rocks. Recent studies discovered that to traverse cluttered large obstacles, multi-legged insects and insect-inspired robots make strenuous transitions across locomotor modes with major changes in body orientation. When viewed on a potential energy landscape resulting from locomotor-obstacle physical interaction, these are barrier-crossing transitions across landscape basins. This potential energy landscape approach may provide a modeling framework for cluttered large obstacle traversal. Here, we take the next step toward this vision by testing whether force sensing allows the reconstruction of the potential energy landscape. We developed a cockroach-inspired, minimalistic robot capable of sensing obstacle contact forces and torques around its body as it propelled forward against a pair of cluttered grass-like beam obstacles. We performed measurements over many traverses with systematically varied body orientations. Despite the forces and torques not being fully conservative, they well-matched the potential energy landscape gradients and the landscape reconstructed from them well-matched ground truth. In addition, inspired by cockroach observations, we found that robot head oscillation during traversal further improved the accuracies of force sensing and landscape reconstruction. We still need to study how to reconstruct landscape during a single traverse, as in applications, robots have little chance to use multiple traverses to sample the environment systematically and how to find landscape saddles for least-effort transitions to traverse.

Via

Access Paper or Ask Questions