Fei Liu

CaDA: Cross-Problem Routing Solver with Constraint-Aware Dual-Attention

Nov 30, 2024

Han Li, Fei Liu, Zhi Zheng, Yu Zhang, Zhenkun Wang

Abstract:Vehicle Routing Problems (VRPs) are significant Combinatorial Optimization (CO) problems of substantial practical importance. Recently, Neural Combinatorial Optimization (NCO), which involves training deep learning models on extensive data to learn vehicle routing heuristics, has emerged as a promising approach due to its efficiency and the reduced need for manual algorithm design. However, applying NCO across diverse real-world scenarios with various constraints necessitates cross-problem capabilities. Current multi-task NCO methods for VRPs typically employ a unified, constraint-unaware model that lacks a constraint-specific structure, limiting their cross-problem performance. Furthermore, they rely solely on global connectivity, which fails to focus on key nodes and leads to inefficient representation learning. This paper introduces a Constraint-Aware Dual-Attention Model (CaDA), designed to address these limitations. CaDA incorporates a constraint prompt that efficiently represents different problem variants. Additionally, it features a dual-attention mechanism with a global branch for capturing broader graph-wide information and a sparse branch that selectively focuses on the most relevant nodes. We comprehensively evaluate our model on 16 different VRPs and compare its performance against existing cross-problem VRP solvers. CaDA achieves state-of-the-art results across all the VRPs. Our ablation study further confirms that each component of CaDA contributes positively to its cross-problem learning performance.
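
To make the contrast between the two attention branches concrete, here is a minimal sketch of how attention weights could differ between a global branch and a sparse branch. It assumes the sparse branch keeps only the k highest-scoring nodes, which is one common way to sparsify attention; the abstract does not state how CaDA selects nodes, and the function names and the parameter `k` are hypothetical.

```python
import math

def global_weights(scores):
    """Standard softmax: every node receives a non-zero weight."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def topk_sparse_weights(scores, k):
    """Softmax restricted to the k highest-scoring nodes (assumed rule).

    Nodes outside the top k get weight exactly 0, so the branch
    attends only to the nodes it scores as most relevant.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(order[:k])
    top = max(scores[i] for i in keep)
    exps = [math.exp(s - top) if i in keep else 0.0
            for i, s in enumerate(scores)]
    total = sum(exps)
    return [e / total for e in exps]

# Attention scores of one query node against four candidate nodes.
scores = [2.0, 0.5, 1.0, -1.0]
dense = global_weights(scores)
sparse = topk_sparse_weights(scores, k=2)
```

In this sketch `dense` spreads weight over all four nodes, while `sparse` concentrates it on the two best-scoring ones.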

Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning

Nov 26, 2024

Zhu Xu, Zhiqiang Zhao, Zihan Zhang, Yuchi Liu, Quanwei Shen, Fei Liu, Yu Kuang

Abstract:Tokenization techniques such as Byte-Pair Encoding (BPE) and Byte-Level BPE (BBPE) have significantly improved the computational efficiency and vocabulary representation stability of large language models (LLMs) by segmenting text into tokens. However, this segmentation often obscures the internal character structures and sequences within tokens, preventing models from fully learning these intricate details during training. Consequently, LLMs struggle to comprehend the character compositions and positional relationships within tokens, especially when fine-tuned on downstream tasks with limited data. In this paper, we introduce Token Internal Position Awareness (TIPA), a novel approach that enhances LLMs' understanding of internal token structures by training them on reverse character prediction tasks using the tokenizer's own vocabulary. This method enables models to effectively learn and generalize character positions and internal structures. Experimental results demonstrate that LLMs trained with TIPA outperform baseline models in predicting character positions at the token level. Furthermore, when applied to the downstream task of Chinese Spelling Correction (CSC), TIPA not only accelerates model convergence but also significantly improves task performance.
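
As a rough illustration of a reverse character prediction task built from a tokenizer's vocabulary, the sketch below turns each token into a target that lists its characters from the last position to the first. The exact target format used by TIPA is not given in the abstract, so the dictionary layout, the 1-based positions, and the function names here are assumptions.

```python
def reverse_position_target(token):
    """Map each 1-based character position to its character, last first."""
    n = len(token)
    return {pos: token[pos - 1] for pos in range(n, 0, -1)}

def build_examples(vocab):
    """Create one (token, target) training pair per vocabulary entry."""
    return [(tok, reverse_position_target(tok)) for tok in vocab]

# A tiny stand-in vocabulary; a real one would come from the tokenizer.
examples = build_examples(["cat", "ing"])
```

For the token "cat" the target lists position 3 first, then 2, then 1, so the model has to know the token's length and every character position to produce it.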

STRUX: An LLM for Decision-Making with Structured Explanations

Oct 16, 2024

Yiming Lu, Yebowen Hu, Hassan Foroosh, Wei Jin, Fei Liu

Abstract:Countless decisions shape our daily lives, and it is paramount to understand the how and why behind these choices. In this paper, we introduce a new LLM decision-making framework called STRUX, which enhances LLM decision-making by providing structured explanations. These include favorable and adverse facts related to the decision, along with their respective strengths. STRUX begins by distilling lengthy information into a concise table of key facts. It then employs a series of self-reflection steps to determine which of these facts are pivotal, categorizing them as either favorable or adverse in relation to a specific decision. Lastly, we fine-tune an LLM to identify and prioritize these key facts to optimize decision-making. STRUX has been evaluated on the challenging task of forecasting stock investment decisions based on earnings call transcripts and demonstrated superior performance against strong baselines. It enhances decision transparency by allowing users to understand the impact of different factors, representing a meaningful step towards practical decision-making with LLMs.

* 10 pages, 7 figures, submitted to NAACL 2025

AutoTurb: Using Large Language Models for Automatic Algebraic Model Discovery of Turbulence Closure

Oct 14, 2024

Yu Zhang, Kefeng Zheng, Fei Liu, Qingfu Zhang, Zhenkun Wang

Abstract:Symbolic regression (SR) methods have been extensively investigated to explore explicit algebraic Reynolds stress models (EARSM) for turbulence closure of Reynolds-averaged Navier-Stokes (RANS) equations. The deduced EARSM can be readily implemented in existing computational fluid dynamics (CFD) codes and promotes the identification of physically interpretable turbulence models. Existing SR methods, such as genetic programming, sparse regression, or artificial neural networks, require user-defined functional operators, a library of candidates, or complex optimization algorithms. In this work, a novel framework using LLMs to automatically discover algebraic expressions for correcting the RSM is proposed. The direct observation of Reynolds stress and the indirect output of the CFD simulation are both involved in the training process to guarantee data consistency and avoid numerical stiffness. Constraints on functional complexity and convergence are additionally imposed in the objective function on account of the tremendous flexibility of LLMs. An evolutionary search is employed for global optimization. The proposed method is applied to separated flow over periodic hills at Re = 10,595. The generalizability of the discovered model is verified on a set of 2D turbulent separated flow configurations with different Reynolds numbers and geometries. It is demonstrated that the corrective RANS can improve the prediction for both the Reynolds stress and mean velocity fields. Compared with algebraic models discovered by other works, the discovered model performs better in accuracy and generalization capability. The proposed approach provides a promising paradigm for using LLMs to improve turbulence modeling for a given class of flows.

PromptGCN: Bridging Subgraph Gaps in Lightweight GCNs

Oct 14, 2024

Shengwei Ji, Yujie Tian, Fei Liu, Xinlu Li, Le Wu

Abstract:Graph Convolutional Networks (GCNs) are widely used in graph-based applications, such as social networks and recommendation systems. Nevertheless, large-scale graphs or deep aggregation layers in full-batch GCNs consume significant GPU memory, causing out-of-memory (OOM) errors on mainstream GPUs (e.g., 29GB memory consumption on the ogbn-products graph with 5 layers). Subgraph sampling methods reduce memory consumption to achieve lightweight GCNs by partitioning the graph into multiple subgraphs and sequentially training GCNs on each subgraph. However, these methods yield gaps among subgraphs, i.e., GCNs can only be trained based on subgraphs instead of global graph information, which reduces the accuracy of GCNs. In this paper, we propose PromptGCN, a novel prompt-based lightweight GCN model to bridge the gaps among subgraphs. First, learnable prompt embeddings are designed to obtain global information. Then, the prompts are attached to each subgraph to transfer the global information among subgraphs. Extensive experimental results on seven large-scale graphs demonstrate that PromptGCN exhibits superior performance compared to baselines. Notably, PromptGCN improves the accuracy of subgraph sampling methods by up to 5.48% on the Flickr dataset. Overall, PromptGCN can be easily combined with any subgraph sampling method to obtain a lightweight GCN model with higher accuracy.

Stage-Wise and Prior-Aware Neural Speech Phase Prediction

Oct 07, 2024

Fei Liu, Yang Ai, Hui-Peng Du, Ye-Xin Lu, Rui-Chen Zheng, Zhen-Hua Ling

Abstract:This paper proposes a novel Stage-wise and Prior-aware Neural Speech Phase Prediction (SP-NSPP) model, which predicts the phase spectrum from the input amplitude spectrum using two-stage neural networks. In the initial prior-construction stage, we predict a rough prior phase spectrum from the amplitude spectrum. The subsequent refinement stage transforms the amplitude spectrum into a refined high-quality phase spectrum conditioned on the prior phase. Networks in both stages use ConvNeXt v2 blocks as the backbone and adopt adversarial training by introducing a phase spectrum discriminator (PSD). To further improve the continuity of the refined phase, we also incorporate a time-frequency integrated difference (TFID) loss in the refinement stage. Experimental results confirm that, compared to neural network-based no-prior phase prediction methods, the proposed SP-NSPP achieves higher phase prediction accuracy, thanks to the coarse phase priors and diverse training criteria. Compared to iterative phase estimation algorithms, the proposed SP-NSPP does not require multiple rounds of staged iterations, resulting in higher generation efficiency.

* Accepted by SLT2024

Data Extrapolation for Text-to-image Generation on Small Datasets

Oct 02, 2024

Senmao Ye, Fei Liu

Abstract:Text-to-image generation requires a large amount of training data to synthesize high-quality images. For augmenting training data, previous methods rely on data interpolations like cropping, flipping, and mixing up, which fail to introduce new information and yield only marginal improvements. In this paper, we propose a new data augmentation method for text-to-image generation using linear extrapolation. Specifically, we apply linear extrapolation only to text features, and new image data are retrieved from the internet by search engines. For the reliability of new text-image pairs, we design two outlier detectors to purify retrieved images. Based on extrapolation, we construct training samples dozens of times larger than the original dataset, resulting in a significant improvement in text-to-image performance. Moreover, we propose a NULL-guidance to refine score estimation, and apply recurrent affine transformation to fuse text information. Our model achieves FID scores of 7.91, 9.52 and 5.00 on the CUB, Oxford and COCO datasets. The code and data will be available on GitHub (https://github.com/senmaoy/RAT-Diffusion).
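
For readers unfamiliar with linear extrapolation of features, a minimal sketch follows. It uses the textbook form x_new = x_i + lam * (x_i - x_j), which moves a feature vector away from a second one and therefore outside the segment between them, unlike interpolation. The abstract does not give the paper's exact formula, so this form, the function name, and the step size `lam` are assumptions.

```python
def extrapolate(x_i, x_j, lam):
    """Move x_i away from x_j by lam times their difference.

    With lam > 0 the result lies beyond x_i on the line through
    x_j and x_i, so it is not a mix of the two inputs.
    """
    if len(x_i) != len(x_j):
        raise ValueError("feature vectors must have the same length")
    return [a + lam * (a - b) for a, b in zip(x_i, x_j)]

# Two toy text-feature vectors; real ones would come from a text encoder.
new_feature = extrapolate([1.0, 2.0], [0.0, 0.0], lam=0.5)
```

Setting `lam` to 0 returns `x_i` unchanged, and larger values push the new feature further from `x_j`.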

DeFine: Enhancing LLM Decision-Making with Factor Profiles and Analogical Reasoning

Oct 02, 2024

Yebowen Hu, Xiaoyang Wang, Wenlin Yao, Yiming Lu, Daoan Zhang, Hassan Foroosh, Dong Yu, Fei Liu

Abstract:LLMs are ideal for decision-making due to their ability to reason over long contexts and identify critical factors. However, challenges arise when processing transcripts of spoken speech describing complex scenarios. These transcripts often contain ungrammatical or incomplete sentences, repetitions, hedging, and vagueness. For example, during a company's earnings call, an executive might project a positive revenue outlook to reassure investors, despite significant uncertainty regarding future earnings. It is crucial for LLMs to incorporate this uncertainty systematically when making decisions. In this paper, we introduce DeFine, a new framework that constructs probabilistic factor profiles from complex scenarios. DeFine then integrates these profiles with analogical reasoning, leveraging insights from similar past experiences to guide LLMs in making critical decisions in novel situations. Our framework separates the tasks of quantifying uncertainty in complex scenarios and incorporating it into LLM decision-making. This approach is particularly useful in fields such as medical consultations, negotiations, and political debates, where making decisions under uncertainty is vital.

Multi-objective Evolution of Heuristic Using Large Language Model

Sep 25, 2024

Shunyu Yao, Fei Liu, Xi Lin, Zhichao Lu, Zhenkun Wang, Qingfu Zhang

Abstract:Heuristics are commonly used to tackle diverse search and optimization problems. Designing heuristics usually requires tedious manual crafting with domain knowledge. Recent works have incorporated large language models (LLMs) into automatic heuristic search, leveraging their powerful language and coding capacity. However, existing research focuses on the optimal performance on the target problem as the sole objective, neglecting other criteria such as efficiency and scalability, which are vital in practice. To tackle this challenge, we propose to model heuristic search as a multi-objective optimization problem and consider introducing other practical criteria beyond optimal performance. Due to the complexity of the search space, conventional multi-objective optimization methods struggle to effectively handle multi-objective heuristic search. We propose the first LLM-based multi-objective heuristic search framework, Multi-objective Evolution of Heuristic (MEoH), which integrates LLMs in a zero-shot manner to generate a non-dominated set of heuristics to meet multiple design criteria. We design a new dominance-dissimilarity mechanism for effective population management and selection, which incorporates both code dissimilarity in the search space and dominance in the objective space. MEoH is demonstrated on two well-known combinatorial optimization problems: the online Bin Packing Problem (BPP) and the Traveling Salesman Problem (TSP). Results indicate that a variety of elite heuristics are automatically generated in a single run, offering more trade-off options than existing methods. It successfully achieves competitive or superior performance while improving efficiency by up to 10 times. Moreover, we also observe that the multi-objective search introduces novel insights into heuristic design and leads to the discovery of diverse heuristics.
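
The notion of a non-dominated set can be illustrated with a short sketch of standard Pareto filtering. This is only the plain dominance test in the objective space, not MEoH's dominance-dissimilarity mechanism, which also uses code dissimilarity. It assumes every objective is minimized, and the objective values shown are made-up numbers.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def non_dominated(points):
    """Keep the points that no other point dominates (the Pareto set)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]

# Hypothetical (optimality gap, running time) pairs for four heuristics.
candidates = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
front = non_dominated(candidates)
```

Here the heuristic at (3.0, 3.0) is dropped because (2.0, 2.0) is better on both objectives, while the remaining three each offer a different trade-off.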

MEDiC: Autonomous Surgical Robotic Assistance to Maximizing Exposure for Dissection and Cautery

Sep 22, 2024

Xiao Liang, Chung-Pang Wang, Nikhil Uday Shinde, Fei Liu, Florian Richter, Michael Yip

Abstract:Surgical automation has the capability to improve the consistency of patient outcomes and broaden access to advanced surgical care in underprivileged communities. Shared autonomy, where the robot automates routine subtasks while the surgeon retains partial teleoperative control, offers great potential to make an impact. In this paper we focus on one important skill within surgical shared autonomy: automating robotic assistance to maximize visual exposure and apply tissue tension for dissection and cautery. Ensuring consistent exposure to visualize the surgical site is crucial for both efficiency and patient safety. However, achieving this is highly challenging due to the complexities of manipulating deformable volumetric tissues that are prevalent in surgery. To address these challenges we propose MEDiC, a framework for autonomous surgical robotic assistance to maximize exposure for dissection and cautery. We integrate a differentiable physics model with perceptual feedback to achieve our two key objectives: 1) Maximizing tissue exposure and applying tension for a specified dissection site through visual-servoing control and 2) Selecting optimal control positions for a dissection target based on deformable Jacobian analysis. We quantitatively assess our method through repeated real robot experiments on a tissue phantom, and showcase its capabilities through dissection experiments using shared autonomy on real animal tissue.

* Submitted to ICRA 2025
