Liang Zhu

ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents

May 29, 2025

Feiteng Fang, Ting-En Lin, Yuchuan Wu, Xiong Liu, Xiang Huang, Dingwei Chen, Jing Ye, Haonan Zhang, Liang Zhu, Hamid Alinejad-Rokny(+3 more)

Abstract:Role-Playing Language Agents (RPLAs) aim to simulate characters for realistic and engaging human-computer interactions. However, traditional reward models often struggle with scalability and adapting to subjective conversational preferences. We propose ChARM, a Character-based Act-adaptive Reward Model, addressing these challenges through two innovations: (1) an act-adaptive margin that significantly enhances learning efficiency and generalizability, and (2) a self-evolution mechanism leveraging large-scale unlabeled data to improve training coverage. Additionally, we introduce RoleplayPref, the first large-scale preference dataset specifically for RPLAs, featuring 1,108 characters, 13 subcategories, and 16,888 bilingual dialogues, alongside RoleplayEval, a dedicated evaluation benchmark. Experimental results show a 13% improvement over the conventional Bradley-Terry model in preference rankings. Furthermore, applying ChARM-generated rewards to preference learning techniques (e.g., direct preference optimization) achieves state-of-the-art results on CharacterEval and RoleplayEval. Code and dataset are available at https://github.com/calubkk/ChARM.
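For readers unfamiliar with the baseline named in the abstract above, the following is a minimal sketch of a Bradley-Terry pairwise loss with an additive margin. The abstract does not say how ChARM computes its act-adaptive margin, so `margin` is a plain hypothetical parameter and the function name is illustrative; this is not the authors' implementation.

```python
import math


def bt_margin_loss(r_chosen, r_rejected, margin=0.0):
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected - margin)).

    With margin=0 this is the standard Bradley-Terry reward-model loss.
    A positive margin demands a larger reward gap before the loss shrinks.
    `margin` is a hypothetical stand-in for an act-adaptive value.
    """
    z = r_chosen - r_rejected - margin
    # -log(sigmoid(z)) equals log(1 + exp(-z)); branch on the sign of z
    # so that exp() is never called on a large positive argument.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Same reward gap, scored with and without a margin.
print(bt_margin_loss(2.0, 1.0))
print(bt_margin_loss(2.0, 1.0, margin=0.5))
```

Under this sketch, a pair whose reward gap is smaller than the margin is penalized more heavily than it would be under the plain Bradley-Terry loss.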

UTSRMorph: A Unified Transformer and Superresolution Network for Unsupervised Medical Image Registration

Oct 27, 2024

Runshi Zhang, Hao Mo, Junchen Wang, Bimeng Jie, Yang He, Nenghao Jin, Liang Zhu

Abstract:Complicated image registration is a key issue in medical image analysis, and deep learning-based methods have achieved better results than traditional methods. The methods include ConvNet-based and Transformer-based methods. Although ConvNets can effectively utilize local information to reduce redundancy via small neighborhood convolution, the limited receptive field results in the inability to capture global dependencies. Transformers can establish long-distance dependencies via a self-attention mechanism; however, the intense calculation of the relationships among all tokens leads to high redundancy. We propose a novel unsupervised image registration method named the unified Transformer and superresolution (UTSRMorph) network, which can enhance feature representation learning in the encoder and generate detailed displacement fields in the decoder to overcome these problems. We first propose a fusion attention block to integrate the advantages of ConvNets and Transformers, which inserts a ConvNet-based channel attention module into a multihead self-attention module. The overlapping attention block, a novel cross-attention method, uses overlapping windows to obtain abundant correlations with match information of a pair of images. Then, the blocks are flexibly stacked into a new powerful encoder. The decoder generation process of a high-resolution deformation displacement field from low-resolution features is considered as a superresolution process. Specifically, the superresolution module was employed to replace interpolation upsampling, which can overcome feature degradation. UTSRMorph was compared to state-of-the-art registration methods in the 3D brain MR (OASIS, IXI) and MR-CT datasets. The qualitative and quantitative results indicate that UTSRMorph achieves relatively better performance. The code and datasets are publicly available at https://github.com/Runshi-Zhang/UTSRMorph.

* Early access in IEEE Transactions on Medical Imaging, 2024
* 13 pages, 10 figures

PersonaMath: Enhancing Math Reasoning through Persona-Driven Data Augmentation

Oct 02, 2024

Jing Luo, Run Luo, Longze Chen, Liang Zhu, Chang Ao, Jiaming Li, Yukun Chen, Xin Cheng, Wen Yang, Jiayuan Su(+2 more)

Abstract:While closed-source Large Language Models (LLMs) demonstrate strong mathematical problem-solving abilities, open-source models continue to struggle with such tasks. To bridge this gap, we propose a data augmentation approach and introduce PersonaMathQA, a dataset derived from MATH and GSM8K, on which we train the PersonaMath models. Our approach consists of two stages: the first stage is learning from Persona Diversification, and the second stage is learning from Reflection. In the first stage, we regenerate detailed chain-of-thought (CoT) solutions as instructions using a closed-source LLM and introduce a novel persona-driven data augmentation technique to enhance the dataset's quantity and diversity. In the second stage, we incorporate reflection to fully leverage more challenging and valuable questions. Evaluation of our PersonaMath models on MATH and GSM8K reveals that the PersonaMath-7B model (based on LLaMA-2-7B) achieves an accuracy of 24.2% on MATH and 68.7% on GSM8K, surpassing all baseline methods and achieving state-of-the-art performance. Notably, our dataset contains only 70.3K data points (merely 17.8% of MetaMathQA and 27% of MathInstruct), yet our model outperforms these baselines, demonstrating the high quality and diversity of our dataset, which enables more efficient model training. We open-source the PersonaMathQA dataset, PersonaMath models, and our code for public usage.
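The size comparison in the abstract above can be checked with a line of arithmetic each. The sketch below uses only the three figures quoted in the abstract and back-computes the baseline dataset sizes they imply; the variable names are illustrative, and the results are implied estimates rather than officially reported sizes.

```python
# Figures quoted in the abstract above.
persona_mathqa_k = 70.3        # PersonaMathQA size, in thousands of examples
share_of_metamathqa = 0.178    # "17.8% of MetaMathQA"
share_of_mathinstruct = 0.27   # "27% of MathInstruct"

# If A is a fraction f of B, then B = A / f.
implied_metamathqa_k = persona_mathqa_k / share_of_metamathqa
implied_mathinstruct_k = persona_mathqa_k / share_of_mathinstruct

print(f"Implied MetaMathQA size: about {implied_metamathqa_k:.0f}K examples")
print(f"Implied MathInstruct size: about {implied_mathinstruct_k:.0f}K examples")
```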

DeliLaw: A Chinese Legal Counselling System Based on a Large Language Model

Aug 01, 2024

Nan Xie, Yuelin Bai, Hengyuan Gao, Feiteng Fang, Qixuan Zhao, Zhijian Li, Ziqiang Xue, Liang Zhu, Shiwen Ni, Min Yang

Abstract:Traditional legal retrieval systems designed to retrieve legal documents, statutes, precedents, and other legal information are unable to give satisfactory answers due to a lack of semantic understanding of specific questions. Large Language Models (LLMs) have achieved excellent results in a variety of natural language processing tasks, which inspired us to train an LLM in the legal domain to aid legal retrieval. However, in the Chinese legal domain, due to the complexity of legal questions and the rigour of legal articles, there is not yet a large legal model that is satisfactory in practical application. In this paper, we present DeliLaw, a Chinese legal counselling system based on a large language model. DeliLaw integrates a legal retrieval module and a case retrieval module to overcome model hallucination. Users can ask professional legal questions and search for legal articles, relevant judgement cases, etc. on the DeliLaw system in a dialogue mode. In addition, DeliLaw supports the use of English for counselling. We provide the address of the system: https://data.delilegal.com/lawQuestion.

* CIKM 2024, 5 pages with 3 figures

CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment

Mar 26, 2024

Feiteng Fang, Liang Zhu, Min Yang, Xi Feng, Jinchang Hou, Qixuan Zhao, Chengming Li, Xiping Hu, Ruifeng Xu

Abstract:Reinforcement learning from human feedback (RLHF) is a crucial technique in aligning large language models (LLMs) with human preferences, ensuring these LLMs behave in beneficial and comprehensible ways to users. However, a longstanding challenge in human alignment techniques based on reinforcement learning lies in their inherent complexity and difficulty in training. To address this challenge, we present a simple yet effective Contrastive Learning Framework for Human Alignment (CLHA) to align LLMs with human preferences directly. CLHA employs a novel rescoring strategy to evaluate the noise within the data by considering its inherent quality and dynamically adjusting the training process. Simultaneously, CLHA utilizes pairwise contrastive loss and adaptive supervised fine-tuning loss to adaptively modify the likelihood of generating responses, ensuring enhanced alignment with human preferences. Using advanced methods, CLHA surpasses other algorithms, showcasing superior performance in terms of reward model scores, automatic evaluations, and human assessments on the widely used "Helpful and Harmless" dataset.

Less Can Be More: Unsupervised Graph Pruning for Large-scale Dynamic Graphs

May 18, 2023

Jintang Li, Sheng Tian, Ruofan Wu, Liang Zhu, Welong Zhao, Changhua Meng, Liang Chen, Zibin Zheng, Hongzhi Yin

Abstract:The prevalence of large-scale graphs poses great challenges in time and storage for training and deploying graph neural networks (GNNs). Several recent works have explored solutions for pruning the large original graph into a small and highly informative one, such that training and inference on the pruned and large graphs have comparable performance. Although empirically effective, current research focuses on static or non-temporal graphs, and these methods are not directly applicable to dynamic scenarios. In addition, they require labels as ground truth to learn the informative structure, limiting their applicability to new problem domains where labels are hard to obtain. To solve this dilemma, we propose and study the problem of unsupervised graph pruning on dynamic graphs. We approach the problem with our proposed STEP, a self-supervised temporal pruning framework that learns to remove potentially redundant edges from input dynamic graphs. From a technical and industrial viewpoint, our method overcomes the trade-offs between performance and the time and memory overheads. Our results on three real-world datasets demonstrate its advantages in improving the efficacy, robustness, and efficiency of GNNs on dynamic node classification tasks. Most notably, STEP is able to prune more than 50% of edges on a million-scale industrial graph, Alipay (7M nodes, 21M edges), while approximating up to 98% of the original performance. Code is available at https://github.com/EdisonLeeeee/STEP.

* Preprint

MaskGAE: Masked Graph Modeling Meets Graph Autoencoders

May 20, 2022

Jintang Li, Ruofan Wu, Wangbin Sun, Liang Chen, Sheng Tian, Liang Zhu, Changhua Meng, Zibin Zheng, Weiqiang Wang

Abstract:We present masked graph autoencoder (MaskGAE), a self-supervised learning framework for graph-structured data. Different from previous graph autoencoders (GAEs), MaskGAE adopts masked graph modeling (MGM) as a principled pretext task: masking a portion of edges and attempting to reconstruct the missing part with partially visible, unmasked graph structure. To understand whether MGM can help GAEs learn better representations, we provide both theoretical and empirical evidence to justify the benefits of this pretext task. Theoretically, we establish the connections between GAEs and contrastive learning, showing that MGM significantly improves the self-supervised learning scheme of GAEs. Empirically, we conduct extensive experiments on a number of benchmark datasets, demonstrating the superiority of MaskGAE over several state-of-the-art methods on both link prediction and node classification tasks. Our code is publicly available at https://github.com/EdisonLeeeee/MaskGAE.

* Preprint. Code available at https://github.com/EdisonLeeeee/MaskGAE
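The pretext task described in the MaskGAE abstract, hiding a portion of edges and reconstructing them from the visible remainder, starts with an edge split. The following is a minimal sketch of such a split, assuming uniform random masking over an edge list; the function name `mask_edges`, the `mask_ratio` default, and the toy graph are all hypothetical, and the abstract does not specify the masking strategy actually used.

```python
import random


def mask_edges(edges, mask_ratio=0.5, seed=0):
    """Split an edge list into (visible, masked) parts.

    An encoder would see only `visible`; `masked` holds the
    reconstruction targets. Uniform sampling is an assumption here.
    """
    rng = random.Random(seed)
    shuffled = list(edges)      # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_masked = int(len(shuffled) * mask_ratio)
    masked = shuffled[:n_masked]
    visible = shuffled[n_masked:]
    return visible, masked


# Toy graph with 4 nodes and 6 undirected edges (illustrative only).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3)]
visible, masked = mask_edges(edges, mask_ratio=0.5)
print("visible:", visible)
print("masked: ", masked)
```

The two returned lists are disjoint and together cover the original edge set, so the masked edges can serve as positive examples for a reconstruction objective.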

Joint 3-D Positioning and Power Allocation for UAV Relay Aided by Geographic Information

Oct 12, 2021

Pengfei Yi, Liang Zhu, Lipeng Zhu, Zhenyu Xiao, Zhu Han, Xiang-Gen Xia

Abstract:In this paper, we study the use of geographic information to address the blockage problem of air-to-ground links between a UAV and terrestrial nodes. In particular, a UAV relay is deployed to establish communication links from a ground base station to multiple ground users. To improve communication capacity, we first model the blockage effect caused by buildings according to the three-dimensional (3-D) geographic information. Then, an optimization problem is formulated to maximize the minimum capacity among users by jointly optimizing the 3-D position and power allocation of the UAV relay, under the constraints of link capacity, maximum transmit power, and blockage. To solve this complex non-convex problem, a two-loop optimization framework is developed based on Lagrangian relaxation. The outer loop aims to obtain proper Lagrangian multipliers to ensure that the solution of the Lagrangian problem converges to the tightest upper bound on the original problem. The inner loop solves the Lagrangian problem by applying the block coordinate descent (BCD) and successive convex approximation (SCA) techniques, where UAV 3-D positioning and power allocation are alternately optimized in each iteration. Simulation results confirm that the proposed solution significantly outperforms two benchmark schemes and achieves a performance close to the upper bound on the UAV relay system.
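The max-min problem described in the abstract above can be written in a generic epigraph form. This is only a sketch under assumed notation, none of which comes from the paper: $\mathbf{q}$ is the UAV's 3-D position, $p_k$ the power allocated to user $k$ of $K$, $C_k$ that user's capacity, $P_{\max}$ the transmit power budget, and $\mathcal{Q}$ the set of positions permitted by the blockage model. The link capacity constraint mentioned in the abstract is not spelled out here because its form is not given.

```latex
\begin{aligned}
\max_{\mathbf{q},\,\{p_k\},\,\eta} \quad & \eta \\
\text{s.t.} \quad & C_k(\mathbf{q}, p_k) \ge \eta, \qquad k = 1, \dots, K, \\
& \sum_{k=1}^{K} p_k \le P_{\max}, \qquad p_k \ge 0, \\
& \mathbf{q} \in \mathcal{Q}.
\end{aligned}
```

Maximizing the auxiliary variable $\eta$ subject to $C_k \ge \eta$ for every user is equivalent to maximizing $\min_k C_k$. Relaxing constraints of a maximization problem with Lagrangian multipliers yields an upper bound on its optimum, which is consistent with the abstract's description of the outer loop seeking the tightest such bound.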

MOTS: Multiple Object Tracking for General Categories Based On Few-Shot Method

May 19, 2020

Xixi Xu, Chao Lu, Liang Zhu, Xiangyang Xue, Guanxian Chen, Qi Guo, Yining Lin, Zhijian Zhao

Abstract:Most modern Multi-Object Tracking (MOT) systems apply a REID-based paradigm to strike a balance between computational efficiency and performance. In the past few years, numerous attempts have been made to perfect these systems. Although they achieve favorable performance, they are constrained to tracking a specified category. Drawing on the ideas of few-shot methods, we pioneer a new multi-target tracking system, named MOTS, which is metric-based but not limited to tracking a specific category. It contains two stages in series. In the first stage, we design the self-Adaptive-matching module to perform simple target matching, which can complete 88.76% of assignments without sacrificing performance on the MOT16 training set. In the second stage, a Fine-match Network is carefully designed for unmatched targets. With a newly built TRACK-REID dataset, the Fine-match Network can match targets from 31 categories and even generalizes to unseen categories.

* 6 pages

Towards in-store multi-person tracking using head detection and track heatmaps

May 16, 2020

Aibek Musaev, Jiangping Wang, Liang Zhu, Cheng Li, Yi Chen, Jialin Liu, Wanqi Zhang, Juan Mei, De Wang

Abstract:Computer vision algorithms are being implemented across a breadth of industries to enable technological innovations. In this paper, we study the problem of computer-vision-based customer tracking in the retail industry. To this end, we introduce a dataset collected from a camera in an office environment where participants mimic various behaviors of customers in a supermarket. In addition, we describe an illustrative example of the use of this dataset for tracking participants based on a head tracking model in an effort to minimize errors due to occlusion. Furthermore, we propose a model for recognizing customers and staff based on their movement patterns. The model is evaluated on a real-world dataset collected in a supermarket over a 24-hour period, achieving 98% accuracy during training and 93% accuracy during evaluation.
