Yan Wang

Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling

Jul 16, 2023
Longyue Wang, Zefeng Du, Donghuai Liu, Cai Deng, Dian Yu, Haiyun Jiang, Yan Wang, Leyang Cui, Shuming Shi, Zhaopeng Tu

Modeling discourse, the linguistic phenomena that go beyond individual sentences, is a fundamental yet challenging aspect of natural language processing (NLP). However, existing evaluation benchmarks primarily focus on intra-sentence properties and overlook critical discourse phenomena that cross sentences. To bridge the gap, we propose Disco-Bench, a benchmark that can evaluate inter-sentence discourse properties across a diverse set of NLP tasks, covering understanding, translation, and generation. Disco-Bench consists of 9 document-level test sets in the literature domain, which contain rich discourse phenomena (e.g., cohesion and coherence) in Chinese and/or English. For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge. In total, we evaluate 20 general-domain, in-domain, and commercial models based on Transformer, advanced pretraining architectures, and large language models (LLMs). Our results show (1) the challenge and necessity of our evaluation benchmark; (2) fine-grained pretraining based on literary document-level training data consistently improves the modeling of discourse information. We will release the datasets, pretrained models, and leaderboard, which we hope can significantly facilitate research in this field: https://github.com/longyuewangdcu/Disco-Bench.

* Zhaopeng Tu is the corresponding author

Copy Is All You Need

Jul 13, 2023
Tian Lan, Deng Cai, Yan Wang, Heyan Huang, Xian-Ling Mao

The dominant text generation models compose the output by sequentially selecting words from a fixed vocabulary. In this paper, we formulate text generation as progressively copying text segments (e.g., words or phrases) from an existing text collection. We compute the contextualized representations of meaningful text segments and index them using efficient vector search toolkits. The task of text generation is then decomposed into a series of copy-and-paste operations: at each time step, we seek suitable text spans from the text collection rather than selecting from a standalone vocabulary. Experiments on the standard language modeling benchmark (WikiText-103) show that our approach achieves better generation quality according to both automatic and human evaluations. In addition, its inference efficiency is comparable to token-level autoregressive models thanks to the reduction of decoding steps. We also show that our approach allows for effective domain adaptation by simply switching to a domain-specific text collection without extra training. Finally, we observe that our approach attains additional performance gains by simply scaling up to larger text collections, again without further training. Our source code is publicly available at https://github.com/gmftbyGMFTBY/Copyisallyouneed.

* The Eleventh International Conference on Learning Representations (ICLR 2023)
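
The copy-and-paste decoding loop described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the bag-of-words `embed` function stands in for a learned contextual encoder, the three-entry `INDEX` stands in for a vector-search index over a large text collection, and keying each phrase by the context that preceded it is a simplification of the paper's phrase representations. It is not the authors' implementation.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for a learned encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical phrase index: (context that preceded the phrase, phrase to copy).
INDEX = [
    ("the capital of france is", "Paris , a city"),
    ("paris , a city", "known for the Eiffel Tower ."),
    ("the capital of japan is", "Tokyo ."),
]

def generate(prefix: str, max_steps: int = 3, threshold: float = 0.3) -> str:
    """Extend the prefix by repeatedly copying the best-matching phrase."""
    out = prefix
    for _ in range(max_steps):
        # Use the last five tokens as the query context.
        query = embed(" ".join(out.split()[-5:]))
        score, phrase = max((cosine(query, embed(ctx)), ph) for ctx, ph in INDEX)
        if score < threshold:
            break  # no sufficiently similar span: stop copying
        out += " " + phrase
    return out

print(generate("The capital of France is"))
```

Each loop iteration appends a whole phrase rather than a single token, which is the source of the reduced number of decoding steps mentioned in the abstract. Swapping `INDEX` for a different collection changes what can be generated without touching `generate`, mirroring the training-free domain adaptation claim.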

SysNoise: Exploring and Benchmarking Training-Deployment System Inconsistency

Jul 01, 2023
Yan Wang, Yuhang Li, Ruihao Gong, Aishan Liu, Yanfei Wang, Jian Hu, Yongqiang Yao, Yunchen Zhang, Tianzi Xiao, Fengwei Yu, Xianglong Liu

Extensive studies have shown that deep learning models are vulnerable to adversarial and natural noises, yet little is known about model robustness to noises caused by different system implementations. In this paper, we introduce, for the first time, SysNoise, a frequently occurring but often overlooked noise in the deep learning training-deployment cycle. In particular, SysNoise happens when the source training system switches to a disparate target system in deployment, where various tiny system mismatches add up to a non-negligible difference. We first identify and classify SysNoise into three categories based on the inference stage; we then build a holistic benchmark to quantitatively measure the impact of SysNoise on 20+ models, covering image classification, object detection, instance segmentation, and natural language processing tasks. Our extensive experiments reveal that SysNoise affects model robustness across different tasks and that common mitigations like data augmentation and adversarial training show limited effects on it. Together, our findings open a new research topic, and we hope this work will raise research attention to the effect of deep learning deployment systems on model performance. We have open-sourced the benchmark and framework at https://modeltc.github.io/systemnoise_web.

* Proceedings of Machine Learning and Systems 2023
* Proceedings of Machine Learning and Systems. 2023 Mar 18
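
As a hedged illustration of how a tiny mismatch between a training system and a deployment system can change values, the sketch below compares two rounding conventions. The choice of rounding as the example, the function names, and the sample values are all assumptions for illustration; they are not taken from the SysNoise benchmark itself.

```python
import math

def quantize_train(x: float) -> int:
    """Training-side stand-in: Python's round(), which rounds halves to even."""
    return round(x)

def quantize_deploy(x: float) -> int:
    """Deployment-side stand-in: round half up, as in floor(x + 0.5)."""
    return math.floor(x + 0.5)

# Hypothetical intermediate values that must be converted to integers.
pixels = [0.5, 1.5, 2.5, 3.5, 4.25]

# Collect every value on which the two "systems" disagree.
mismatches = [
    (x, quantize_train(x), quantize_deploy(x))
    for x in pixels
    if quantize_train(x) != quantize_deploy(x)
]
print(mismatches)
```

Both functions are reasonable on their own; the discrepancy only appears when a model trained against one is served with the other, which is the kind of inconsistency the abstract describes.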

Improving the Transferability of Time Series Forecasting with Decomposition Adaptation

Jun 30, 2023
Yan Gao, Yan Wang, Qiang Wang

Due to effective pattern mining and feature representation, neural forecasting models based on deep learning have achieved great progress. The premise of effective learning is to collect sufficient data. However, in time series forecasting, it is difficult to obtain enough data, which limits the performance of neural forecasting models. To alleviate the data scarcity limitation, we design the Sequence Decomposition Adaptation Network (SeDAN), a novel transfer architecture that improves forecasting performance on the target domain by aligning transferable knowledge from cross-domain datasets. Rethinking the transferability of features in time series data, we propose Implicit Contrastive Decomposition to decompose the original features into components, including seasonal and trend features, which are easier to transfer. Then we design the corresponding adaptation methods for decomposed features in different domains. Specifically, for seasonal features, we perform joint distribution adaptation, and for trend features, we design an Optimal Local Adaptation. We conduct extensive experiments on five benchmark datasets for multivariate time series forecasting. The results demonstrate the effectiveness of SeDAN, which provides more efficient and stable knowledge transfer.

* 15 pages, 7 figures
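
To make the idea of splitting a series into trend and seasonal components concrete, here is a minimal sketch using a classical centered moving average. This is a swapped-in, well-known technique chosen for illustration, not the paper's Implicit Contrastive Decomposition; the function names, the odd `window` requirement, and the sample series are assumptions.

```python
def moving_average(series, window):
    """Centered moving average; edges are padded by repeating end values.

    Assumes an odd window so the average is centered on each point.
    """
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(series))]

def decompose(series, window=5):
    """Split a series into a smooth trend and the remaining seasonal part."""
    trend = moving_average(series, window)
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal

# Hypothetical series: an upward drift with an alternating pattern on top.
series = [10.0, 12.0, 9.0, 13.0, 11.0, 14.0, 10.0, 15.0]
trend, seasonal = decompose(series, window=5)
print(trend)
print(seasonal)
```

By construction the two components sum back to the original series, so each can be adapted separately and then recombined, which is the general pattern the abstract describes for seasonal and trend features.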

Asymptotic Performance Analysis of Large-Scale Active IRS-Aided Wireless Network

Jun 05, 2023
Yan Wang, Feng Shu, Zhihong Zhuang, Rongen Dong, Qi Zhang, Di Wu, Liang Yang, Jiangzhou Wang

In this paper, the dominant factor affecting the performance of active intelligent reflecting surface (IRS) aided wireless communication networks in a Rayleigh fading channel, namely the average signal-to-noise ratio (SNR) $\gamma_0$ at the IRS, is studied. Making use of the weak law of large numbers, a simple asymptotic expression for it is derived as the number $N$ of IRS elements goes to medium-scale and large-scale. When $N$ tends to large-scale, the asymptotic received SNR at the user is proved to be a linearly increasing function of the product of $\gamma_0$ and $N$. Subsequently, when the base station (BS) transmit power is fixed, there exists an optimal finite reflective power at the IRS; beyond this point, more IRS reflective power degrades the SNR performance. Additionally, under a total power constraint on the sum of the BS transmit power and the power reflected by the IRS, an optimal power allocation (PA) strategy is derived and shown to achieve a 0.83-bit rate gain over equal PA. Finally, an IRS with finite-resolution phase shifters generates phase quantization errors, which degrade receive performance. The corresponding closed-form performance loss expressions for the user's asymptotic SNR, achievable rate (AR), and bit error rate (BER) are derived for the active IRS. Numerical simulation results show that a 3-bit discrete phase shifter is required to achieve a negligible performance loss for a large-scale active IRS.
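
The effect of finite-resolution phase shifters can be sanity-checked with a standard textbook approximation. The sketch below assumes a large passive-style array whose per-element phase error is uniformly distributed over one quantization step, in which case the received SNR is scaled by the squared sinc-type factor $(\sin(\pi/2^k)/(\pi/2^k))^2$ for $k$-bit shifters. This is an assumption used for illustration; the closed-form loss expressions derived in the paper for an active IRS may differ.

```python
import math

def quantization_loss_db(bits: int) -> float:
    """Approximate SNR loss in dB for k-bit phase shifters.

    Assumes the phase error is uniform on [-pi/2^k, pi/2^k], so the
    amplitude is scaled by sin(h)/h with h = pi/2^k, and the SNR by
    the square of that factor.
    """
    half_step = math.pi / (2 ** bits)
    amplitude_factor = math.sin(half_step) / half_step
    return -20.0 * math.log10(amplitude_factor)

for k in range(1, 5):
    print(f"{k}-bit: {quantization_loss_db(k):.3f} dB")
```

Under this approximation the loss shrinks quickly with resolution and is already a fraction of a decibel at 3 bits, which is at least consistent with the abstract's conclusion about 3-bit phase shifters.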

MedNgage: A Dataset for Understanding Engagement in Patient-Nurse Conversations

May 31, 2023
Yan Wang, Heidi Ann Scharf Donovan, Sabit Hassan, Malihe Alikhani

Patients who effectively manage their symptoms often demonstrate higher levels of engagement in conversations and interventions with healthcare practitioners. This engagement is multifaceted, encompassing cognitive and socio-affective dimensions. Consequently, it is crucial for AI systems to understand the engagement in natural conversations between patients and practitioners to better contribute toward patient care. In this paper, we present a novel dataset (MedNgage), which consists of patient-nurse conversations about cancer symptom management. We manually annotate the dataset with a novel framework of categories of patient engagement from two different angles, namely: i) socio-affective engagement (3.1K spans), and ii) cognitive engagement (1.8K spans). Through statistical analysis of the data that is annotated using our framework, we show a positive correlation between patient symptom management outcomes and their engagement in conversations. Additionally, we demonstrate that pre-trained transformer models fine-tuned on our dataset can reliably predict engagement categories in patient-nurse conversations. Lastly, we use LIME (Ribeiro et al., 2016) to analyze the underlying challenges of the tasks that state-of-the-art transformer models encounter. The de-identified data is available for research purposes upon request.

* ACL Findings 2023

Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate

May 30, 2023
Tian Liang, Zhiwei He, Wenxiang Jiao, Xing Wang, Yan Wang, Rui Wang, Yujiu Yang, Zhaopeng Tu, Shuming Shi

Modern large language models (LLMs) like ChatGPT have shown remarkable performance on general language tasks but still struggle on complex reasoning tasks, which drives research on the cognitive behaviors of LLMs to explore human-like problem-solving strategies. Along this direction, one representative strategy is self-reflection, which asks an LLM to iteratively refine its solution using feedback that it generates itself. However, our study shows that such reflection-style methods suffer from the Degeneration-of-Thought (DoT) problem: once the LLM has established confidence in its solutions, it is unable to generate novel thoughts later through reflection even if its initial stance is incorrect. To address the DoT problem, we propose a Multi-Agent Debate (MAD) framework, in which multiple agents express their arguments in the state of "tit for tat" and a judge manages the debate process to obtain a final solution. Our MAD framework encourages divergent thinking in LLMs, which is helpful for tasks that require deep levels of contemplation. Experimental results on two challenging datasets, commonsense machine translation and counter-intuitive arithmetic reasoning, demonstrate the effectiveness of our MAD framework. Extensive analyses suggest that the adaptive break of debate and a modest level of "tit for tat" are required for MAD to obtain good performance. Moreover, we find that an LLM might not be a fair judge if different LLMs are used as agents. Code: https://github.com/Skytliang/Multi-Agents-Debate

* Work in progress
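
The debate loop described in this abstract (two sides arguing in turn, a judge who can end the debate early) can be sketched as follows. The agents and judge here are deterministic stub functions standing in for LLM calls; their names, signatures, and the toy arithmetic question are hypothetical and are not taken from the authors' code.

```python
from typing import Callable, List, Optional

Agent = Callable[[str, List[str]], str]
Judge = Callable[[str, List[str]], Optional[str]]

def debate(question: str, affirmative: Agent, negative: Agent,
           judge: Judge, max_rounds: int = 3) -> Optional[str]:
    """Run alternating turns; stop as soon as the judge returns a verdict."""
    history: List[str] = []
    for _ in range(max_rounds):
        history.append("affirmative: " + affirmative(question, history))
        history.append("negative: " + negative(question, history))
        verdict = judge(question, history)
        if verdict is not None:  # adaptive break: the judge ends the debate
            return verdict
    return None  # no agreement within the round budget

# Stub agents (assumptions): the affirmative side starts with a wrong answer
# and revises it once it has seen a counter-argument.
def affirmative(question: str, history: List[str]) -> str:
    seen_rebuttal = any(turn.startswith("negative:") for turn in history)
    return "8" if seen_rebuttal else "12"

def negative(question: str, history: List[str]) -> str:
    return "8"

# Stub judge (assumption): accept an answer once both sides state the same one.
def judge(question: str, history: List[str]) -> Optional[str]:
    a = history[-2].split(": ", 1)[1]
    b = history[-1].split(": ", 1)[1]
    return a if a == b else None

print(debate("What is 2 + 2 * 3?", affirmative, negative, judge))
```

In a real system each stub would be replaced by a prompted model call. The early return is the piece that corresponds to the adaptive break discussed in the abstract.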

Joint Uplink and Downlink Resource Allocation Towards Energy-efficient Transmission for URLLC

May 25, 2023
Kang Li, Pengcheng Zhu, Yan Wang, Fu-Chun Zheng, Xiaohu You

Ultra-reliable and low-latency communications (URLLC) was first proposed in 5G networks and is expected to support applications with the most stringent quality-of-service (QoS) requirements. However, since wireless channels vary dynamically, the transmit power needed to ensure the QoS requirements of URLLC may be very high, which conflicts with the power limitation of a real system. To achieve successful URLLC transmission with finite transmit power, we propose an energy-efficient packet delivery mechanism that incorporates frequency hopping and proactive dropping. To reduce the uplink outage probability, frequency hopping provides more chances for transmission, so that failure rarely occurs. To avoid downlink outage from queue clearing, proactive dropping controls overall reliability by introducing an extra error component. With the proposed packet delivery mechanism, we jointly optimize bandwidth allocation and power control of uplink and downlink, antenna configuration, and subchannel assignment to minimize the average total power under the constraint of URLLC transmission requirements. Via theoretical analysis (e.g., the convexity with respect to bandwidth, the independence of bandwidth allocation, the convexity of antenna configuration with inactive constraints), the search for the globally optimal resource allocation is simplified. A three-step method is then proposed to find the optimal solution for resource allocation. Simulation results validate the analysis and show the performance gain obtained by optimizing resource allocation with the proposed packet delivery mechanism.

* 16 pages, 11 figures
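
A back-of-the-envelope sketch shows why extra transmission chances from frequency hopping reduce outage. It assumes each hop fails independently with the same probability, which is a simplifying assumption made here and not a model taken from the paper; the function names and numeric values are likewise illustrative.

```python
def outage_after_hops(p_single: float, attempts: int) -> float:
    """Probability that all attempts fail, assuming independent hops."""
    return p_single ** attempts

def attempts_needed(p_single: float, target: float) -> int:
    """Smallest number of independent attempts with outage <= target."""
    if not (0.0 < p_single < 1.0) or target <= 0.0:
        raise ValueError("need 0 < p_single < 1 and target > 0")
    n = 1
    while outage_after_hops(p_single, n) > target:
        n += 1
    return n

# Hypothetical numbers: 1% per-hop outage, 1e-5 overall outage target.
print(attempts_needed(0.01, 1e-5))
```

Because the outage falls geometrically with the number of hops under this assumption, a strict reliability target can be met without raising the per-attempt transmit power, which is the intuition behind pairing frequency hopping with power minimization.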

PandaGPT: One Model To Instruction-Follow Them All

May 25, 2023
Yixuan Su, Tian Lan, Huayang Li, Jialu Xu, Yan Wang, Deng Cai

We present PandaGPT, an approach to emPower large lANguage moDels with visual and Auditory instruction-following capabilities. Our pilot experiments show that PandaGPT can perform complex tasks such as detailed image description generation, writing stories inspired by videos, and answering questions about audio. More interestingly, PandaGPT can take multimodal inputs simultaneously and compose their semantics naturally. For example, PandaGPT can connect how objects look in an image or video with how they sound in an audio clip. To do so, PandaGPT combines the multimodal encoders from ImageBind and the large language models from Vicuna. Notably, only aligned image-text pairs are required for the training of PandaGPT. Thanks to the strong capability of ImageBind in embedding data from different modalities into the same space, PandaGPT displays emergent, i.e., zero-shot, cross-modal behaviors for data other than image and text (e.g., video, audio, depth, thermal, and IMU). We hope that PandaGPT serves as an initial step toward building AGI that can perceive and understand inputs in different modalities holistically, as we humans do. Our project page is at https://panda-gpt.github.io/.

* Technical report, work in progress

Privacy-preserving Adversarial Facial Features

May 08, 2023
Zhibo Wang, He Wang, Shuaifan Jin, Wenwen Zhang, Jiahui Hu, Yan Wang, Peng Sun, Wei Yuan, Kaixin Liu, Kui Ren

Face recognition service providers protect face privacy by extracting compact and discriminative facial features (representations) from images, and storing the facial features for real-time recognition. However, such features can still be exploited to recover the appearance of the original face by building a reconstruction network. Although several privacy-preserving methods have been proposed, their enhancement of face privacy protection comes at the expense of recognition accuracy. In this paper, we propose an adversarial features-based face privacy protection (AdvFace) approach to generate privacy-preserving adversarial features, which can disrupt the mapping from adversarial features to facial images to defend against reconstruction attacks. To this end, we design a shadow model that simulates the attackers' behavior to capture the mapping function from facial features to images and generates adversarial latent noise to disrupt the mapping. The adversarial features, rather than the original features, are stored in the server's database to prevent leaked features from exposing facial information. Moreover, AdvFace requires no changes to the face recognition network and can be implemented as a privacy-enhancing plugin in deployed face recognition systems. Extensive experimental results demonstrate that AdvFace outperforms state-of-the-art face privacy-preserving methods in defending against reconstruction attacks while maintaining face recognition accuracy.
