Chiyu Zhang

NADI 2023: The Fourth Nuanced Arabic Dialect Identification Shared Task

Oct 24, 2023
Muhammad Abdul-Mageed, AbdelRahim Elmadany, Chiyu Zhang, El Moatez Billah Nagoudi, Houda Bouamor, Nizar Habash

We describe the findings of the fourth Nuanced Arabic Dialect Identification Shared Task (NADI 2023). The objective of NADI is to help advance state-of-the-art Arabic NLP by creating opportunities for teams of researchers to collaboratively compete under standardized conditions. It does so with a focus on Arabic dialects, offering novel datasets and defining subtasks that allow for meaningful comparisons between different approaches. NADI 2023 targeted both dialect identification (Subtask 1) and dialect-to-MSA machine translation (Subtask 2 and Subtask 3). A total of 58 unique teams registered for the shared task, of whom 18 participated (with 76 valid submissions during the test phase). Among these, 16 teams participated in Subtask 1, 5 in Subtask 2, and 3 in Subtask 3. The winning teams achieved 87.27 F1 on Subtask 1, 14.76 BLEU on Subtask 2, and 21.10 BLEU on Subtask 3, respectively. Results show that all three subtasks remain challenging, thereby motivating future work in this area. We describe the methods employed by the participating teams and briefly offer an outlook for NADI.
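
As context for the leaderboard numbers above, the sketch below scores a toy dialect-identification run with macro-F1 and a toy translation run with BLEU, assuming scikit-learn and sacrebleu; the official NADI scoring scripts may differ in detail, and all data shown are placeholders.

    # Illustrative scoring sketch (not the official NADI scripts): macro-F1 for
    # dialect identification (Subtask 1), BLEU for dialect-to-MSA translation.
    from sklearn.metrics import f1_score
    import sacrebleu

    gold = ["Egypt", "Morocco", "Egypt"]   # hypothetical gold dialect labels
    pred = ["Egypt", "Algeria", "Egypt"]   # hypothetical system predictions
    print("Macro F1:", 100 * f1_score(gold, pred, average="macro"))

    hyps = ["a translated MSA sentence"]   # hypothetical system output
    refs = [["a reference MSA sentence"]]  # one inner list per reference stream
    print("BLEU:", sacrebleu.corpus_bleu(hyps, refs).score)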

* arXiv admin note: text overlap with arXiv:2210.09582 

The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages

Oct 23, 2023
Chiyu Zhang, Khai Duy Doan, Qisheng Liao, Muhammad Abdul-Mageed

Instruction-tuned large language models (LLMs), such as ChatGPT, demonstrate remarkable performance on a wide range of tasks. Despite numerous recent studies examining the performance of instruction-tuned LLMs on various NLP benchmarks, there remains a lack of comprehensive investigation into their ability to understand cross-lingual sociopragmatic meaning (SM), i.e., meaning embedded within social and interactive contexts. This deficiency arises partly from SM not being adequately represented in any of the existing benchmarks. To address this gap, we present SPARROW, an extensive multilingual benchmark specifically designed for SM understanding. SPARROW comprises 169 datasets covering 13 task types across six primary categories (e.g., anti-social language detection, emotion recognition). SPARROW datasets encompass 64 different languages originating from 12 language families representing 16 writing scripts. We evaluate the performance of various multilingual pretrained language models (e.g., mT5) and instruction-tuned LLMs (e.g., BLOOMZ, ChatGPT) on SPARROW through fine-tuning, zero-shot, and/or few-shot learning. Our comprehensive analysis reveals that existing open-source instruction-tuned LLMs still struggle to understand SM across various languages, performing close to a random baseline in some cases. We also find that although ChatGPT outperforms many LLMs, it still falls behind task-specific finetuned models by a gap of 12.19 in SPARROW score. Our benchmark is available at: https://github.com/UBC-NLP/SPARROW
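
As a concrete illustration of the zero-shot setting mentioned above, here is a minimal sketch using an off-the-shelf NLI-based zero-shot classifier from Hugging Face; this is not the SPARROW evaluation harness (see the linked repository), and the example text and labels are invented.

    # Generic zero-shot sketch for an SM-style task (not the SPARROW harness).
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification",
                          model="facebook/bart-large-mnli")
    result = classifier(
        "ugh, everyone at this party is the WORST",  # invented example input
        candidate_labels=["anti-social language", "neutral"],
    )
    print(result["labels"][0], result["scores"][0])  # top label and its score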

* Accepted by EMNLP 2023 Main conference 

Detecting the Anomalies in LiDAR Pointcloud

Jul 31, 2023
Chiyu Zhang, Ji Han, Yao Zou, Kexin Dong, Yujia Li, Junchun Ding, Xiaoling Han

LiDAR sensors play an important role in the perception stack of modern autonomous driving systems. Adverse weather conditions such as rain, fog, and dust, as well as occasional LiDAR hardware faults, may cause the sensor to produce pointclouds with abnormal patterns such as scattered noise points and uncommon intensity values. In this paper, we propose a novel approach to detect whether a LiDAR is generating anomalous pointclouds by analyzing the pointcloud characteristics. Specifically, we develop a pointcloud quality metric based on the LiDAR points' spatial and intensity distribution to characterize the noise level of the pointcloud. The metric relies on pure mathematical analysis and does not require any labeling or training, as learning-based methods do. The method is therefore scalable and can be quickly deployed, either online to improve autonomy safety by monitoring anomalies in the LiDAR data, or offline to perform in-depth studies of LiDAR behavior over large amounts of data. The proposed approach is evaluated on extensive real public-road data collected by LiDARs with different scanning mechanisms and laser spectra, and is shown to effectively handle various known and unknown sources of pointcloud anomaly.
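
The abstract does not give the metric's formula. The general idea, scoring a pointcloud by how far its spatial and intensity statistics drift from nominal values without any labels or training, can be illustrated with a toy sketch like the one below; the statistics and reference values here are invented for illustration and are not the paper's actual metric.

    # Toy, label-free pointcloud quality score (NOT the paper's actual metric):
    # flag clouds whose spatial spread or intensity statistics deviate from
    # assumed nominal values.
    import numpy as np

    def pointcloud_noise_score(points, intensities,
                               nominal_intensity_mean=0.5,
                               nominal_intensity_std=0.2):
        # Scattered noise points tend to be isolated, so a large mean
        # nearest-neighbor distance suggests noise.
        dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
        np.fill_diagonal(dists, np.inf)
        spatial_term = dists.min(axis=1).mean()
        # Deviation of mean intensity from an assumed nominal distribution.
        intensity_term = (abs(intensities.mean() - nominal_intensity_mean)
                          / nominal_intensity_std)
        return spatial_term + intensity_term

    rng = np.random.default_rng(0)
    clean = rng.normal(size=(200, 3)) * 0.1    # tight cluster: structured scene
    noisy = rng.uniform(-5, 5, size=(200, 3))  # scattered points: rain/fog-like
    print(pointcloud_noise_score(clean, rng.uniform(0.4, 0.6, 200)))
    print(pointcloud_noise_score(noisy, rng.uniform(0.0, 1.0, 200)))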

What Constitutes Good Contrastive Learning in Time-Series Forecasting?

Jun 21, 2023
Chiyu Zhang, Qi Yan, Lili Meng, Tristan Sylvain

In recent years, the introduction of self-supervised contrastive learning (SSCL) has demonstrated remarkable improvements in representation learning across various domains, including natural language processing and computer vision. By leveraging the inherent benefits of self-supervision, SSCL enables the pre-training of representation models using vast amounts of unlabeled data. Despite these advances, there remains a significant gap in understanding the impact of different SSCL strategies on time series forecasting performance, as well as the specific benefits that SSCL can bring. This paper aims to address these gaps by conducting a comprehensive analysis of the effectiveness of various training variables, including different SSCL algorithms, learning strategies, model architectures, and their interplay. Additionally, to gain deeper insights into the improvements brought about by SSCL in the context of time-series forecasting, a qualitative analysis of the empirical receptive field is performed. Through our experiments, we demonstrate that the end-to-end training of a Transformer model using the Mean Squared Error (MSE) loss and SSCL emerges as the most effective approach in time series forecasting. Notably, the incorporation of the contrastive objective enables the model to prioritize more pertinent information for forecasting, such as scale and periodic relationships. These findings contribute to a better understanding of the benefits of SSCL in time series forecasting and provide valuable insights for future research in this area.
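
A minimal PyTorch sketch of the recipe the abstract identifies as most effective, namely end-to-end training with an MSE forecasting loss plus a contrastive term; the InfoNCE formulation and the weighting factor lam below are illustrative assumptions, not the paper's exact setup.

    # Illustrative joint objective: forecasting MSE plus a contrastive term.
    import torch
    import torch.nn.functional as F

    def info_nce(z1, z2, temperature=0.1):
        # z1, z2: (batch, dim) embeddings of two augmented views of each series;
        # matching rows are positives, all other rows are negatives.
        z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
        logits = z1 @ z2.t() / temperature  # (batch, batch) similarities
        targets = torch.arange(z1.size(0))  # positives on the diagonal
        return F.cross_entropy(logits, targets)

    def joint_loss(forecast, target, z_view1, z_view2, lam=0.1):
        # End-to-end objective: MSE on predictions + weighted contrastive term.
        return F.mse_loss(forecast, target) + lam * info_nce(z_view1, z_view2)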

LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions

Apr 27, 2023
Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji

Large language models (LLMs) with instruction finetuning demonstrate superior generative capabilities. However, these models are resource-intensive. To alleviate this issue, we explore distilling knowledge from instruction-tuned LLMs into much smaller ones. To this end, we carefully develop a large set of 2.58M instructions based on both existing and newly generated instructions. In addition to being sizeable, we design our instructions to cover a broad set of topics to ensure diversity. A thorough investigation of our instruction data demonstrates their diversity, and we generate responses for these instructions using gpt-3.5-turbo. We then exploit the instructions to tune a host of models, dubbed LaMini-LM, of varying sizes, both from the encoder-decoder and the decoder-only families. We evaluate our models both automatically (on 15 different NLP benchmarks) and manually. Results show that our proposed LaMini-LM models are on par with competitive baselines while being nearly 10 times smaller in size.
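
The distillation step itself is conceptually simple: fine-tune a small student on (instruction, teacher response) pairs with the usual cross-entropy loss. Below is a minimal sketch assuming a Hugging Face seq2seq student; the model name and data pair are placeholders, not the actual LaMini-LM training code.

    # Sequence-level distillation sketch: fit a small encoder-decoder to
    # teacher-generated responses. Model and data are placeholders.
    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tok = AutoTokenizer.from_pretrained("t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

    pairs = [("Explain photosynthesis briefly.",
              "Plants use sunlight to turn water and CO2 into sugar and oxygen.")]
    for instruction, teacher_response in pairs:
        inputs = tok(instruction, return_tensors="pt")
        labels = tok(teacher_response, return_tensors="pt").input_ids
        loss = model(**inputs, labels=labels).loss  # cross-entropy to teacher text
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()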

* Work in progress, 20 pages, 8 figures, 13 tables 

Edge Enhanced Image Style Transfer via Transformers

Jan 02, 2023
Chiyu Zhang, Jun Yang, Zaiyan Dai, Peng Cao

In recent years, arbitrary image style transfer has attracted increasing attention. Given a pair of content and style images, the goal is to produce a stylized image that retains the content of the former while capturing the style patterns of the latter. However, it is difficult to maintain a good trade-off between content details and style features: stylizing the image with sufficient style patterns may damage the content details, sometimes to the point that objects in the image can no longer be distinguished clearly. For this reason, we present STT, a new transformer-based method for image style transfer, together with an edge loss that noticeably enhances content details and avoids the blurred results caused by excessive rendering of style features. Qualitative and quantitative experiments demonstrate that STT achieves performance comparable to state-of-the-art image style transfer methods while alleviating the content leak problem.
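
The abstract names an edge loss but not its exact form. A plausible minimal sketch, assuming a Sobel-based formulation in PyTorch (an assumption, not the paper's definition), penalizes differences between the edge maps of the content image and the stylized output:

    # Hypothetical Sobel-based edge loss for style transfer.
    import torch
    import torch.nn.functional as F

    def sobel_edges(img):
        # img: (batch, 1, H, W) grayscale; returns gradient-magnitude edge map.
        kx = torch.tensor([[-1., 0., 1.],
                           [-2., 0., 2.],
                           [-1., 0., 1.]]).view(1, 1, 3, 3)
        ky = kx.transpose(2, 3)  # Sobel-y kernel is the transpose of Sobel-x
        gx = F.conv2d(img, kx, padding=1)
        gy = F.conv2d(img, ky, padding=1)
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

    def edge_loss(content, stylized):
        # Keep the stylized image's edges close to the content image's edges.
        return F.l1_loss(sobel_edges(stylized), sobel_edges(content))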

NADI 2022: The Third Nuanced Arabic Dialect Identification Shared Task

Oct 18, 2022
Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Houda Bouamor, Nizar Habash

We describe the findings of the third Nuanced Arabic Dialect Identification Shared Task (NADI 2022). NADI aims to advance state-of-the-art Arabic NLP, including on Arabic dialects. It does so by affording diverse datasets and modeling opportunities in a standardized context where meaningful comparisons between models and approaches are possible. NADI 2022 targeted both dialect identification (Subtask 1) and dialectal sentiment analysis (Subtask 2) at the country level. A total of 41 unique teams registered for the shared task, of whom 21 actually participated (with 105 valid submissions). Among these, 19 teams participated in Subtask 1 and 10 in Subtask 2. The winning team achieved an F1 of 27.06 on Subtask 1 and 75.16 on Subtask 2, reflecting that the two subtasks remain challenging and motivating future work in this area. We describe the methods employed by participating teams and offer an outlook for NADI.

* arXiv admin note: text overlap with arXiv:2103.08466 

Transition to Adulthood for Young People with Intellectual or Developmental Disabilities: Emotion Detection and Topic Modeling

Sep 21, 2022
Yan Liu, Maria Laricheva, Chiyu Zhang, Patrick Boutet, Guanyu Chen, Terence Tracey, Giuseppe Carenini, Richard Young

Transition to adulthood is an essential life stage for many families. Prior research has shown that young people with intellectual or developmental disabilities (IDD) face more challenges than their peers. This study explores how natural language processing (NLP) methods, especially unsupervised machine learning, can assist psychologists in analyzing emotions and sentiments, and how topic modeling can identify the common issues and challenges that young people with IDD and their families face. Additionally, the results were compared to those obtained from young people without IDD who were in transition to adulthood. The findings showed that NLP methods can be very useful for psychologists to analyze emotions, conduct cross-case analysis, and summarize key topics from conversational data. Our Python code is available at https://github.com/mlaricheva/emotion_topic_modeling.
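
For readers unfamiliar with the unsupervised side of this pipeline, a generic topic-modeling illustration follows; it is not the authors' exact pipeline (their code is in the linked repository), and the utterances are invented.

    # Generic unsupervised topic modeling over conversational snippets.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    utterances = [
        "I am worried about finding a job after school",
        "We talked about living independently and managing money",
    ]  # invented placeholder utterances
    counts = CountVectorizer(stop_words="english").fit_transform(utterances)
    lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
    print(lda.transform(counts))  # per-utterance topic distributions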

* In: Thomson, R., Dancy, C., Pyke, A. (eds) SBP-BRiMS 2022. Lecture Notes in Computer Science, vol 13558. Springer, Cham (2022)  
* Conference proceedings of 2022 SBP-BRiMS 

Automated Utterance Labeling of Conversations Using Natural Language Processing

Aug 12, 2022
Maria Laricheva, Chiyu Zhang, Yan Liu, Guanyu Chen, Terence Tracey, Richard Young, Giuseppe Carenini

Conversational data is essential in psychology because it can help researchers understand individuals' cognitive processes, emotions, and behaviors. Utterance labeling is a common strategy for analyzing this type of data. The development of NLP algorithms allows researchers to automate this task. However, psychological conversational data present some challenges to NLP researchers, including multilabel classification, a large number of classes, and limited available data. This study explored how automated labels generated by NLP methods compare to human labels in the context of conversations on the transition to adulthood. We proposed strategies to handle three common challenges raised in psychological studies. Our findings showed that a deep learning method with domain adaptation (RoBERTa-CON) outperformed all other machine learning methods, and that the hierarchical labeling system we propose helps researchers strategically analyze conversational data. Our Python code and NLP model are available at https://github.com/mlaricheva/automated_labeling.
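
A minimal sketch of the multilabel setup described above, using a RoBERTa encoder via Hugging Face Transformers; note that this shows plain multilabel fine-tuning only (the paper's RoBERTa-CON additionally applies domain adaptation), and the label count and indices are placeholders.

    # Multilabel utterance classification with a RoBERTa encoder.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tok = AutoTokenizer.from_pretrained("roberta-base")
    model = AutoModelForSequenceClassification.from_pretrained(
        "roberta-base",
        num_labels=8,                               # placeholder label count
        problem_type="multi_label_classification",  # BCE-with-logits loss
    )
    batch = tok(["I want to move out next year"], return_tensors="pt")
    labels = torch.zeros(1, 8)
    labels[0, 2] = 1.0                              # hypothetical active label
    loss = model(**batch, labels=labels).loss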

* Accepted in SBP-BRiMS 2022 (Camera-ready version) 

Decay No More: A Persistent Twitter Dataset for Learning Social Meaning

Apr 10, 2022
Chiyu Zhang, Muhammad Abdul-Mageed, El Moatez Billah Nagoudi

With the proliferation of social media, many studies resort to social media platforms to construct datasets for developing social meaning understanding systems. In the popular case of Twitter, most researchers distribute tweet IDs without the actual text contents due to the data distribution policy of the platform. One issue is that the posts become increasingly inaccessible over time, which leads to unfair comparisons and a temporal bias in social media research. To alleviate this challenge of data decay, we leverage a paraphrase model to propose a new persistent English Twitter dataset for social meaning (PTSM). PTSM consists of 17 social meaning datasets in 10 categories of tasks. We experiment with two SOTA pre-trained language models and show that our PTSM can substitute paraphrases for the actual tweets with marginal performance loss.
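
The core substitution step can be sketched as follows, assuming an off-the-shelf T5-style paraphraser from the Hugging Face Hub; the model choice and tweet are placeholders, not the pipeline used to build PTSM.

    # Replace a tweet with a paraphrase using a hub paraphrase model.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    name = "Vamsi/T5_Paraphrase_Paws"  # placeholder choice of paraphraser
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)

    tweet = "so excited for the weekend!!!"
    ids = tok("paraphrase: " + tweet, return_tensors="pt").input_ids
    out = model.generate(ids, num_beams=5, max_length=64)
    print(tok.decode(out[0], skip_special_tokens=True))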

* Under review. arXiv admin note: text overlap with arXiv:2108.00356 