Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zijie Zeng

From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs

Mar 25, 2026

Xiaoyong Guo, Nanjie Li, Zijie Zeng, Kai Wang, Hao Huang, Haihua Xu, Wei Shi

Abstract:Contextual automatic speech recognition (ASR) with Speech-LLMs is typically trained with oracle conversation history, but relies on error-prone history at inference, causing a train-test mismatch in the context channel that we term contextual exposure bias. We propose a unified training framework to improve robustness under realistic histories: (i) Teacher Error Knowledge by using Whisper large-v3 hypotheses as training-time history, (ii) Context Dropout to regularize over-reliance on history, and (iii) Direct Preference Optimization (DPO) on curated failure cases. Experiments on TED-LIUM 3 (in-domain) and zero-shot LibriSpeech (out-of-domain) show consistent gains under predicted-history decoding. With a two-utterance history as context, SFT with Whisper hypotheses reduce WER from 5.59% (oracle-history training) to 5.47%, and DPO further improves to 5.17%. Under irrelevant-context attacks, DPO yields the smallest degradation (5.17% -> 5.63%), indicating improved robustness to misleading context. Our code and models are published on https://github.com/XYGuo1996/Contextual_Speech_LLMs.

Via

Access Paper or Ask Questions

Overview of the 2024 ALTA Shared Task: Detect Automatic AI-Generated Sentences for Human-AI Hybrid Articles

Dec 19, 2024

Diego Mollá, Qiongkai Xu, Zijie Zeng, Zhuang Li

Figure 1 for Overview of the 2024 ALTA Shared Task: Detect Automatic AI-Generated Sentences for Human-AI Hybrid Articles

Figure 2 for Overview of the 2024 ALTA Shared Task: Detect Automatic AI-Generated Sentences for Human-AI Hybrid Articles

Figure 3 for Overview of the 2024 ALTA Shared Task: Detect Automatic AI-Generated Sentences for Human-AI Hybrid Articles

Abstract:The ALTA shared tasks have been running annually since 2010. In 2024, the purpose of the task is to detect machine-generated text in a hybrid setting where the text may contain portions of human text and portions machine-generated. In this paper, we present the task, the evaluation criteria, and the results of the systems participating in the shared task.

* Proceedings of the 22nd Annual Workshop of the Australasian Language Technology Association, 2024
* 6 pages, 3 tables, published in ALTA 2024

Via

Access Paper or Ask Questions

Modifying AI, Enhancing Essays: How Active Engagement with Generative AI Boosts Writing Quality

Dec 10, 2024

Kaixun Yang, Mladen Raković, Zhiping Liang, Lixiang Yan, Zijie Zeng, Yizhou Fan, Dragan Gašević, Guanliang Chen

Abstract:Students are increasingly relying on Generative AI (GAI) to support their writing-a key pedagogical practice in education. In GAI-assisted writing, students can delegate core cognitive tasks (e.g., generating ideas and turning them into sentences) to GAI while still producing high-quality essays. This creates new challenges for teachers in assessing and supporting student learning, as they often lack insight into whether students are engaging in meaningful cognitive processes during writing or how much of the essay's quality can be attributed to those processes. This study aimed to help teachers better assess and support student learning in GAI-assisted writing by examining how different writing behaviors, especially those indicative of meaningful learning versus those that are not, impact essay quality. Using a dataset of 1,445 GAI-assisted writing sessions, we applied the cutting-edge method, X-Learner, to quantify the causal impact of three GAI-assisted writing behavioral patterns (i.e., seeking suggestions but not accepting them, seeking suggestions and accepting them as they are, and seeking suggestions and accepting them with modification) on four measures of essay quality (i.e., lexical sophistication, syntactic complexity, text cohesion, and linguistic bias). Our analysis showed that writers who frequently modified GAI-generated text-suggesting active engagement in higher-order cognitive processes-consistently improved the quality of their essays in terms of lexical sophistication, syntactic complexity, and text cohesion. In contrast, those who often accepted GAI-generated text without changes, primarily engaging in lower-order processes, saw a decrease in essay quality. Additionally, while human writers tend to introduce linguistic bias when writing independently, incorporating GAI-generated text-even without modification-can help mitigate this bias.

Via

Access Paper or Ask Questions

Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification

Jun 03, 2024

Shiqi Liu, Sannyuya Liu, Lele Sha, Zijie Zeng, Dragan Gasevic, Zhi Liu

Figure 1 for Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification

Figure 2 for Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification

Figure 3 for Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification

Figure 4 for Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification

Abstract:Various machine learning approaches have gained significant popularity for the automated classification of educational text to identify indicators of learning engagement -- i.e. learning engagement classification (LEC). LEC can offer comprehensive insights into human learning processes, attracting significant interest from diverse research communities, including Natural Language Processing (NLP), Learning Analytics, and Educational Data Mining. Recently, Large Language Models (LLMs), such as ChatGPT, have demonstrated remarkable performance in various NLP tasks. However, their comprehensive evaluation and improvement approaches in LEC tasks have not been thoroughly investigated. In this study, we propose the Annotation Guidelines-based Knowledge Augmentation (AGKA) approach to improve LLMs. AGKA employs GPT 4.0 to retrieve label definition knowledge from annotation guidelines, and then applies the random under-sampler to select a few typical examples. Subsequently, we conduct a systematic evaluation benchmark of LEC, which includes six LEC datasets covering behavior classification (question and urgency level), emotion classification (binary and epistemic emotion), and cognition classification (opinion and cognitive presence). The study results demonstrate that AGKA can enhance non-fine-tuned LLMs, particularly GPT 4.0 and Llama 3 70B. GPT 4.0 with AGKA few-shot outperforms full-shot fine-tuned models such as BERT and RoBERTa on simple binary classification datasets. However, GPT 4.0 lags in multi-class tasks that require a deep understanding of complex semantic information. Notably, Llama 3 70B with AGKA is a promising combination based on open-source LLM, because its performance is on par with closed-source GPT 4.0 with AGKA. In addition, LLMs struggle to distinguish between labels with similar names in multi-class classification.

* The manuscript has been submitted for peer review to the IEEE Transactions on Learning Technologies

Via

Access Paper or Ask Questions

Towards Detecting AI-Generated Text within Human-AI Collaborative Hybrid Texts

Mar 06, 2024

Zijie Zeng, Shiqi Liu, Lele Sha, Zhuang Li, Kaixun Yang, Sannyuya Liu, Dragan Gašević, Guanliang Chen

Figure 1 for Towards Detecting AI-Generated Text within Human-AI Collaborative Hybrid Texts

Figure 2 for Towards Detecting AI-Generated Text within Human-AI Collaborative Hybrid Texts

Figure 3 for Towards Detecting AI-Generated Text within Human-AI Collaborative Hybrid Texts

Abstract:This study explores the challenge of sentence-level AI-generated text detection within human-AI collaborative hybrid texts. Existing studies of AI-generated text detection for hybrid texts often rely on synthetic datasets. These typically involve hybrid texts with a limited number of boundaries. We contend that studies of detecting AI-generated content within hybrid texts should cover different types of hybrid texts generated in realistic settings to better inform real-world applications. Therefore, our study utilizes the CoAuthor dataset, which includes diverse, realistic hybrid texts generated through the collaboration between human writers and an intelligent writing system in multi-turn interactions. We adopt a two-step, segmentation-based pipeline: (i) detect segments within a given hybrid text where each segment contains sentences of consistent authorship, and (ii) classify the authorship of each identified segment. Our empirical findings highlight (1) detecting AI-generated sentences in hybrid texts is overall a challenging task because (1.1) human writers' selecting and even editing AI-generated sentences based on personal preferences adds difficulty in identifying the authorship of segments; (1.2) the frequent change of authorship between neighboring sentences within the hybrid text creates difficulties for segment detectors in identifying authorship-consistent segments; (1.3) the short length of text segments within hybrid texts provides limited stylistic cues for reliable authorship determination; (2) before embarking on the detection process, it is beneficial to assess the average length of segments within the hybrid text. This assessment aids in deciding whether (2.1) to employ a text segmentation-based strategy for hybrid texts with longer segments, or (2.2) to adopt a direct sentence-by-sentence classification strategy for those with shorter segments.

* under review; 7 pages

Via

Access Paper or Ask Questions

Towards Automatic Boundary Detection for Human-AI Collaborative Hybrid Essay in Education

Aug 16, 2023

Zijie Zeng, Lele Sha, Yuheng Li, Kaixun Yang, Dragan Gašević, Guanliang Chen

Figure 1 for Towards Automatic Boundary Detection for Human-AI Collaborative Hybrid Essay in Education

Figure 2 for Towards Automatic Boundary Detection for Human-AI Collaborative Hybrid Essay in Education

Figure 3 for Towards Automatic Boundary Detection for Human-AI Collaborative Hybrid Essay in Education

Figure 4 for Towards Automatic Boundary Detection for Human-AI Collaborative Hybrid Essay in Education

Abstract:The recent large language models (LLMs), e.g., ChatGPT, have been able to generate human-like and fluent responses when provided with specific instructions. While admitting the convenience brought by technological advancement, educators also have concerns that students might leverage LLMs to complete their writing assignments and pass them off as their original work. Although many AI content detection studies have been conducted as a result of such concerns, most of these prior studies modeled AI content detection as a classification problem, assuming that a text is either entirely human-written or entirely AI-generated. In this study, we investigated AI content detection in a rarely explored yet realistic setting where the text to be detected is collaboratively written by human and generative LLMs (i.e., hybrid text). We first formalized the detection task as identifying the transition points between human-written content and AI-generated content from a given hybrid text (boundary detection). Then we proposed a two-step approach where we (1) separated AI-generated content from human-written content during the encoder training process; and (2) calculated the distances between every two adjacent prototypes and assumed that the boundaries exist between the two adjacent prototypes that have the furthest distance from each other. Through extensive experiments, we observed the following main findings: (1) the proposed approach consistently outperformed the baseline methods across different experiment settings; (2) the encoder training process can significantly boost the performance of the proposed approach; (3) when detecting boundaries for single-boundary hybrid essays, the proposed approach could be enhanced by adopting a relatively large prototype size, leading to a 22% improvement in the In-Domain evaluation and an 18% improvement in the Out-of-Domain evaluation.

* 9 pages including references, 2 figures

Via

Access Paper or Ask Questions

PAS: A Position-Aware Similarity Measurement for Sequential Recommendation

May 19, 2022

Zijie Zeng, Jing Lin, Weike Pan, Zhong Ming, Zhongqi Lu

Figure 1 for PAS: A Position-Aware Similarity Measurement for Sequential Recommendation

Figure 2 for PAS: A Position-Aware Similarity Measurement for Sequential Recommendation

Figure 3 for PAS: A Position-Aware Similarity Measurement for Sequential Recommendation

Figure 4 for PAS: A Position-Aware Similarity Measurement for Sequential Recommendation

Abstract:The common item-based collaborative filtering framework becomes a typical recommendation method when equipped with a certain item-to-item similarity measurement. On one hand, we realize that a well-designed similarity measurement is the key to providing satisfactory recommendation services. On the other hand, similarity measurements designed for sequential recommendation are rarely studied by the recommender systems community. Hence in this paper, we focus on devising a novel similarity measurement called position-aware similarity (PAS) for sequential recommendation. The proposed PAS is, to our knowledge, the first count-based similarity measurement that concurrently captures the sequential patterns from the historical user behavior data and from the item position information within the input sequences. We conduct extensive empirical studies on four public datasets, in which our proposed PAS-based method exhibits competitive performance even compared to the state-of-the-art sequential recommendation methods, including a very recent similarity-based method and two GNN-based methods.

* International Joint Conference on Neural Networks (IJCNN 2022, Padua, Italy), 8 pages, Camera-Ready Version

Via

Access Paper or Ask Questions