Hawon Jeong

LLM-Based Multi-Reference Evaluation for Efficient and Robust Assessment of Phrase Break Annotations

Jun 19, 2026

Younghan Park, Hoyeon Lee, Hawon Jeong, Jong-Hwan Kim

Abstract: Reliable evaluation of phrase break annotations is crucial, as subtle variations in prosodic boundaries directly affect the clarity and naturalness of speech. However, existing approaches exhibit major limitations: single-reference evaluation assumes a unique gold phrasing for an utterance despite multiple valid phrasings, while human judgment, though flexible, is labor-intensive and unscalable. To address these limitations, we propose LLM-based Multi-Reference Evaluation (LMRE) for phrase break annotations, which models the one-to-many nature of prosodic phrasing and generates multiple valid phrasings from minimal demonstrations. On a Korean testbed of 1,356 annotations covering five strategies, LMRE shows stronger alignment with human judgment than single-reference evaluation in both acceptance behavior and score correlation. Our findings demonstrate that LMRE effectively achieves both scalability and multi-reference support, highlighting the potential of LLMs for evaluation in the speech domain.

* Accepted at Interspeech 2026
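
The contrast between single-reference and multi-reference scoring described in the abstract can be illustrated with a minimal sketch. This is not the LMRE implementation: the abstract does not state the scoring metric, so the use of F1 over break positions, the function names `break_f1` and `multi_reference_score`, and the hand-written reference sets are all assumptions made for illustration. In LMRE the additional references would be generated by an LLM rather than written by hand.

```python
from typing import Iterable, Set


def break_f1(candidate: Set[int], reference: Set[int]) -> float:
    """F1 between two sets of break positions.

    Each position is the index of a word that is followed by a phrase break
    (an assumed encoding, chosen only for this sketch).
    """
    if not candidate and not reference:
        return 1.0  # both annotations contain no breaks: treat as a perfect match
    overlap = len(candidate & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def multi_reference_score(candidate: Set[int], references: Iterable[Set[int]]) -> float:
    """Score a candidate against several valid phrasings and keep the best match.

    Raises ValueError if `references` is empty.
    """
    return max(break_f1(candidate, ref) for ref in references)


# Hypothetical data: one candidate annotation and three acceptable phrasings.
candidate = {2, 5}
references = [{2, 6}, {2, 5}, {3}]

single = break_f1(candidate, references[0])            # compared with one "gold" phrasing only
multi = multi_reference_score(candidate, references)   # best match over all phrasings
print(single, multi)
```

Under these assumptions, a candidate that is penalized against a single gold phrasing can still receive full credit when it matches another valid phrasing, which is the one-to-many behavior the abstract attributes to multi-reference evaluation.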

PRePair: Pointwise Reasoning Enhance Pairwise Evaluating for Robust Instruction-Following Assessments

Jun 18, 2024

Hawon Jeong, ChaeHun Park, Jimin Hong, Jaegul Choo

Abstract: Pairwise evaluation using large language models (LLMs) is widely used for evaluating natural language generation (NLG) tasks. However, the reliability of LLMs is often compromised by biases, such as favoring verbosity and authoritative tone. In this study, we focus on the comparison of two LLM-based evaluation approaches, pointwise and pairwise. Our findings demonstrate that pointwise evaluators exhibit more robustness against undesirable preferences. Further analysis reveals that pairwise evaluators can accurately identify the shortcomings of low-quality outputs even when their judgment is incorrect. These results indicate that LLMs are more severely influenced by their bias in a pairwise evaluation setup. To mitigate this, we propose a hybrid method that integrates pointwise reasoning into pairwise evaluation. Experimental results show that our method enhances the robustness of pairwise evaluators against adversarial samples while preserving accuracy on normal samples.

ST-RAP: A Spatio-Temporal Framework for Real Estate Appraisal

Aug 21, 2023

Hojoon Lee, Hawon Jeong, Byungkun Lee, Kyungyup Lee, Jaegul Choo

Abstract: In this paper, we introduce ST-RAP, a novel Spatio-Temporal framework for Real estate APpraisal. ST-RAP employs a hierarchical architecture with a heterogeneous graph neural network to encapsulate temporal dynamics and spatial relationships simultaneously. Through comprehensive experiments on a large-scale real estate dataset, ST-RAP outperforms previous methods, demonstrating the significant benefits of integrating spatial and temporal aspects in real estate appraisal. Our code and dataset are available at https://github.com/dojeon-ai/STRAP.

* Accepted to CIKM'23
