Sunghyun Lee

MAVL: A Multilingual Audio-Video Lyrics Dataset for Animated Song Translation

May 24, 2025

Woohyun Cho, Youngmin Kim, Sunghyun Lee, Youngjae Yu

Abstract: Lyrics translation requires both accurate semantic transfer and preservation of musical rhythm, syllabic structure, and poetic style. In animated musicals, the challenge intensifies because lyrics must also align with visual and auditory cues. We introduce the Multilingual Audio-Video Lyrics Benchmark for Animated Song Translation (MAVL), the first multilingual, multimodal benchmark for singable lyrics translation. By integrating text, audio, and video, MAVL enables richer and more expressive translations than text-only approaches. Building on this, we propose the Syllable-Constrained Audio-Video LLM with Chain-of-Thought (SylAVL-CoT), which leverages audio-video cues and enforces syllabic constraints to produce natural-sounding lyrics. Experimental results demonstrate that SylAVL-CoT significantly outperforms text-based models in singability and contextual accuracy, emphasizing the value of multimodal, multilingual approaches for lyrics translation.

* 28 pages, 8 figures

Multipath Interference Suppression of Amplitude-Modulated Continuous Wave Scanning LiDAR Based on Bayesian-Optimized XGBoost Ensemble

Nov 07, 2022

Sunghyun Lee, Yoonseop Lim, Wookhyeon Kwon, Yonghwa Park

Abstract: This paper proposes a novel multipath interference (MPI) suppression algorithm based on a Bayesian-optimized extreme gradient boosting (XGBoost) ensemble to reduce MPI error in amplitude-modulated continuous wave (AMCW) scanning light detection and ranging (LiDAR). In contrast to this paper, many previous works have focused on MPI suppression in conventional AMCW time-of-flight (ToF) sensors with flash-type illumination sources. However, the residual MPI error in these previous works remains at the cm scale due to the inherent limitations of the illumination source and the lack of MPI data. Meanwhile, since few previous works address coaxial-type AMCW scanning LiDAR, MPI in such LiDAR has not yet been validated. To achieve mm-scale MPI error mitigation in light of these issues, this paper proposes an MPI error correction algorithm based on a Bayesian-optimized XGBoost ensemble and its implementation in a coaxial-type AMCW scanning LiDAR. The XGBoost ensemble is trained on a synthetic MPI dataset generated by a customized simulation. According to validation results, the mean absolute error (MAE) of the MPI error, originally 9.8 mm, can be reduced to less than 2 mm by the Bayesian-optimized XGBoost on the simulation dataset. Such precise MPI mitigation is also maintained in real object scenes. Specifically, the MAE of the MPI error under measurement conditions similar to those of a public dataset is reduced to 2.8 mm, which is extremely low compared to previous works.
