Title: End-to-end Lyrics Alignment for Polyphonic Music Using an Audio-to-Character Recognition Model

Feb 18, 2019

Daniel Stoller, Simon Durand, Sebastian Ewert

Abstract: Time-aligned lyrics can enrich the music listening experience by enabling karaoke, text-based song retrieval and intra-song navigation, and other applications. Compared to text-to-speech alignment, lyrics alignment remains highly challenging, despite many attempts to combine numerous sub-modules including vocal separation and detection in an effort to break down the problem. Furthermore, training required fine-grained annotations to be available in some form. Here, we present a novel system based on a modified Wave-U-Net architecture, which predicts character probabilities directly from raw audio using learnt multi-scale representations of the various signal components. There are no sub-modules whose interdependencies need to be optimized. Our training procedure is designed to work with weak, line-level annotations available in the real world. With a mean alignment error of 0.35 s on a standard dataset, our system outperforms the state of the art by an order of magnitude.

* 5 pages (1 for references), 2 figures, 2 tables. Camera-ready version, accepted at the International Conference on Acoustics, Speech, and Signal Processing 2019 (ICASSP)
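
The abstract reports a mean alignment error in seconds but does not define it. As a rough illustration only, the sketch below assumes the error is the mean absolute deviation between predicted and reference word start times for one song. The function name `mean_alignment_error` and the example timestamps are hypothetical and are not taken from the paper or its code.

```python
from typing import Sequence


def mean_alignment_error(predicted: Sequence[float],
                         reference: Sequence[float]) -> float:
    """Mean absolute deviation (in seconds) between paired start times.

    Assumption: predicted[i] and reference[i] refer to the same word.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one timestamp pair is required")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(reference)


if __name__ == "__main__":
    # Hypothetical word start times in seconds, for illustration only.
    predicted_starts = [1.0, 2.5]
    reference_starts = [1.2, 2.0]
    print(mean_alignment_error(predicted_starts, reference_starts))
```

Under this assumption, a lower value means the predicted word onsets sit closer to the annotated ones; a per-dataset figure would then be an average of such per-song values.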
