Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hayato Nishioka

Profy: Interpretable Visualization of Expertise-Dependent Motor Skills Toward Supporting Piano Practice

Jun 09, 2026

Kazuki Kawamura, Fujiki Nakamura, Hayato Nishioka, Momoko Shioki, Shinichi Furuya, Jun Rekimoto

Abstract:The quality of piano performance depends on nuanced timing, articulation, and dynamic control, but practice feedback is often summary-based and hard to act on. We introduce Profy, a weakly supervised system that learns from take-level labels derived from aggregated listener ratings (expert-labeled vs. amateur-labeled) to produce time-aligned highlights for review during piano practice. We collected synchronized 1 kHz key-motion and audio from 73 pianists and used 1,083 valid takes for modeling and evaluation. The model outputs clip-level predictions together with evidence scores on a shared resampled model time base for visualization. On 20 amateur clips from short technique studies annotated by 21 expert pianists, the displayed highlight score aligns with passages that expert pianists marked for review despite training without localized labels (Pearson r=0.61, ROC-AUC 0.75). Rather than summarizing a take with a single global score, Profy helps learners decide where to inspect next by supporting scrubbing, looping, and focused replay of time-localized passages associated with expert-amateur differences.

* Designing Interactive Systems Conference (DIS '26), June 13-17, 2026, Singapore, Singapore

Via

Access Paper or Ask Questions

How does the teacher rate? Observations from the NeuroPiano dataset

Oct 04, 2024

Huan Zhang, Vincent Cheung, Hayato Nishioka, Simon Dixon, Shinichi Furuya

Abstract:This paper provides a detailed analysis of the NeuroPiano dataset, which comprise 104 audio recordings of student piano performances accompanied with 2255 textual feedback and ratings given by professional pianists. We offer a statistical overview of the dataset, focusing on the standardization of annotations and inter-annotator agreement across 12 evaluative questions concerning performance quality. We also explore the predictive relationship between audio features and teacher ratings via machine learning, as well as annotations provided for text analysis of the responses.

Via

Access Paper or Ask Questions

LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment

Sep 16, 2024

Huan Zhang, Vincent Cheung, Hayato Nishioka, Simon Dixon, Shinichi Furuya

Figure 1 for LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment

Figure 2 for LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment

Figure 3 for LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment

Figure 4 for LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment

Abstract:Research in music understanding has extensively explored composition-level attributes such as key, genre, and instrumentation through advanced representations, leading to cross-modal applications using large language models. However, aspects of musical performance such as stylistic expression and technique remain underexplored, along with the potential of using large language models to enhance educational outcomes with customized feedback. To bridge this gap, we introduce LLaQo, a Large Language Query-based music coach that leverages audio language modeling to provide detailed and formative assessments of music performances. We also introduce instruction-tuned query-response datasets that cover a variety of performance dimensions from pitch accuracy to articulation, as well as contextual performance understanding (such as difficulty and performance techniques). Utilizing AudioMAE encoder and Vicuna-7b LLM backend, our model achieved state-of-the-art (SOTA) results in predicting teachers' performance ratings, as well as in identifying piece difficulty and playing techniques. Textual responses from LLaQo was moreover rated significantly higher compared to other baseline models in a user study using audio-text matching. Our proposed model can thus provide informative answers to open-ended questions related to musical performance from audio data.

Via

Access Paper or Ask Questions