Jiri Mekyska

Layer-Aware Early Fusion of Acoustic and Linguistic Embeddings for Cognitive Status Classification

Jan 30, 2026

Krystof Novotny, Laureano Moro-Velázquez, Jiri Mekyska

Abstract: Speech contains both acoustic and linguistic patterns that reflect cognitive decline, and therefore models describing only one domain cannot fully capture such complexity. This study investigates how early fusion (EF) of speech and its corresponding transcription text embeddings, with attention to encoder layer depth, can improve cognitive status classification. Using a DementiaBank-derived collection of recordings (1,629 speakers; cognitively normal controls (CN), Mild Cognitive Impairment (MCI), and Alzheimer's Disease and Related Dementias (ADRD)), we extracted frame-aligned embeddings from different internal layers of wav2vec 2.0 or Whisper combined with DistilBERT or RoBERTa. Unimodal, EF and late fusion (LF) models were trained with a transformer classifier, optimized, and then evaluated across 10 seeds. Performance consistently peaked in mid encoder layers (approximately 8 to 10), with the single best F1 at Whisper + RoBERTa layer 9 and the best log loss at Whisper + DistilBERT layer 10. Acoustic-only models consistently outperformed text-only variants. EF boosts discrimination for genuinely acoustic embeddings, whereas LF improves probability calibration. Layer choice critically shapes clinical multimodal synergy.

* 5 pages, 3 figures, paper accepted for ICASSP 2026 conference
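
The difference between early and late fusion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the fusion was implemented, so the following assumes the common forms: early fusion as feature-wise concatenation of frame-aligned embeddings, and late fusion as a weighted average of class probabilities from two unimodal classifiers. All function names, shapes, and values are hypothetical.

```python
import numpy as np

def early_fusion(acoustic, text):
    """Concatenate frame-aligned embeddings along the feature axis (assumed EF form)."""
    if acoustic.shape[0] != text.shape[0]:
        raise ValueError("embeddings must be frame-aligned (same number of frames)")
    return np.concatenate([acoustic, text], axis=1)

def late_fusion(prob_acoustic, prob_text, weight=0.5):
    """Weighted average of class probabilities from two unimodal models (assumed LF form)."""
    fused = weight * prob_acoustic + (1.0 - weight) * prob_text
    return fused / fused.sum(axis=-1, keepdims=True)  # renormalise to sum to 1

rng = np.random.default_rng(0)
acoustic = rng.normal(size=(50, 768))  # hypothetical: 50 frames of one acoustic encoder layer
text = rng.normal(size=(50, 768))      # hypothetical: text embeddings aligned to the same frames
fused = early_fusion(acoustic, text)   # one joint sequence for a single classifier

p_acoustic = np.array([0.6, 0.3, 0.1])  # hypothetical CN / MCI / ADRD probabilities
p_text = np.array([0.2, 0.5, 0.3])
p_fused = late_fusion(p_acoustic, p_text)
```

In this sketch the early-fused sequence would be passed to one classifier, while late fusion only combines the outputs of two separately trained classifiers.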

Assessment of Developmental Dysgraphia Utilising a Display Tablet

Oct 23, 2024

Jiri Mekyska, Zoltan Galaz, Katarina Safarova, Vojtech Zvoncak, Lukas Cunek, Tomas Urbanek, Jana Marie Havigerova, Jirina Bednarova, Jan Mucha, Michal Gavenciak(+2 more)

Abstract: Even though the computerised assessment of developmental dysgraphia (DD) based on online handwriting processing is increasingly popular, most solutions rely on a setup in which a child writes on a paper fixed to a digitizing tablet that is connected to a computer. Although this approach enables the standard way of writing using an inking pen, it is difficult for children to administer by themselves. The main goal of this study is thus to explore whether the quantitative analysis of online handwriting recorded via a display screen tablet could sufficiently support the assessment of DD as well. For the purpose of this study, we enrolled 144 children (attending the 3rd and 4th class of a primary school), whose handwriting proficiency was assessed by a special education counsellor, and who assessed themselves by the Handwriting Proficiency Screening Questionnaires for Children (HPSQ-C). Using machine learning models based on a gradient-boosting algorithm, we were able to support the DD diagnosis with up to 83.6% accuracy. The HPSQ-C total score was estimated with a minimum error equal to 10.34%. Children with DD spent significantly more time in-air, and they had a higher number of pen elevations, a bigger height of on-surface strokes, a lower in-air tempo, and a higher variation in the angular velocity. Although this study shows a promising impact of DD assessment via display tablets, it also highlights the fact that modelling of subjective scores is challenging and that a complex and data-driven quantification of DD manifestations is needed.

* IGS 2023. Lecture Notes in Computer Science, vol 14285, pp.21-35
* 16 pages
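
Two of the features named in the abstract, time spent in-air and number of pen elevations, can be illustrated with a minimal sketch. It assumes a recording given as sample timestamps plus a per-sample on-surface flag, and it attributes each inter-sample interval to the pen state at the start of the interval. The function name and the sample data are hypothetical and do not reproduce the authors' feature extraction.

```python
import numpy as np

def in_air_stats(t, on_surface):
    """Return (total in-air time, number of pen elevations) for one recording."""
    t = np.asarray(t, dtype=float)
    on_surface = np.asarray(on_surface, dtype=bool)
    dt = np.diff(t)  # duration of each inter-sample interval
    # An interval counts as in-air when the pen is off the surface at its start.
    in_air_time = float(dt[~on_surface[:-1]].sum())
    # A pen elevation is a transition from on-surface to in-air.
    elevations = int(np.sum(on_surface[:-1] & ~on_surface[1:]))
    return in_air_time, elevations

# Hypothetical recording sampled every 0.1 s.
t = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
pen_down = [True, True, False, False, True, False]
in_air_time, elevations = in_air_stats(t, pen_down)
```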

Graphomotor and Handwriting Disabilities Rating Scale (GHDRS): towards complex and objective assessment

May 28, 2024

Jiri Mekyska, Katarina Safarova, Tomas Urbanek, Jirina Bednarova, Vojtech Zvoncak, Jana Marie Havigerova, Lukas Cunek, Zoltan Galaz, Jan Mucha, Christine Klauszova(+3 more)

Abstract: Graphomotor and handwriting disabilities (GD and HD, respectively) could significantly reduce children's quality of life. Effective remediation depends on proper diagnosis; however, current approaches to diagnosis and assessment of GD and HD have several limitations and knowledge gaps, e.g. they are subjective and they do not facilitate identification of specific manifestations. The aim of this work is to introduce a new scale (GHDRS, the Graphomotor and Handwriting Disabilities Rating Scale) that will enable experts to perform objective and complex computer-aided diagnosis and assessment of GD and HD. The scale supports quantification of 17 manifestations associated with the process/product of drawing/handwriting. The whole methodology of GHDRS design is made maximally transparent so that it could be adapted for other languages.

* Australian Journal of Learning Difficulties, Routledge, 1-34, 2024

Deep Learning for In-Orbit Cloud Segmentation and Classification in Hyperspectral Satellite Data

Mar 13, 2024

Daniel Kovac, Jan Mucha, Jon Alvarez Justo, Jiri Mekyska, Zoltan Galaz, Krystof Novotny, Radoslav Pitonak, Jan Knezik, Jonas Herec, Tor Arne Johansen

Abstract: This article explores the latest Convolutional Neural Networks (CNNs) for cloud detection aboard hyperspectral satellites. The performance of the latest 1D CNN (1D-Justo-LiuNet) and two recent 2D CNNs (nnU-net and 2D-Justo-UNet-Simple) for cloud segmentation and classification is assessed. Evaluation criteria include precision and computational efficiency for in-orbit deployment. Experiments utilize NASA's EO-1 Hyperion data, with varying spectral channel numbers after Principal Component Analysis. Results indicate that 1D-Justo-LiuNet achieves the highest accuracy, outperforming 2D CNNs, while maintaining compactness with larger spectral channel sets, albeit with increased inference times. However, the performance of 1D CNN degrades with significant channel reduction. In this context, the 2D-Justo-UNet-Simple offers the best balance for in-orbit deployment, considering precision, memory, and time costs. While nnU-net is suitable for on-ground processing, deployment of lightweight 1D-Justo-LiuNet is recommended for high-precision applications. Alternatively, lightweight 2D-Justo-UNet-Simple is recommended for balanced costs between timing and precision in orbit.

* Hyperspectral Satellite Data, Cloud Segmentation, Classification, Convolutional Neural Networks, Principal Component Analysis
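
The spectral channel reduction by Principal Component Analysis mentioned in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the hyperspectral image is a height x width x channels array, every pixel spectrum is treated as one sample, and PCA is computed with a plain SVD. The function name, cube size, and number of retained components are hypothetical and are not taken from the paper.

```python
import numpy as np

def pca_reduce_channels(cube, n_components):
    """Project each pixel spectrum onto the top principal components."""
    h, w, c = cube.shape
    pixels = cube.reshape(-1, c).astype(float)   # one row per pixel spectrum
    centered = pixels - pixels.mean(axis=0)      # remove the mean spectrum
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T
    return reduced.reshape(h, w, n_components)

rng = np.random.default_rng(0)
cube = rng.normal(size=(8, 8, 32))          # hypothetical 8 x 8 image with 32 spectral channels
reduced_cube = pca_reduce_channels(cube, 4) # same image with 4 PCA channels
component_var = reduced_cube.reshape(-1, 4).var(axis=0)
```

The reduced cube keeps the spatial layout, so it could feed either a per-pixel (1D) or a patch-based (2D) network.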

Prodromal Diagnosis of Lewy Body Diseases Based on the Assessment of Graphomotor and Handwriting Difficulties

Jan 20, 2023

Zoltan Galaz, Jiri Mekyska, Jan Mucha, Vojtech Zvoncak, Zdenek Smekal, Marcos Faundez-Zanuy, Lubos Brabenec, Ivona Moravkova, Irena Rektorova

Abstract: To date, studies focusing on the prodromal diagnosis of Lewy body diseases (LBDs) based on quantitative analysis of graphomotor and handwriting difficulties are missing. In this work, we enrolled 18 subjects diagnosed with possible or probable mild cognitive impairment with Lewy bodies (MCI-LB), 7 subjects having more than 50% probability of developing Parkinson's disease (PD), 21 subjects with both possible/probable MCI-LB and probability of PD > 50%, and 37 age- and gender-matched healthy controls (HC). Each participant performed three tasks: Archimedean spiral drawing (to quantify graphomotor difficulties), sentence writing task (to quantify handwriting difficulties), and pentagon copying test (to quantify cognitive decline). Next, we parameterized the acquired data by various temporal, kinematic, dynamic, spatial, and task-specific features. Finally, we trained classification models for each task separately as well as a model for their combination to estimate the predictive power of the features for the identification of LBDs. Using this approach, we were able to identify prodromal LBDs with 74% accuracy and showed the promising potential of computerized objective and non-invasive diagnosis of LBDs based on the assessment of graphomotor and handwriting difficulties.

* In: Carmona-Duarte, C., Diaz, M., Ferrer, M.A., Morales, A. (eds) Intertwining Graphonomics with Human Movements. IGS 2022. Lecture Notes in Computer Science, vol 13424. Springer, Cham
* Print ISBN 978-3-031-19744-4

Exploration of Various Fractional Order Derivatives in Parkinson's Disease Dysgraphia Analysis

Jan 20, 2023

Jan Mucha, Zoltan Galaz, Jiri Mekyska, Marcos Faundez-Zanuy, Vojtech Zvoncak, Zdenek Smekal, Lubos Brabenec, Irena Rektorova

Abstract: Parkinson's disease (PD) is a common neurodegenerative disorder with a prevalence rate estimated at 2.0% for people aged over 65 years. Cardinal motor symptoms of PD such as rigidity and bradykinesia affect the muscles involved in the handwriting process, resulting in handwriting abnormalities called PD dysgraphia. Nowadays, the online handwritten signal (a signal with temporal information) acquired by digitizing tablets is the most advanced approach to the analysis of graphomotor difficulties. Although the basic kinematic features have been shown to effectively quantify the symptoms of PD dysgraphia, recent research identified that the theory of fractional calculus can be used to improve the analysis of graphomotor difficulties. Therefore, in this study, we follow up on our previous research and aim to explore the utilization of various approaches of fractional order derivative (FD) in the analysis of PD dysgraphia. For this purpose, we used the repetitive loops task from the Parkinson's disease handwriting database (PaHaW). Handwritten signals were parametrized by the kinematic features employing three FD approximations: Grünwald-Letnikov's, Riemann-Liouville's, and Caputo's. Results of the correlation analysis revealed a significant relationship between the clinical state and the handwriting features based on the velocity. The features extracted by Caputo's FD approximation outperformed the rest of the analyzed FD approaches. This was also confirmed by the results of the classification analysis, where the best model trained on Caputo's handwriting features resulted in a balanced accuracy of 79.73% with a sensitivity of 83.78% and a specificity of 75.68%.
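
As an illustration of one of the three approximations named in the abstract, the sketch below implements the textbook Grünwald-Letnikov finite-difference form, D^alpha f(t_i) ≈ h^(-alpha) * sum_k w_k f(t_i - k h), with weights w_0 = 1 and w_k = w_(k-1) * (1 - (alpha + 1) / k). It is a minimal sketch of the general technique, not the authors' implementation; the function name, the uniform sampling step `h`, and the sample signal are assumptions.

```python
import numpy as np

def gl_fractional_derivative(signal, alpha, h=1.0):
    """Grünwald-Letnikov fractional derivative of order alpha on a uniform grid."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    # Weights w_k = (-1)^k * binom(alpha, k), built with the standard recurrence.
    w = np.ones(n)
    for k in range(1, n):
        w[k] = w[k - 1] * (1.0 - (alpha + 1.0) / k)
    out = np.empty(n)
    for i in range(n):
        # Weighted sum over the current sample and all earlier samples.
        out[i] = np.dot(w[: i + 1], signal[i::-1]) / h**alpha
    return out

# Hypothetical pen-position samples along one axis, 0.5 s apart.
position = np.array([0.0, 0.5, 1.0, 1.5])
velocity_like = gl_fractional_derivative(position, alpha=1.0, h=0.5)
half_order = gl_fractional_derivative(position, alpha=0.5, h=0.5)
```

With `alpha = 1` the weights reduce to a backward difference, and with `alpha = 0` the signal is returned unchanged; orders in between interpolate between the two.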

Preliminary experiments on thermal emissivity adjustment for face images

Mar 30, 2022

Marcos Faundez-Zanuy, Xavier Font Aragones, Jiri Mekyska

Abstract: In this paper we summarize several applications based on thermal imaging. We emphasize the importance of emissivity adjustment for a proper temperature measurement. A new set of face images acquired at different emissivity values with steps of 0.01 is also presented and will be distributed for free for research purposes. Among the utilities, we can mention: a) the possibility to apply corrections once an image is acquired with a wrong emissivity value and it is not possible to acquire a new one; b) privacy protection in thermal images, which can be obtained with a low emissivity factor, which is still suitable for several applications, but hides the identity of a user; c) image processing for improving temperature detection in scenes containing objects of different emissivity.

* in Esposito, A., Faundez-Zanuy, M., Morabito, F., Pasero, E. (eds) Progresses in Artificial Intelligence and Neural Systems. Smart Innovation, Systems and Technologies, vol 184. Springer, Singapore 2021
* 8 pages

Contribution of the Temperature of the Objects to the Problem of Thermal Imaging Focusing

Mar 30, 2022

Virginia Espinosa-Duró, Marcos Faundez-Zanuy, Jiri Mekyska

Abstract: When focusing an image, depth of field, aperture and distance from the camera to the object must be taken into account, in both the visible and the infrared spectrum. Our experiments reveal that, in addition, the focusing problem in the thermal spectrum also depends strongly on the temperature of the object itself (and/or the scene).

* 2012 IEEE International Carnahan Conference on Security Technology (ICCST), 2012, pp. 363-366
* 5 pages, conference held 15-18 Oct. 2012, Boston (MA), USA. arXiv admin note: text overlap with arXiv:2203.08513

A Naturalistic Database of Thermal Emotional Facial Expressions and Effects of Induced Emotions on Memory

Mar 29, 2022

Anna Esposito, Vincenzo Capuano, Jiri Mekyska, Marcos Faundez-Zanuy

Abstract: This work defines a procedure for collecting naturally induced emotional facial expressions through the viewing of movie excerpts with high emotional content and reports experimental data ascertaining the effects of emotions on memory word recognition tasks. The induced emotional states include the four basic emotions of sadness, disgust, happiness, and surprise, as well as the neutral emotional state. The resulting database contains both thermal and visible emotional facial expressions, portrayed by forty Italian subjects and simultaneously acquired by appropriately synchronizing a thermal and a standard visible camera. Each subject's recording session lasted 45 minutes, allowing for each mode (thermal or visible) to collect a minimum of 2000 facial expressions from which a minimum of 400 were selected as highly expressive of each emotion category. The database is available to the scientific community and can be obtained by contacting one of the authors. For this pilot study, it was found that emotions and/or emotion categories do not affect individual performance on memory word recognition tasks and that temperature changes in the face or in some regions of it do not discriminate among emotional states.

* 2012 Cognitive Behavioural Systems. Lecture Notes in Computer Science, vol 7403. Springer, Berlin, Heidelberg
* 15 pages, published in Esposito, A., Esposito, A.M., Vinciarelli, A., Hoffmann, R., Müller, V.C. (eds) Cognitive Behavioural Systems. Lecture Notes in Computer Science, vol 7403. Springer, Berlin, Heidelberg

Face segmentation: A comparison between visible and thermal images

Mar 29, 2022

Jiri Mekyska, Virginia Espinosa-Duró, Marcos Faundez-Zanuy

Abstract: Face segmentation is a first step for face biometric systems. In this paper we present a face segmentation algorithm for thermographic images. This algorithm is compared with the classic Viola and Jones algorithm used for visible images. Experimental results reveal that, when segmenting a multispectral (visible and thermal) face database, the proposed algorithm is more than 10 times faster, while the accuracy of face segmentation in thermal images is higher than in the case of Viola-Jones.

* 44th Annual 2010 IEEE International Carnahan Conference on Security Technology, 2010, pp. 185-189
* 5 pages, conference held 5-8 Oct. 2010, San Jose (California, USA)
