Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shyam K Sateesh

Meta-Learning in Audio and Speech Processing: An End to End Comprehensive Review

Aug 19, 2024

Athul Raimon, Shubha Masti, Shyam K Sateesh, Siyani Vengatagiri, Bhaskarjyoti Das

Abstract:This survey overviews various meta-learning approaches used in audio and speech processing scenarios. Meta-learning is used where model performance needs to be maximized with minimum annotated samples, making it suitable for low-sample audio processing. Although the field has made some significant contributions, audio meta-learning still lacks the presence of comprehensive survey papers. We present a systematic review of meta-learning methodologies in audio processing. This includes audio-specific discussions on data augmentation, feature extraction, preprocessing techniques, meta-learners, task selection strategies and also presents important datasets in audio, together with crucial real-world use cases. Through this extensive review, we aim to provide valuable insights and identify future research directions in the intersection of meta-learning and audio processing.

* Survey Paper (15 pages, 1 figure)

Via

Access Paper or Ask Questions

Decoding Human Emotions: Analyzing Multi-Channel EEG Data using LSTM Networks

Aug 19, 2024

Shyam K Sateesh, Sparsh BK, Uma D

Abstract:Emotion recognition from electroencephalogram (EEG) signals is a thriving field, particularly in neuroscience and Human-Computer Interaction (HCI). This study aims to understand and improve the predictive accuracy of emotional state classification through metrics such as valence, arousal, dominance, and likeness by applying a Long Short-Term Memory (LSTM) network to analyze EEG signals. Using a popular dataset of multi-channel EEG recordings known as DEAP, we look towards leveraging LSTM networks' properties to handle temporal dependencies within EEG signal data. This allows for a more comprehensive understanding and classification of emotional parameter states. We obtain accuracies of 89.89%, 90.33%, 90.70%, and 90.54% for arousal, valence, dominance, and likeness, respectively, demonstrating significant improvements in emotion recognition model capabilities. This paper elucidates the methodology and architectural specifics of our LSTM model and provides a benchmark analysis with existing papers.

* 13 pages, 3 figures; accepted at ICDSA '24 Conference, Jaipur, India

Via

Access Paper or Ask Questions