Abstract: This work presents EmoAra, an end-to-end emotion-preserving pipeline for cross-lingual spoken communication, motivated by banking customer service where emotional context affects service quality. EmoAra integrates Speech Emotion Recognition, Automatic Speech Recognition, Machine Translation, and Text-to-Speech to process English speech and deliver an Arabic spoken output while retaining emotional nuance. The system uses a CNN-based emotion classifier, Whisper for English transcription, a fine-tuned MarianMT model for English-to-Arabic translation, and MMS-TTS-Ara for Arabic speech synthesis. Experiments report an F1-score of 94% for emotion classification, translation performance of BLEU 56 and BERTScore F1 88.7%, and an average human evaluation score of 81% on banking-domain translations. The implementation and resources are available at the accompanying GitHub repository.
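As a rough illustration of how such a speech-to-speech chain fits together, the sketch below wires up off-the-shelf Hugging Face components. The paper's own CNN emotion classifier and fine-tuned MarianMT checkpoint are not reproduced here, so the public models "superb/wav2vec2-base-superb-er" and "Helsinki-NLP/opus-mt-en-ar" are stand-ins for illustration only, and the detected emotion is merely carried along as metadata rather than conditioning the synthesis.

```python
# Minimal sketch of an EmoAra-style pipeline using off-the-shelf models.
# NOTE: the emotion and translation checkpoints below are public stand-ins,
# not the models trained in the paper.
from transformers import pipeline
import soundfile as sf

emotion_clf = pipeline("audio-classification", model="superb/wav2vec2-base-superb-er")
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-ar")
tts = pipeline("text-to-speech", model="facebook/mms-tts-ara")

def english_speech_to_arabic_speech(wav_path: str, out_path: str = "arabic_out.wav") -> dict:
    """Detect emotion, transcribe English speech, translate to Arabic, synthesize Arabic audio."""
    emotion = emotion_clf(wav_path)[0]["label"]                    # e.g. "ang", "hap", "neu", "sad"
    english_text = asr(wav_path)["text"]                           # Whisper transcription
    arabic_text = translator(english_text)[0]["translation_text"]  # MarianMT en->ar
    speech = tts(arabic_text)                                      # {"audio": array, "sampling_rate": int}
    sf.write(out_path, speech["audio"].squeeze(), speech["sampling_rate"])
    return {"emotion": emotion, "english": english_text, "arabic": arabic_text, "audio": out_path}
```

In the system described above, the detected emotion would additionally steer the synthesis stage so the Arabic output preserves the speaker's affect; this sketch stops at propagating the label.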
Abstract: This paper addresses the challenge of improving interaction quality in dialogue-based learning by detecting and recommending effective pedagogical strategies in tutor-student conversations. We introduce PedagoSense, a pedagogy-grounded system that combines a two-stage strategy classifier with large language model (LLM) generation. The system first detects whether a pedagogical strategy is present using a binary classifier, then performs fine-grained classification to identify the specific strategy. In parallel, it recommends an appropriate strategy from the dialogue context and uses an LLM to generate a response aligned with that strategy. We evaluate on human-annotated tutor-student dialogues, augmented with additional non-pedagogical conversations for the binary task. Results show high performance for pedagogical strategy detection and consistent gains from data augmentation, while analysis highlights where fine-grained classes remain challenging. Overall, PedagoSense bridges pedagogical theory and practical LLM-based response generation for more adaptive educational technologies.
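For concreteness, a minimal sketch of such a detect / classify / recommend / generate loop is given below. The checkpoint names "pedagosense-binary" and "pedagosense-finegrained", the label strings, and the prompt template are hypothetical placeholders rather than artifacts released with the paper, and any instruction-tuned LLM can back the generation step.

```python
# Sketch of a PedagoSense-style two-stage pipeline (hypothetical checkpoint names).
from transformers import pipeline

binary_clf = pipeline("text-classification", model="pedagosense-binary")         # strategy present?
strategy_clf = pipeline("text-classification", model="pedagosense-finegrained")  # which strategy?
llm = pipeline("text-generation", model="meta-llama/Meta-Llama-3-8B-Instruct")

def analyse_and_respond(dialogue_context: str, tutor_turn: str) -> dict:
    # Stage 1: does the tutor turn contain a pedagogical strategy at all?
    out = {"has_strategy": binary_clf(tutor_turn)[0]["label"] == "PEDAGOGICAL"}
    if out["has_strategy"]:
        # Stage 2: fine-grained classification of the detected strategy.
        out["strategy"] = strategy_clf(tutor_turn)[0]["label"]
    # Recommend a strategy from the dialogue context and generate an aligned response.
    out["recommended"] = strategy_clf(dialogue_context)[0]["label"]
    prompt = (f"Dialogue so far:\n{dialogue_context}\n\n"
              f"Write the next tutor turn using the '{out['recommended']}' strategy.")
    out["response"] = llm(prompt, max_new_tokens=128, return_full_text=False)[0]["generated_text"]
    return out
```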
Abstract: Parameter-efficient fine-tuning is a way to adapt LLMs to new languages when compute or data are limited, yet adapter pipelines usually choose a global prune ratio by grid search. This practice is computationally expensive and development-set intensive, since it repeats training, freezes sparsity, and misses fractional optima. We introduce GRASP LoRA (GRPO Guided Adapter Sparsity Policy), which treats global sparsity as a learnable control variable. A GRPO controller interleaves with training, periodically probing candidate prune ratios on a small micro-development set and updating a single global prune ratio online from its reward signal. It operates on merged source and target LoRA adapters on a frozen backbone and replaces grid search with one controller run that learns a prune ratio, followed by a single final merge-and-prune fine-tuning run with pruning fixed to that ratio. On cross-lingual transfer from English into Arabic and Chinese, including XL-Sum summarization and MLQA extractive question answering with Llama 3 8B, GRASP LoRA improves semantic faithfulness, content coverage, and answer quality over strong target-only and merge-and-prune baselines. It reduces end-to-end runtime severalfold relative to grid search, lowers reliance on large development sets, and makes adapter reuse practical for low-resource deployment.
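The core mechanics can be sketched compactly: globally magnitude-prune the merged LoRA deltas at a ratio p, then let a controller probe a small group of candidate ratios on a micro-development set and move p toward the high-reward candidates. The reward function, candidate spacing, and the simple advantage-weighted update below are illustrative assumptions standing in for the full GRPO controller, not the paper's exact procedure.

```python
# Illustrative sketch: global magnitude pruning of merged LoRA deltas plus a
# simplified group-relative controller step (assumed stand-in for GRPO).
import torch

def global_magnitude_prune(lora_deltas: dict, ratio: float) -> dict:
    """Zero the smallest-magnitude fraction `ratio` of all merged LoRA delta weights (one global threshold)."""
    mags = torch.cat([d.abs().flatten() for d in lora_deltas.values()])
    k = int(ratio * mags.numel())
    if k == 0:
        return dict(lora_deltas)
    threshold = torch.kthvalue(mags, k).values
    return {name: d * (d.abs() > threshold) for name, d in lora_deltas.items()}

def controller_step(prune_ratio, lora_deltas, reward_fn, group_size=4, step=0.05, lr=0.5):
    """Probe candidate ratios around the current one and move toward those with above-average reward."""
    candidates = [min(max(prune_ratio + step * (i - group_size // 2), 0.0), 0.95)
                  for i in range(group_size + 1)]
    rewards = torch.tensor([reward_fn(global_magnitude_prune(lora_deltas, r)) for r in candidates])
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)  # group-relative advantages (GRPO-style)
    positive = [(r, a.item()) for r, a in zip(candidates, advantages) if a.item() > 0]
    if not positive:
        return prune_ratio
    target = sum(r * a for r, a in positive) / sum(a for _, a in positive)
    return prune_ratio + lr * (target - prune_ratio)
```

In a full run, controller_step would be interleaved with adapter fine-tuning on the frozen backbone, and the learned ratio would then be frozen for the single final merge-and-prune run.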