Abstract: Under-resourced languages remain underrepresented in quantitative rhythm research, particularly in systematic intra-branch analysis of acoustic differentiation within closely related linguistic groups. This study investigates acoustic differentiation within the Tani language subgroup by examining speech rhythm in Nyishi and Adi, two under-resourced Tani languages spoken in Arunachal Pradesh, North-East India, using a frequency-domain framework based on amplitude modulation (AM) low-frequency (LF) spectrum analysis, commonly referred to as rhythm formant analysis (RFA). The analysis is designed to identify whether intra-branch differentiation follows a hierarchical pattern across rhythmic and spectral domains. From the LF modulation spectrum, three rhythm formant features were derived: Number of Dominant Peaks (NDP), Mean Frequency of Dominant Peaks (MFDP), and Variance of Dominant Frequencies (VFDP). In addition, Discrete Cosine Transform (DCT) coefficients and Mel Frequency Cepstral Coefficients (MFCCs) were extracted to characterise the spectral modulation structure and broad spectral organisation of the speech signal. Statistical modelling reveals a hierarchical pattern of differentiation in which rhythmic features show consistent but moderate separation, with Nyishi exhibiting higher dominant modulation frequencies and greater dispersion than Adi. Classification experiments further support this hierarchy: rhythm-only features achieved approximately 84-85% classification accuracy, while fusion with MFCC representations improved accuracy to 90.9% using a support vector machine (SVM) and 93.96% using a multilayer perceptron (MLP). These findings demonstrate that rhythmic and spectral features encode complementary levels of linguistic variation, with low-frequency modulation capturing constrained macro-temporal structure and spectral features reflecting finer phonological differentiation.
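
The three rhythm formant features named in this abstract can be illustrated with a minimal Python sketch. It assumes the AM envelope is the magnitude of the Hilbert analytic signal, and that a "dominant" peak is any local maximum of the LF spectrum (below an assumed 10 Hz cutoff) that reaches at least half of the largest magnitude. The function name and the parameters `max_freq` and `rel_threshold` are illustrative choices, not the settings used in the study.

```python
import numpy as np
from scipy.signal import hilbert, find_peaks


def rhythm_formant_features(signal, sample_rate, max_freq=10.0, rel_threshold=0.5):
    """Return NDP, MFDP and VFDP from the LF spectrum of the AM envelope.

    max_freq and rel_threshold are assumed values for illustration only.
    The signal must be long enough to yield spectral bins below max_freq.
    """
    signal = np.asarray(signal, dtype=float)

    # Amplitude envelope: magnitude of the analytic signal.
    envelope = np.abs(hilbert(signal))
    # Remove the mean so the 0 Hz component does not dominate the spectrum.
    envelope = envelope - envelope.mean()

    # Magnitude spectrum of the envelope and its frequency axis in Hz.
    spectrum = np.abs(np.fft.rfft(envelope))
    freqs = np.fft.rfftfreq(envelope.size, d=1.0 / sample_rate)

    # Keep only the low-frequency band (0, max_freq].
    band = (freqs > 0) & (freqs <= max_freq)
    lf_freqs, lf_spec = freqs[band], spectrum[band]

    # Dominant peaks: local maxima above a fraction of the largest magnitude.
    peaks, _ = find_peaks(lf_spec, height=rel_threshold * lf_spec.max())
    dominant = lf_freqs[peaks]

    ndp = int(dominant.size)
    mfdp = float(dominant.mean()) if ndp else float("nan")
    vfdp = float(dominant.var()) if ndp else float("nan")
    return {"NDP": ndp, "MFDP": mfdp, "VFDP": vfdp}
```

For a carrier that is amplitude-modulated at a single rate, the sketch should report one dominant peak at that rate and zero variance; real speech would typically yield several peaks spread across the LF band.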




Abstract: The current work explores long-term speech rhythm variations to classify Mising and Assamese, two low-resourced languages from Assam, Northeast India. We study the temporal information of speech rhythm embedded in low-frequency (LF) spectrograms derived from amplitude modulation (AM) and frequency modulation (FM) envelopes. This quantitative frequency-domain analysis of rhythm builds on rhythm formant analysis (RFA), originally proposed by Gibbon [1]. We extract features derived from the trajectories of the first six rhythm formants, along with two-dimensional discrete cosine transform-based characterizations of the AM and FM LF spectrograms. The derived features are fed as input to a machine learning classifier to contrast the rhythms of Assamese and Mising. In this way, we illustrate an improved methodology for empirically investigating rhythm variation structure, without prior annotation of larger units of the speech signal, for two low-resourced languages of Northeast India.
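
As a rough illustration of the two-dimensional DCT characterization mentioned in this abstract, the sketch below compresses an LF spectrogram (any 2-D array of frames by frequency bins) into a short feature vector by keeping only the low-order coefficients. The function name and the choice of a 4 x 4 coefficient block are assumptions made for the example; the abstract does not state how many coefficients were retained.

```python
import numpy as np
from scipy.fft import dctn


def dct2_features(lf_spectrogram, n_coeffs=4):
    """Keep the low-order block of the 2-D DCT of an LF spectrogram.

    n_coeffs is an assumed, illustrative block size.
    """
    spec = np.asarray(lf_spectrogram, dtype=float)

    # Orthonormal type-II DCT applied along both axes.
    coeffs = dctn(spec, type=2, norm="ortho")

    # The top-left block holds the slowest variations over time and frequency.
    return coeffs[:n_coeffs, :n_coeffs].ravel()
```

Low-order coefficients summarise the smooth, slowly varying shape of the spectrogram, which is why a small block can serve as a compact input to a classifier.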




Abstract: This paper reports a preliminary study on quantitative frequency-domain rhythm cues for classifying five Indian languages: Bengali, Kannada, Malayalam, Marathi, and Tamil. We employ rhythm formant analysis (RFA), a technique introduced by Gibbon that uses low-frequency (LF) spectral analysis of amplitude modulation and frequency modulation envelopes to characterize speech rhythm. Various measures are computed from the LF spectrum, including rhythm formants (R-formants), discrete cosine transform-based measures, and spectral measures. Results show that threshold-based and spectral features outperform directly computed R-formants, and that the temporal pattern of rhythm derived from LF spectrograms provides better language-discriminating cues. Combining all derived features, we achieve an accuracy of 69.21% and a weighted F1 score of 69.18% in classifying the five languages. This study demonstrates the potential of RFA in characterizing speech rhythm for Indian language classification.
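
For readers unfamiliar with the weighted F1 score reported alongside accuracy in this abstract, the following sketch shows one common definition: the per-class F1 scores averaged with weights proportional to each class's number of true samples. The function name is illustrative, and this is not a reconstruction of the study's evaluation code.

```python
import numpy as np


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    total = 0.0
    for label in np.unique(y_true):
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))

        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0

        # Weight each class by its support (number of true samples).
        total += (tp + fn) * f1

    return float(total / y_true.size)
```

Because each class contributes in proportion to its support, the weighted F1 can differ slightly from plain accuracy when per-class precision and recall are unbalanced.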