

Arabic English Speech Emotion Recognition System
Speech Emotion Recognition (SER) systems identify a speaker's emotions from speech, which is important for human-machine interface applications and for the emerging Metaverse. This work presents a bilingual Arabic-English speech emotion recognition system based on the EYASE and RAVDESS datasets. A novel feature set was composed from spectral and prosodic parameters to obtain high performance at a low computational cost. Five machine learning classifiers were applied: Random Forest, Support Vector Machine (SVM), Logistic Regression, Multi-Layer Perceptron, and Ensemble learning. The performance of the proposed feature set was compared to that of the "Interspeech 2009" challenge feature set, which is considered a benchmark in the field. Promising results were obtained using the proposed feature set. SVM achieved the best emotion recognition rate and execution performance. The best accuracies achieved were 85% on RAVDESS and 64% on EYASE. Ensemble learning recognized emotion valence with accuracies of 90% on RAVDESS and 87.6% on EYASE. © 2023 IEEE.
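
The abstract does not say how the Ensemble learning classifier combines its base models. As an illustration only, the sketch below assumes hard (majority) voting over the labels predicted by several classifiers and then scores the result. The function names, the tie-breaking rule, and the toy predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(per_classifier_labels):
    """Combine label sequences from several classifiers by hard voting.

    Ties go to the label seen first, i.e. the earliest classifier in the list.
    """
    if not per_classifier_labels:
        raise ValueError("need at least one classifier's predictions")
    n = len(per_classifier_labels[0])
    if any(len(labels) != n for labels in per_classifier_labels):
        raise ValueError("all classifiers must label the same utterances")
    # zip(*...) groups the votes cast for each utterance.
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*per_classifier_labels)
    ]


def accuracy(predicted, reference):
    """Fraction of utterances whose predicted label matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot score an empty set of utterances")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Hypothetical toy predictions for four utterances (not results from the paper).
svm_pred = ["angry", "happy", "sad", "neutral"]
rf_pred = ["angry", "sad", "sad", "happy"]
mlp_pred = ["happy", "happy", "sad", "neutral"]
truth = ["angry", "happy", "sad", "happy"]

ensemble_pred = majority_vote([svm_pred, rf_pred, mlp_pred])
print(ensemble_pred)
print(accuracy(ensemble_pred, truth))
```

Hard voting needs only the predicted labels, so it works with any mix of base classifiers. A soft-voting variant would average class probabilities instead, and the abstract gives no indication of which scheme, if either, was used.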