Conference Paper

From Gestures to Audio: A Dataset Building Approach for Egyptian Sign Language Translation to Arabic Speech

By
Ismail Y.
Tarek A.
Zakaria O.
Taha S.
Masoud S.
Sobh K.
Salah A.T.
Khoriba G.

The communication barriers faced by people with disabilities, particularly those who are deaf or hard of hearing, nonverbal, deaf-mute, or blind, have a significant impact on their quality of life and social inclusion. Our research aims to provide real-time translation from sign language to speech and vice versa. The ability to provide real-time speech-to-text and text-to-sign language translation will help alleviate these barriers, improve communication, and increase social inclusivity for this community, ensuring they are not left out of conversations and social interactions. A significant amount of data is required to develop a deep learning model that automatically translates Egyptian sign language to audio. Due to the unavailability of large-scale datasets for Egyptian sign language, we present a dataset-building approach to address this issue. In this paper, we discuss collecting and preprocessing the dataset, which includes scraping videos, extracting audio, cropping the region of each video containing the sign language translator, and applying human- and hand-recognition models to identify the translators and their gestures, respectively. The resulting dataset will be used to train deep learning models that translate Egyptian sign language to audio in an end-to-end fashion. The dataset will be available on our research GitHub. © 2023 IEEE.
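The abstract does not name the tools used for the audio-extraction and cropping steps. As one illustration only (not the authors' actual pipeline), both steps could be expressed as `ffmpeg` invocations; the sketch below merely builds the command lines, assuming `ffmpeg` is available and that the signer's bounding box (`x`, `y`, `w`, `h`) has already been obtained from a person-detection model:

```python
# Hypothetical sketch: build ffmpeg commands for two preprocessing steps
# described in the abstract (audio extraction and cropping the signer region).
# Paths and box coordinates are illustrative placeholders.

def extract_audio_cmd(video_path: str, audio_path: str) -> list[str]:
    # -vn drops the video stream; -ac 1 / -ar 16000 produce 16 kHz mono,
    # a common input format for speech models.
    return ["ffmpeg", "-y", "-i", video_path,
            "-vn", "-ac", "1", "-ar", "16000", audio_path]

def crop_signer_cmd(video_path: str, out_path: str,
                    x: int, y: int, w: int, h: int) -> list[str]:
    # ffmpeg's crop filter takes crop=w:h:x:y, where (x, y) is the
    # top-left corner of the region to keep.
    return ["ffmpeg", "-y", "-i", video_path,
            "-filter:v", f"crop={w}:{h}:{x}:{y}", out_path]

audio_cmd = extract_audio_cmd("episode.mp4", "episode.wav")
crop_cmd = crop_signer_cmd("episode.mp4", "signer.mp4", 1500, 800, 320, 240)
```

Each command list can then be passed to `subprocess.run` to execute the step; keeping commands as lists avoids shell-quoting issues with file names.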