

Pirates at ArabicNLU2024: Enhancing Arabic Word Sense Disambiguation using Transformer-Based Approaches
This paper presents a novel approach to Arabic Word Sense Disambiguation (WSD) that leverages transformer-based models to tackle the complexities of the Arabic language. Using the SALMA dataset, we applied several techniques, including Sentence Transformers with Siamese networks and the SetFit framework optimized for few-shot learning. Our experiments, structured around a robust evaluation framework, achieved a promising F1-score of up to 71%, securing second place in ArabicNLU 2024: The First Arabic Natural Language Understanding Shared Task. These results demonstrate the efficacy of our approach, especially in dealing with the challenges posed by homophones, homographs, and the absence of diacritics in Arabic texts. The proposed methods significantly outperformed traditional WSD techniques, highlighting their potential to improve the accuracy of Arabic natural language processing applications.
©2024 Association for Computational Linguistics.
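The sense-selection idea underlying the embedding-based approaches above can be illustrated with a minimal sketch: embed the target word's context and each candidate sense gloss, then choose the gloss most similar to the context. This toy uses bag-of-words vectors and English stand-ins for an Arabic homograph purely for illustration; the paper's actual systems use Sentence Transformer encodings and the SALMA glosses, and all names here are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words vector; a stand-in for a Sentence Transformer
    # encoding in the paper's actual setup.
    return Counter(text.lower().split())

def cosine(u, v):
    # Cosine similarity between two sparse count vectors.
    dot = sum(u[t] * v[t] for t in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def disambiguate(context, glosses):
    # Score each candidate sense gloss against the context and
    # return the label of the highest-scoring sense.
    return max(glosses, key=lambda s: cosine(embed(context), embed(glosses[s])))

# English stand-in for an ambiguous word with two candidate senses.
glosses = {
    "bank_river": "the sloping land beside a river",
    "bank_money": "an institution that accepts money deposits",
}
print(disambiguate("the bank accepts money deposits from customers", glosses))
# → bank_money
```

A Siamese setup trains the encoder so that correct (context, gloss) pairs score higher than incorrect ones, rather than relying on raw lexical overlap as this sketch does.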