Early breast cancer diagnostics based on hierarchical machine learning classification for mammography images

Darweesh M.S.
Adel M.
Anwar A.
Farag O.
Kotb A.
Adel M.
Tawfik A.
Mostafa H.

Breast cancer poses a significant threat to women’s health and is considered the second leading cause of death among women. It arises from abnormal behavior of normal breast cells, which begin to grow uncontrollably and form a tumor that can be felt as a breast lump. Early diagnosis has been shown to reduce the risk of death by improving the chance of identifying a suitable treatment. Machine learning and artificial intelligence play a key role in healthcare systems by helping physicians diagnose diseases earlier and more accurately and treat them more effectively. To support early detection of breast cancer, this paper proposes a machine learning-based, two-level, top-down hierarchical approach that detects breast cancer and classifies mammograms into three classes (normal, benign, and malignant) using the Mammographic Image Analysis Society (MIAS) mammography dataset. Several data preprocessing techniques are applied before feature extraction and classification. The first classification stage distinguishes normal from abnormal cases, using the Gray Level Co-occurrence Matrix (GLCM) for feature extraction and a random forest as the classifier. The second stage classifies the abnormal cases as benign or malignant, using Local Binary Patterns (LBP) for feature extraction and a random forest as the classifier. The first stage achieves a classification accuracy of 97%, with F1-scores of 0.98 and 0.97 for the normal and abnormal classes, respectively. The second stage achieves a classification accuracy of 75%, with F1-scores of 0.76 and 0.74 for the benign and malignant classes, respectively. The overall hierarchical system achieves a classification accuracy of 85%, a Matthews correlation coefficient (MCC) of 0.76, and F1-scores of 0.98, 0.7, and 0.74 for the normal, benign, and malignant test cases, respectively.
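
The two-level, top-down routing described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function names (`hierarchical_predict`, `toy_stage1`, `toy_stage2`), the threshold rules standing in for the trained random forests, and the six toy cases are all hypothetical, and the toy numbers do not reproduce the reported results. The multiclass MCC formula is the standard generalisation of the binary coefficient; that the paper used this exact form is an assumption.

```python
from collections import Counter
from math import sqrt
from typing import Callable, Sequence

# A stage maps a feature vector to a class label (hypothetical interface).
Stage = Callable[[Sequence[float]], str]


def hierarchical_predict(glcm_features: Sequence[float],
                         lbp_features: Sequence[float],
                         stage1: Stage, stage2: Stage) -> str:
    """Stage 1 decides normal vs. abnormal; only abnormal cases reach
    stage 2, which decides benign vs. malignant."""
    if stage1(glcm_features) == "normal":
        return "normal"
    return stage2(lbp_features)


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of cases whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def multiclass_mcc(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Multiclass Matthews correlation coefficient."""
    s = len(y_true)                                     # number of samples
    c = sum(t == p for t, p in zip(y_true, y_pred))     # correct predictions
    t_k, p_k = Counter(y_true), Counter(y_pred)         # per-class counts
    labels = set(t_k) | set(p_k)
    cov_tp = c * s - sum(t_k[k] * p_k[k] for k in labels)
    cov_tt = s * s - sum(t_k[k] ** 2 for k in labels)
    cov_pp = s * s - sum(p_k[k] ** 2 for k in labels)
    denom = sqrt(cov_tt * cov_pp)
    return cov_tp / denom if denom else 0.0


# Hypothetical stand-ins for the two trained random forests.
def toy_stage1(features: Sequence[float]) -> str:
    return "normal" if features[0] < 0.5 else "abnormal"


def toy_stage2(features: Sequence[float]) -> str:
    return "benign" if features[0] < 0.5 else "malignant"


# Invented cases: (stage-1 features, stage-2 features, true label).
cases = [
    ([0.1], [0.0], "normal"),
    ([0.2], [0.9], "normal"),
    ([0.8], [0.2], "benign"),
    ([0.9], [0.7], "malignant"),
    ([0.7], [0.6], "benign"),      # stage 2 error: predicted malignant
    ([0.3], [0.8], "malignant"),   # stage 1 error: never reaches stage 2
]
y_true = [label for _, _, label in cases]
y_pred = [hierarchical_predict(g, l, toy_stage1, toy_stage2)
          for g, l, _ in cases]

print("accuracy:", accuracy(y_true, y_pred))
print("MCC:", multiclass_mcc(y_true, y_pred))
```

The last toy case shows one consequence of the top-down design: an abnormal case that stage 1 labels normal is never seen by stage 2, so stage 1 errors carry straight through to the overall result.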
© 2021 The Author(s). This open access article is distributed under a Creative Commons Attribution (CC-BY) 4.0 license.