Optimized Supervised Learning Framework for Gestational Diabetes Mellitus (GDM) Detection based on Recursive Feature Elimination (RFE)
DOI:
https://doi.org/10.11113/mjfas.v21n6.4462
Keywords:
Gestational Diabetes Mellitus, Recursive Feature Elimination, Feature Selection, Random Forest.
Abstract
Supervised machine learning has been widely applied in healthcare, yet few studies emphasize the role of feature selection in enhancing predictive accuracy, and studies that focus specifically on feature selection for GDM detection remain difficult to find. This study designs an optimized supervised learning framework using Recursive Feature Elimination (RFE) to improve early GDM detection and assist medical professionals in patient evaluation. RFE ranks features by iteratively removing the least relevant ones, as scored by one of several estimators, until removal yields no further improvement. The selected features are then used to build several frameworks based on five well-established machine learning models. Results indicate that using Random Forest as both the RFE estimator and the classifier achieves the highest accuracy (97.31%) and F1-score (96.6%). These findings demonstrate RFE's effectiveness in optimizing feature selection, enhancing model accuracy, and reducing computational cost by using only six features. In conclusion, incorporating RFE within supervised learning frameworks significantly improves GDM detection. This research contributes to developing automated diagnostic tools that assist healthcare professionals in evaluating patient health and predicting diseases, ultimately enhancing early diagnosis and improving patient outcomes, especially for pregnant women at risk of GDM.
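The pipeline described in the abstract, with Random Forest serving as both the RFE estimator and the final classifier, can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the data are synthetic (generated with make_classification), and the sample size, the 15 candidate features, the train/test split, and the forest hyperparameters are all assumptions. Only the target of six selected features is taken from the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a clinical dataset (assumption): 15 candidate
# features, of which 6 carry signal. No real GDM data is used here.
X, y = make_classification(
    n_samples=600,
    n_features=15,
    n_informative=6,
    n_redundant=2,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# RFE refits the estimator each round and drops the feature with the
# lowest importance (step=1) until six features remain.
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=6,
    step=1,
)
# Fit the selector on the training split only, to avoid leaking test data.
selector.fit(X_train, y_train)

X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Train the final Random Forest classifier on the six selected features.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train_sel, y_train)
y_pred = clf.predict(X_test_sel)

accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print("Selected feature indices:", selector.get_support(indices=True))
print(f"Accuracy: {accuracy:.4f}  F1-score: {f1:.4f}")
```

The scores printed on synthetic data say nothing about the figures reported in the abstract. Fixing the feature count at six is a simplification; if the stopping point should instead be chosen by cross-validated performance, scikit-learn's RFECV is one possible alternative.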
License
Copyright (c) 2025 Muhammad Amir As’ari, Nur Dalila Mohd Zulkifli

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.