Performance Measurement: Machine Learning as a Complement to DEA for Continuous Efficiency Estimation


  • Yousef Khoubrane, EMINES - School of Industrial Management, University Mohammed VI Polytechnic, Hay Moulay Rachid Ben Guerir, 43150, Morocco
  • Noor Asiah Ramli, School of Mathematical Sciences, College of Computing, Informatics and Mathematics, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia
  • Siti Shaliza Mohd Khairi, School of Mathematical Sciences, College of Computing, Informatics and Mathematics, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia



Data Envelopment Analysis (DEA), Machine Learning (ML), Performance Measurement


Data Envelopment Analysis (DEA) is a well-established non-parametric technique for assessing the efficiency of Decision-Making Units (DMUs). However, DEA cannot predict the efficiency of a new DMU without re-running the analysis on the entire dataset, a limitation that previous studies have addressed by integrating Machine Learning (ML). Yet such integrations often lack a thorough evaluation of how well ML can stand in for the current DEA process. This paper presents the results of an empirical study that employed eight ML models, two DEA variants, and a dataset of S&P500 companies. The findings show that ML predicts efficiency values derived from a single DEA run with remarkable precision and performs comparably when predicting the efficiency of new DMUs, which removes the need to repeat DEA. These results highlight ML’s robustness as a complementary tool for DEA in continuous efficiency estimation. Notably, boosting models within the Ensemble Learning category consistently outperformed the other models. In particular, CatBoost was the top-performing model, followed in most cases by LightGBM. When the study was extended to five enlarged datasets, the model exhibited superior R² values in the constant returns to scale (CRS) scenario.
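
The limitation described above, that a new DMU cannot be scored without re-running DEA, can be illustrated with a minimal sketch. It assumes the simplest possible setting: one input and one output per DMU under CRS, where the efficiency score reduces to each DMU's output/input ratio divided by the best ratio in the dataset. The function name `crs_efficiency` and all data values are hypothetical and chosen for illustration only; they are not taken from the study, which uses S&P500 data and dedicated DEA software.

```python
def crs_efficiency(inputs, outputs):
    """CRS efficiency for single-input, single-output DMUs.

    In this special case the DEA score of a DMU equals its
    output/input ratio divided by the best ratio in the dataset.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


# Hypothetical DMUs (illustrative values, not from the paper).
inputs = [2.0, 4.0, 5.0]
outputs = [2.0, 3.0, 2.5]

scores = crs_efficiency(inputs, outputs)
print(scores)  # [1.0, 0.75, 0.5]

# A new DMU arrives with a better output/input ratio than any existing one.
# Because scores are relative to the frontier, every DMU must be re-scored.
rescored = crs_efficiency(inputs + [2.0], outputs + [4.0])
print(rescored)  # [0.5, 0.375, 0.25, 1.0]
```

In this toy case the first DMU drops from 1.0 to 0.5 once the new DMU defines the frontier, which is why DEA scores are normally recomputed over the whole dataset. With several inputs and outputs the same idea requires solving one linear program per DMU rather than taking a ratio.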

