Development of missing data prediction model for carbon monoxide


  • Nurul Latiffah Abd Rani Universiti Sultan Zainal Abidin (UniSZA)
  • Azman Azid Universiti Sultan Zainal Abidin (UniSZA)
  • Muhamad Shirwan Abdullah Sani International Islamic University Malaysia
  • Mohd Saiful Samsudin Universiti Sultan Zainal Abidin (UniSZA)
  • Ku Mohd Kalkausar Ku Yusof Universiti Sultan Zainal Abidin (UniSZA)
  • Siti Noor Syuhada Muhammad Amin Universiti Sultan Zainal Abidin (UniSZA)
  • Saiful Iskandar Khalit Universiti Sultan Zainal Abidin (UniSZA)



Prediction model, carbon monoxide, sensitivity analysis, missing data model


Carbon monoxide (CO) is one of the most important air pollutants because it is one of the parameters used to calculate the Air Pollutant Index (API). It is therefore essential that no CO data are missing during analysis. Missing data can arise for a number of reasons, such as the failure of an instrument to record certain parameters. A CO prediction model is therefore needed to address this problem. A dataset of meteorological and air pollutant values was obtained from the Air Quality Division, Department of Environment Malaysia (DOE). A total of 113,112 datasets were used to develop the model using sensitivity analysis (SA) through an artificial neural network (ANN). SA showed that particulate matter (PM10) and ozone (O3) were the most significant input variables for the missing data prediction model of CO. Three hidden nodes were the optimum number for developing the ANN model, giving an R2 value of 0.5311. Both models, artificial neural network-carbon monoxide-all parameters (ANN-CO-AP) and artificial neural network-carbon monoxide-leave out (ANN-CO-LO), showed high R2 values (0.7639 and 0.5311, respectively) and low RMSE values (0.2482 and 0.3506, respectively). These values indicate that the model could employ only the most significant input variables to represent CO rather than using all input variables.
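The two performance measures quoted in the abstract, R2 and RMSE, and a leave-one-out style of sensitivity analysis can be sketched as follows. This is a minimal illustration, not the authors' implementation. The helper `leave_out_contribution` is hypothetical and reflects one plausible reading of the leave-out approach: the importance of an input is taken as its share of the total drop in R2 when that input is omitted. All numbers and input names (`x1`, `x2`, `x3`) are invented toy values, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def leave_out_contribution(r2_all, r2_leave_out):
    """Percent contribution of each input (hypothetical helper).

    r2_all: R2 of the model trained on all inputs.
    r2_leave_out: dict mapping an input name to the R2 of a model
    trained with that input omitted.
    Assumes at least one input causes a non-zero drop in R2.
    """
    drops = {name: r2_all - r2 for name, r2 in r2_leave_out.items()}
    total = sum(drops.values())
    return {name: 100.0 * drop / total for name, drop in drops.items()}


# Toy values, invented for illustration only (not from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

toy_rmse = rmse(observed, predicted)
toy_r2 = r_squared(observed, predicted)

# Hypothetical R2 values for models that each omit one input.
toy_contrib = leave_out_contribution(0.9, {"x1": 0.5, "x2": 0.8, "x3": 0.9})
ranked = sorted(toy_contrib, key=toy_contrib.get, reverse=True)
```

Under this reading, the inputs at the top of `ranked` would be retained for a reduced model, which would then be compared with the all-parameters model on R2 and RMSE.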

Author Biographies

Nurul Latiffah Abd Rani, Universiti Sultan Zainal Abidin (UniSZA)

Faculty of Bioresources and Food Industry, Universiti Sultan Zainal Abidin (UniSZA)

Azman Azid, Universiti Sultan Zainal Abidin (UniSZA)

Faculty of Bioresources and Food Industry, Universiti Sultan Zainal Abidin (UniSZA)

Muhamad Shirwan Abdullah Sani, International Islamic University Malaysia

International Institute for Halal Research and Training

Mohd Saiful Samsudin, Universiti Sultan Zainal Abidin (UniSZA)

Faculty of Bioresources and Food Industry, Universiti Sultan Zainal Abidin (UniSZA)

Ku Mohd Kalkausar Ku Yusof, Universiti Sultan Zainal Abidin (UniSZA)

Faculty of Bioresources and Food Industry, Universiti Sultan Zainal Abidin (UniSZA)

Siti Noor Syuhada Muhammad Amin, Universiti Sultan Zainal Abidin (UniSZA)

Faculty of Medicine

Saiful Iskandar Khalit, Universiti Sultan Zainal Abidin (UniSZA)

Faculty of Bioresources and Food Industry, Universiti Sultan Zainal Abidin (UniSZA)

