Prediction of Multivariate Air Quality Time Series Data using Long Short-Term Memory Network


  • Mohd Aftar Abu Bakar, Universiti Kebangsaan Malaysia
  • Noratiqah Mohd Ariff, Universiti Kebangsaan Malaysia
  • Mohd Shahrul Mohd Nadzir, Universiti Kebangsaan Malaysia
  • Ong Li Wen, Universiti Kebangsaan Malaysia
  • Fatin Nur Afiqah Suris, Universiti Kebangsaan Malaysia



air quality, Long Short-Term Memory Network (LSTM), Auto-Regressive Integrated Moving Average (ARIMA), forecasting model, multivariate


Malaysia suffers from haze almost every year, so a good air quality forecasting model is needed for monitoring and management purposes. In this study, air quality models based on the Long Short-Term Memory Network (LSTM) and the Auto-Regressive Integrated Moving Average (ARIMA) were developed. Both models were used to predict the concentration of particulate matter 10 micrometres or less in diameter (PM10) in Malaysia, and their performance was compared. The comparison aimed to identify the more suitable model for predicting PM10, which is the dominant pollutant in Malaysia most of the time, especially during haze periods. This study used air quality data obtained from the Department of Environment Malaysia from July 2017 to June 2019. The results showed that the multivariate LSTM model forecast PM10 better than the univariate LSTM model and the univariate ARIMA model, giving the lowest root mean square error (RMSE) at the selected stations. A lower RMSE indicates a more accurate PM10 forecast.
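
As a minimal sketch of the comparison criterion, the RMSE of n forecasts is the square root of the mean squared difference between observed and predicted values. The example below ranks two forecasts by RMSE. The function name `rmse`, the labels "model A" and "model B", and all PM10 values are hypothetical and invented for illustration; they are not data or results from this study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("need at least one observation")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical PM10 concentrations (micrograms per cubic metre),
# invented for illustration only.
observed = [50.0, 60.0, 70.0, 80.0]
forecast_a = [52.0, 58.0, 73.0, 79.0]  # hypothetical "model A"
forecast_b = [45.0, 66.0, 62.0, 88.0]  # hypothetical "model B"

scores = {
    "model A": rmse(observed, forecast_a),
    "model B": rmse(observed, forecast_b),
}

# The forecast with the lowest RMSE is treated as the most accurate.
best = min(scores, key=scores.get)
print(scores, best)
```

Because RMSE squares each error before averaging, a few large misses (for example during a haze episode) raise the score more than many small ones, which suits a setting where peak concentrations matter most.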

