Enhancing Paddy Production in Malaysia: A Comparative Analysis of Multiple Regression with External Factors

Authors

  • Ahmad Syakir Mohd Shafri, Department of Mathematical Sciences, Faculty of Science, Universiti Teknologi Malaysia, 81310 UTM Johor Bahru, Johor, Malaysia
  • Siti Rohani Mohd Nor, Department of Mathematical Sciences, Faculty of Science, Universiti Teknologi Malaysia, 81310 UTM Johor Bahru, Johor, Malaysia
  • Siti Mariam Norrulashikin, Department of Mathematical Sciences, Faculty of Science, Universiti Teknologi Malaysia, 81310 UTM Johor Bahru, Johor, Malaysia

DOI:

https://doi.org/10.11113/mjfas.v21n1.3991

Keywords:

Forecasting, MARS, MLR, paddy production, SVR.

Abstract

Rice is a staple food consumed by almost half of the world's population, and it is especially important in Malaysia. Unfortunately, paddy productivity has recently declined sharply, which has hampered local rice supply and forced Malaysia to depend on rice from neighbouring nations. Modelling paddy production is therefore important because it provides insight that helps ensure a sufficient supply of locally produced paddy. In this study, three multiple regression models were used to model paddy production in Malaysia from 1980 to 2022: Multiple Linear Regression (MLR), Multivariate Adaptive Regression Splines (MARS) and Support Vector Regression (SVR). The models were then compared to determine the best one. They were evaluated using the root mean square error (RMSE), the mean absolute percentage error (MAPE) and the Ljung-Box test. RMSE and MAPE gauge the accuracy of a fitted model, while the Ljung-Box test checks its residuals for autocorrelation. Based on the study, the MARS model is better at modelling Malaysian paddy production from 1980 to 2022 because it has the smallest error measures. MARS is therefore the best of the three regression models for modelling paddy production in Malaysia, ahead of MLR and SVR. Since the residuals of all the models exhibit autocorrelation, future work could present a more effective approach or model to overcome this issue.
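
As a rough illustration of the evaluation criteria named in the abstract, the following minimal sketch computes RMSE, MAPE and the Ljung-Box Q statistic in plain Python. The function names and the sample numbers are hypothetical and are not taken from the study; the authors' own implementation is not described here and may differ.

```python
import math


def rmse(actual, fitted):
    """Root mean square error between observed and fitted values."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, fitted)) / n)


def mape(actual, fitted):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, fitted)) / n


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic computed over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation of the residuals at lag k.
        rho_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Hypothetical values for illustration only (not data from the study).
actual = [100.0, 110.0, 120.0, 130.0]
fitted = [102.0, 108.0, 123.0, 129.0]
residuals = [a - f for a, f in zip(actual, fitted)]

print("RMSE:", rmse(actual, fitted))
print("MAPE (%):", mape(actual, fitted))
print("Ljung-Box Q (1 lag):", ljung_box_q(residuals, 1))
```

Under the null hypothesis of no autocorrelation, Q is compared against a chi-square distribution with `max_lag` degrees of freedom (fewer if fitted time-series parameters are subtracted); a large Q signals autocorrelated residuals, which is the issue the abstract reports for all three models.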

Published

21-02-2025