The Impact of Heterogeneity in High-Ranking Variables Using Precision Farming

Authors

  • Nour Abu Afouna, School of Mathematical Sciences, Universiti Sains Malaysia, 11800 USM, Penang, Malaysia
  • Majid Khan Majahar Ali, School of Mathematical Sciences, Universiti Sains Malaysia, 11800 USM, Penang, Malaysia

DOI:

https://doi.org/10.11113/mjfas.v20n6.3564

Keywords:

Machine learning, lasso, ridge, elastic net, validation metrics, smart farming.

Abstract

Smart precision farming combines IoT, cloud computing, and big data to optimize agricultural productivity, reduce costs, and advance sustainability through digitalization and intelligent approaches. However, precision farming faces challenges such as managing complex variables, addressing multicollinearity, handling outliers, ensuring model robustness, and improving accuracy, particularly with small or medium-sized datasets. Reducing retraining time and resolving the problem of complexity are necessary to overcome these obstacles and to improve the performance, scalability, and efficiency of machine learning algorithms, especially when working with big or high-dimensional datasets. In this study, using a dataset with 435 drying parameters and 1914 observations, we employed Ridge, Lasso, and Elastic Net regression to address multicollinearity and heterogeneity. Traditional regression models, such as ordinary least squares (OLS), often struggle with multicollinearity, which leads to unstable and unreliable coefficient estimates. Ridge regression mitigates this issue by adding an L2 penalty that stabilizes the coefficients. Lasso regression introduces an L1 penalty, which additionally performs variable selection. Elastic Net, a combination of L1 and L2 penalties, handles both multicollinearity and heterogeneity by selecting relevant variables and capturing varying patterns across different subgroups. These techniques have broad practical applications across various fields. In economics, they help identify key indicators for economic forecasting; in healthcare, they improve predictions of patient outcomes for personalized treatment; in finance, they create more stable models of market behavior; and in the social sciences, they reveal influential factors in behavioral studies. Because they manage multicollinearity and heterogeneity effectively, they are valuable tools for decision-making and policy development across these domains. The objective was to identify significant drying parameters both before and after heterogeneity, selecting varying numbers of variables (50, 100, 150, 200, 250, 300) and comparing the models using validation metrics such as MAPE, MSE, SSE, and R². The results revealed that the Ridge model was the most efficient, with the smallest values of MAPE, MSE, and SSE and the largest value of R², both before and after heterogeneity.
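As an illustration of the quantities named in the abstract, the following minimal Python sketch computes the L1 and L2 penalty terms and the four validation metrics (MAPE, MSE, SSE, R²). It is not the authors' implementation. The function names, the sample data, and the particular Elastic Net parameterization (a mixing weight `alpha` between the L1 and L2 terms, scaled by `lam`) are assumptions chosen for illustration; other software uses slightly different conventions.

```python
# Illustrative sketch only; names and parameterization are assumptions,
# not taken from the article.

def l1_penalty(coefs):
    """Lasso-style penalty: sum of absolute coefficient values."""
    return sum(abs(b) for b in coefs)


def l2_penalty(coefs):
    """Ridge-style penalty: sum of squared coefficient values."""
    return sum(b * b for b in coefs)


def elastic_net_penalty(coefs, lam, alpha):
    """Weighted mix of L1 and L2 penalties (one common convention).

    alpha = 1 gives a pure L1 (Lasso) penalty, alpha = 0 a pure L2 (Ridge) one.
    """
    return lam * (alpha * l1_penalty(coefs) + (1.0 - alpha) * l2_penalty(coefs))


def sse(y_true, y_pred):
    """Sum of squared errors."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error."""
    return sse(y_true, y_pred) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (requires nonzero y_true)."""
    n = len(y_true)
    return 100.0 / n * sum(abs((y - p) / y) for y, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SSE / total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    sst = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - sse(y_true, y_pred) / sst


if __name__ == "__main__":
    # Hypothetical observed and predicted values, for demonstration only.
    observed = [2.0, 4.0, 6.0, 8.0]
    predicted = [2.5, 3.5, 6.5, 7.5]
    print("SSE :", sse(observed, predicted))
    print("MSE :", mse(observed, predicted))
    print("MAPE:", mape(observed, predicted))
    print("R2  :", r2(observed, predicted))
```

Under these definitions, a model is preferred when MAPE, MSE, and SSE are smaller and R² is larger, which is the comparison criterion the abstract describes.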

Published

16-12-2024