Comparative Performance of Various Imputation Methods for River Flow Data

  • Conference paper
Recent Advances in Soft Computing and Data Mining (SCDM 2022)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 457)

Abstract

River flow data are essential in hydrological and meteorological research for forecasting floods and droughts and for day-to-day river basin management, all of which are hard to carry out without both real-time and historical records. Hydrological data, however, are frequently incomplete owing to a variety of factors, including data loss. Because such data are prone to missing values, imputing those values is an important step in completing the record. This study uses six imputation methods: Mean Imputation, Median Imputation, Multiple Imputation, Normal Ratio, NIPALS, and the EM algorithm. The aims of the study are to impute the missing values in a river flow dataset using these methods and to apply the ARIMA model to the original and imputed datasets. The experimental results show that Multiple Imputation using the MCMC method was the best method, with the lowest RMSE and MAE values of 41.23 and 14.09, respectively, compared with the other methods. It is concluded that Multiple Imputation is the most robust and adaptable machine learning method, although it is also the most difficult to program in terms of computational complexity.
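As a rough illustration of how the two simplest methods named in the abstract can be scored with RMSE and MAE, the following Python sketch may help. It is not the authors' code: the abstract does not describe the evaluation protocol, so the approach of hiding known values and scoring the imputed values against them is an assumption, as are the synthetic gamma-distributed flow values, the 10% missing rate, and the helper names `impute_constant`, `rmse`, and `mae`.

```python
import numpy as np


def impute_constant(series, statistic):
    """Fill NaN entries with one statistic (mean or median) of the observed values."""
    filled = series.copy()
    missing = np.isnan(filled)
    filled[missing] = statistic(filled[~missing])
    return filled


def rmse(actual, estimate):
    """Root mean squared error between true and imputed values."""
    return float(np.sqrt(np.mean((actual - estimate) ** 2)))


def mae(actual, estimate):
    """Mean absolute error between true and imputed values."""
    return float(np.mean(np.abs(actual - estimate)))


# Synthetic, right-skewed "flow" values: illustrative only, not the paper's data.
rng = np.random.default_rng(0)
flow = rng.gamma(shape=2.0, scale=30.0, size=365)

# Hide 10% of the values at random so the true values are known for scoring.
hidden = rng.choice(flow.size, size=flow.size // 10, replace=False)
observed = flow.copy()
observed[hidden] = np.nan

for name, statistic in [("mean", np.mean), ("median", np.median)]:
    imputed = impute_constant(observed, statistic)
    print(name, rmse(flow[hidden], imputed[hidden]), mae(flow[hidden], imputed[hidden]))
```

Both methods replace every gap with a single constant, so they ignore the temporal structure of the series. Model-based methods such as Multiple Imputation can use that structure, which is one plausible reason for them to score lower errors.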

Acknowledgments

The authors would like to thank the Ministry of Higher Education Malaysia (MOHE) for supporting this research under Fundamental Research Grant Scheme Vot No. FRGS/1/2018/STG06/UTHM/03/3. The research was also partially sponsored by Universiti Tun Hussein Onn Malaysia under Multi-Disciplinary Grant Vot No. H508.

Corresponding author

Correspondence to Shuhaida Ismail.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Muhaime, N.A.D.A., Arifin, M.A., Ismail, S., Shaharuddin, S.M. (2022). Comparative Performance of Various Imputation Methods for River Flow Data. In: Ghazali, R., Mohd Nawi, N., Deris, M.M., Abawajy, J.H., Arbaiy, N. (eds) Recent Advances in Soft Computing and Data Mining. SCDM 2022. Lecture Notes in Networks and Systems, vol 457. Springer, Cham. https://doi.org/10.1007/978-3-031-00828-3_11
