
Robust Regression with Data-Dependent Regularization Parameters and Autoregressive Temporal Correlations

Environmental Modeling & Assessment

Abstract

We introduce robust procedures for analyzing water quality data collected over time. A central challenge in analyzing such data is achieving robustness in the presence of outliers while maintaining high estimation efficiency, so that we can draw valid conclusions and provide useful advice for water management. The robust approach requires specifying a loss function, such as the Huber, Tukey’s bisquare, or exponential loss function, together with an associated tuning parameter that determines the extent of robustness needed. High robustness comes at the cost of reduced efficiency in parameter estimation. To address this trade-off, we propose a data-driven method that leads to more efficient parameter estimation. This data-dependent approach allows us to choose a regularization (tuning) parameter that depends on the proportion of “outliers” in the data, so that estimation efficiency is maximized. We illustrate the proposed methods using a study of ammonium nitrogen concentrations from two sites on the Huaihe River in China, where the interest is in quantifying the trend in the most recent years while accounting for possible temporal correlations and “irregular” observations in earlier years.
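
To make the idea of a data-dependent tuning parameter concrete, the following is a minimal sketch for the Huber loss only, fitted to a straight-line trend. It is not the authors' implementation: the abstract does not state the selection criterion, so the sketch assumes a common plug-in estimate of asymptotic efficiency, (mean of ψ′)² / mean of ψ², evaluated over a grid of candidate constants. All function names (`efficiency_score`, `choose_tuning_constant`, `huber_line_fit`) are hypothetical, the scale estimate (MAD) is an assumption, and temporal correlation is ignored here.

```python
from statistics import median


def huber_psi(u, c):
    """Huber score function: identity on [-c, c], clipped outside."""
    return max(-c, min(c, u))


def efficiency_score(u, c):
    """Plug-in estimate of (E psi')^2 / E psi^2 for standardized residuals u.

    Assumed criterion: larger values indicate a more efficient Huber
    estimator for this residual distribution.
    """
    n = len(u)
    frac_inside = sum(abs(x) <= c for x in u) / n      # mean of psi'
    mean_psi_sq = sum(huber_psi(x, c) ** 2 for x in u) / n
    return frac_inside ** 2 / mean_psi_sq


def choose_tuning_constant(u, grid):
    """Pick the candidate constant with the highest efficiency score."""
    return max(grid, key=lambda c: efficiency_score(u, c))


def mad_scale(res):
    """Median absolute deviation, rescaled to match the normal sd."""
    m = median(res)
    return 1.4826 * median([abs(r - m) for r in res])


def weighted_line_fit(t, y, w):
    """Weighted least squares for y = a + b * t; returns (a, b)."""
    sw = sum(w)
    t_bar = sum(wi * ti for wi, ti in zip(w, t)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (ti - t_bar) ** 2 for wi, ti in zip(w, t))
    sxy = sum(wi * (ti - t_bar) * (yi - y_bar) for wi, ti, yi in zip(w, t, y))
    b = sxy / sxx
    return y_bar - b * t_bar, b


def huber_line_fit(t, y, c, n_iter=50):
    """Huber M-estimate of a linear trend via iteratively reweighted LS."""
    a, b = weighted_line_fit(t, y, [1.0] * len(t))     # start from OLS
    for _ in range(n_iter):
        res = [yi - a - b * ti for ti, yi in zip(t, y)]
        s = mad_scale(res)
        if s == 0:
            break
        # Huber weights: 1 for small residuals, c * s / |r| for large ones.
        w = [1.0 if r == 0 else min(1.0, c * s / abs(r)) for r in res]
        a, b = weighted_line_fit(t, y, w)
    return a, b
```

In use, one would standardize the residuals from an initial robust fit, pass them to `choose_tuning_constant` with a grid such as 0.5 to 3.0, and refit with the selected constant. Under this assumed criterion, a heavily contaminated sample tends to select a smaller constant (more robustness), while a clean, roughly normal sample tends to select a larger one (closer to least squares).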



Funding

This research was funded by the Australian Research Council projects (DP130100766 and DP160104292).

Author information

Corresponding author

Correspondence to You-Gan Wang.

About this article

Cite this article

Wang, N., Wang, YG., Hu, S. et al. Robust Regression with Data-Dependent Regularization Parameters and Autoregressive Temporal Correlations. Environ Model Assess 23, 779–786 (2018). https://doi.org/10.1007/s10666-018-9605-7

  • DOI: https://doi.org/10.1007/s10666-018-9605-7
