Observation Influence Diagnostic of a Data Assimilation System

Cardinali, Carla

doi:10.1007/978-3-642-35088-7_4


Abstract

The influence matrix is used in ordinary least-squares applications for monitoring statistical multiple-regression analyses. Concepts related to the influence matrix provide diagnostics on the influence of individual data on the analysis, the analysis change that would occur by leaving one observation out, and the effective information content (degrees of freedom for signal) in any subset of the analysed data. In this paper, the corresponding concepts are derived in the context of linear statistical data assimilation in Numerical Weather Prediction. An approximate method to compute the diagonal elements of the influence matrix (the self-sensitivities) has been developed for a large-dimension variational data assimilation system (the 4D-Var system of the European Centre for Medium-Range Weather Forecasts, ECMWF). Results show that, in the ECMWF operational system, 18 % of the global influence is due to the assimilated observations, and the complementary 82 % is the influence of the prior (background) information, a short-range forecast containing information from earlier assimilated observations. About 20 % of the observational information is currently provided by surface-based observing systems, and 80 % by satellite systems.

A toy model is developed to illustrate how the observation influence depends on the data assimilation covariance matrices. In particular, the role of highly correlated observation errors and highly correlated background errors, compared with uncorrelated ones, is presented. Low-influence data points usually occur in data-rich areas, while high-influence data points are in data-sparse areas or in dynamically active regions. Background error correlations also play an important role: high correlation diminishes the observation influence and amplifies the importance of the surrounding real and pseudo observations (prior information in observation space). To increase the observation influence in the presence of highly correlated background errors, it is necessary to introduce observation error correlations, and the observation and background error variances must also be of similar size. Incorrect specifications of background and observation error covariance matrices can be identified, interpreted and better understood by the use of influence matrix diagnostics for the variety of observation types and observed variables used in the data assimilation system.

References

Bauer P, Buizza R, Cardinali C, Thépaut J-N (2011) Impact of singular vector based satellite data thinning on NWP. Q J R Meteorol Soc 137:286–302
Bormann N, Saarinen S, Kelly G, Thépaut J-N (2003) The spatial structure of observation errors in atmospheric motion vectors from geostationary satellite data. Mon Weather Rev 131:706–718
Cardinali C (2009) Monitoring the forecast impact on the short-range forecast. Q J R Meteorol Soc 135:239–250
Cardinali C, Prates F (2011) Performance measurement with advanced diagnostic tools of all-sky microwave imager radiances in 4D-Var. Q J R Meteorol Soc 137(661, Part B):2038–2046
Cardinali C, Pezzulli S, Andersson E (2004) Influence matrix diagnostics of a data assimilation system. Q J R Meteorol Soc 130:2767–2786
Craven P, Wahba G (1979) Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation. Numer Math 31:377–403
Geer AJ, Bauer P (2011) Observation errors in all-sky data assimilation. Q J R Meteorol Soc 137(661, Part B):2024–2037
Healy SB, Thépaut J-N (2006) Assimilation experiments with CHAMP GPS radio occultation measurements. Q J R Meteorol Soc 132:605–623
Hoaglin DC, Welsch RE (1978) The hat matrix in regression and ANOVA. Am Stat 32:17–22, and Corrigenda 32:146
Hoaglin DC, Mosteller F, Tukey JW (1982) Understanding robust and exploratory data analysis. Wiley Series in Probability and Statistics. Wiley, New York
Junjie L, Kalnay E, Miyoshi T, Cardinali C (2009) Analysis sensitivity calculation within an ensemble Kalman filter. Q J R Meteorol Soc 135:1842–1851
Langland R, Baker NL (2004) Estimation of observation impact using the NRL atmospheric variational data assimilation adjoint system. Tellus 56A:189–201
Lorenc A (1986) Analysis methods for numerical weather prediction. Q J R Meteorol Soc 112:1177–1194
Rabier F, Järvinen H, Klinker E, Mahfouf J-F, Simmons A (2000) The ECMWF operational implementation of four-dimensional variational assimilation. Part I: experimental results with simplified physics. Q J R Meteorol Soc 126:1143–1170
Rabier F, Fourrié N, Chafaï D, Prunet P (2002) Channel selection methods for infrared atmospheric sounding interferometer radiances. Q J R Meteorol Soc 128:1011–1027
Radnoti G, Bauer P, McNally A, Cardinali C, Healy S, de Rosnay P (2010) ECMWF study on the impact of future developments of the space-based observing system on Numerical Weather Prediction. ECMWF Tech. Memo 638
Shen X, Huang H, Cressie N (2002) Nonparametric hypothesis testing for a spatial signal. J Am Stat Assoc 97:1122–1140
Talagrand O (1997) Assimilation of observations, an introduction. J Meteorol Soc Japan 75(1B):191–209
Thépaut J-N, Hoffman RN, Courtier P (1993) Interactions of dynamics and observations in a four-dimensional variational assimilation. Mon Weather Rev 121:3393–3414
Thépaut J-N, Courtier P, Belaud G, Lemaître G (1996) Dynamical structure functions in four-dimensional variational assimilation: a case study. Q J R Meteorol Soc 122:535–561
Tukey JW (1972) Data analysis, computational and mathematics. Q Appl Math 30:51–65
Velleman PF, Welsch RE (1981) Efficient computing of regression diagnostics. Am Stat 35:234–242
Wahba G (1990) Spline models for observational data. SIAM, CBMS-NSF. Regional conference series in applied mathematics, vol 59. Society for Industrial and Applied Mathematics, Philadelphia, p 165
Wahba G, Johnson DR, Gao F, Gong J (1995) Adaptive tuning of numerical weather prediction models: randomized GCV in three- and four-dimensional data assimilation. Mon Weather Rev 123:3358–3369
Ye J (1998) On measuring and correcting the effect of data mining and model selection. J Am Stat Assoc 93:120–131
Zhu Y, Gelaro R (2008) Observation sensitivity calculations using the adjoint of the Gridpoint Statistical Interpolation (GSI) analysis system. Mon Weather Rev 136:335–351


Acknowledgements

The author thanks Olivier Talagrand and Sergio Pezzulli for the fruitful discussions on the subject. Many thanks are given to Mohamed Dahoui and Anne Fouilloux for their precious technical support.

Author information

Authors and Affiliations

Data Assimilation Section, European Centre for Medium-Range Weather Forecasts, Shinfield Park, Reading, Berks, RG2 9AX, UK
Carla Cardinali

Corresponding author

Correspondence to Carla Cardinali.

Editor information

Editors and Affiliations

Environmental Science and Engineering, Ewha Womans University, Ewhayeodae-gil, Seodaemun-gu 52, Seoul, 120-750, Republic of Korea
Seon Ki Park
Naval Research Laboratory, Grace Hopper Ave. 7, Monterey, 93943-5502, California, USA
Liang Xu

Appendix

Influence Matrix Calculation in Weighted Regression Data Assimilation Scheme

Under the frequentist approach, the regression equations for observation

$$\mathbf{y} = \mathbf{H}\boldsymbol{\theta } +\boldsymbol{ \epsilon }_{o}$$

and for background

$$\mathbf{x}_{b} =\boldsymbol{\theta } +\boldsymbol{\epsilon }_{b}$$

are assumed to have mutually uncorrelated error vectors $\boldsymbol{\epsilon }_{o}$ and $\boldsymbol{\epsilon }_{b}$, with zero mean vectors and covariance matrices $\mathbf{R}$ and $\mathbf{B}$, respectively. The parameter $\boldsymbol{\theta }$ is the unknown system state ($\mathbf{x}$) of dimension n. These regression equations are summarized as a weighted regression

$$\mathbf{z} = \mathbf{X}\boldsymbol{\theta } +\boldsymbol{ \epsilon }$$

where $\mathbf{z} ={ \left [{\mathbf{y}}^{T}\ \mathbf{x}_{b}^{T}\right ]}^{T}$ is (m + n) × 1; $\mathbf{X} ={ \left [{\mathbf{H}}^{T}\ \mathbf{I}_{n}\right ]}^{T}$ is (m + n) × n; and $\boldsymbol{\epsilon } = {\left [\boldsymbol{\epsilon }_{o}^{T}\ \boldsymbol{\epsilon }_{b}^{T}\right ]}^{T}$ is (m + n) × 1, with zero mean and covariance matrix

$$\boldsymbol{\Omega } = \left (\begin{array}{cc} \mathbf{R}& 0\\ 0 &\mathbf{B}\\ \end{array} \right )$$

The generalized least-squares (LS) solution for $\boldsymbol{\theta }$ is the best linear unbiased estimate (BLUE) and is given by

$$\boldsymbol{\hat{\theta }}= {({\mathbf{X}}^{T}{\boldsymbol{\Omega }}^{-1}\mathbf{X})}^{-1}{\mathbf{X}}^{T}{\boldsymbol{\Omega }}^{-1}\mathbf{z}$$

(4.34)

(see Talagrand 1997). After some algebra, this expression reduces to (4.11). Thus

$$\mathbf{\hat{z}} = \mathbf{X}\boldsymbol{\hat{\theta }} ={ \left [{(\mathbf{H}\mathbf{x}_{a})}^{T}\ \mathbf{x}_{a}^{T}\right ]}^{T} = \mathbf{X}{({\mathbf{X}}^{T}{\boldsymbol{\Omega }}^{-1}\mathbf{X})}^{-1}{\mathbf{X}}^{T}{\boldsymbol{\Omega }}^{-1}\mathbf{z}$$

and by (4.5) the influence matrix becomes

$$\mathbf{S}_{\mathit{zz}} = \frac{\partial \mathbf{\hat{z}}} {\partial \mathbf{z}} = \frac{\partial \mathbf{X}\boldsymbol{\hat{\theta }}} {\partial \mathbf{z}} = \left (\begin{array}{cc} \mathbf{S}_{\mathit{yy}} & \mathbf{S}_{\mathit{yb}} \\ \mathbf{S}_{\mathit{by}} & \mathbf{S}_{\mathit{bb}}\\ \end{array} \right ) = \left (\begin{array}{cc} {\mathbf{R}}^{-1}\mathbf{H}\mathbf{A}{\mathbf{H}}^{T}&{\mathbf{R}}^{-1}\mathbf{H}\mathbf{A} \\ {\mathbf{B}}^{-1}\mathbf{A}{\mathbf{H}}^{T} & {\mathbf{B}}^{-1}\mathbf{A}\\ \end{array} \right )$$

where $\mathbf{S}_{\mathit{yy}} = \frac{\partial \mathbf{Hx}_{a}} {\partial \mathbf{y}}$; $\mathbf{S}_{\mathit{yb}} = \frac{\partial \mathbf{x}_{a}} {\partial \mathbf{y}}$; $\mathbf{S}_{\mathit{by}} = \frac{\partial \mathbf{Hx}_{a}} {\partial \mathbf{x}_{b}}$; $\mathbf{S}_{\mathit{bb}} = \frac{\partial \mathbf{x}_{a}} {\partial \mathbf{x}_{b}}$, and $\mathbf{A} = {({\mathbf{X}}^{T}{\boldsymbol{\Omega }}^{-1}\mathbf{X})}^{-1} = {({\mathbf{H}}^{T}{\mathbf{R}}^{-1}\mathbf{H} +{ \mathbf{B}}^{-1})}^{-1}$. Note that $\mathbf{S}_{\mathit{yy}} = \mathbf{S}$ as defined in (4.4).
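The stacked regression and its influence matrix can be checked numerically on a small system. The following is a minimal NumPy sketch, not code from the chapter: the dimensions, the random matrices and the helper names `random_spd`, `stacked_system` and `influence_matrix` are all illustrative assumptions. It compares the generalized LS estimate with the familiar gain form $\mathbf{x}_{a} = \mathbf{x}_{b} + \mathbf{K}(\mathbf{y} -\mathbf{H}\mathbf{x}_{b})$, $\mathbf{K} = \mathbf{B}{\mathbf{H}}^{T}{(\mathbf{H}\mathbf{B}{\mathbf{H}}^{T} + \mathbf{R})}^{-1}$, and compares ${\boldsymbol{\Omega }}^{-1}\mathbf{X}{({\mathbf{X}}^{T}{\boldsymbol{\Omega }}^{-1}\mathbf{X})}^{-1}{\mathbf{X}}^{T}$ with the four blocks written above.

```python
import numpy as np

def random_spd(k, rng):
    """Random symmetric positive-definite k x k matrix (illustrative only)."""
    M = rng.standard_normal((k, k))
    return M @ M.T + k * np.eye(k)

def stacked_system(H, R, B):
    """Return X = [H; I_n] and Omega = blockdiag(R, B)."""
    m, n = H.shape
    X = np.vstack([H, np.eye(n)])
    Omega = np.block([[R, np.zeros((m, n))],
                      [np.zeros((n, m)), B]])
    return X, Omega

def influence_matrix(H, R, B):
    """Return S_zz = Omega^{-1} X A X^T and A = (X^T Omega^{-1} X)^{-1}."""
    X, Omega = stacked_system(H, R, B)
    Oinv_X = np.linalg.solve(Omega, X)      # Omega^{-1} X
    A = np.linalg.inv(X.T @ Oinv_X)         # (H^T R^{-1} H + B^{-1})^{-1}
    return Oinv_X @ A @ X.T, A

# Hypothetical small problem: m observations, n state variables.
rng = np.random.default_rng(0)
m, n = 3, 4
H = rng.standard_normal((m, n))
R = random_spd(m, rng)
B = random_spd(n, rng)
y = rng.standard_normal(m)
x_b = rng.standard_normal(n)

X, Omega = stacked_system(H, R, B)
z = np.concatenate([y, x_b])
S_zz, A = influence_matrix(H, R, B)

# Generalized LS estimate of theta from the stacked regression.
theta_hat = A @ X.T @ np.linalg.solve(Omega, z)

# Gain form of the analysis, for comparison.
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
x_a = x_b + K @ (y - H @ x_b)

# The four blocks as written in the text.
Rinv, Binv = np.linalg.inv(R), np.linalg.inv(B)
blocks = np.block([[Rinv @ H @ A @ H.T, Rinv @ H @ A],
                   [Binv @ A @ H.T,     Binv @ A]])

print(np.allclose(theta_hat, x_a))          # GLS estimate vs gain form
print(np.allclose(S_zz, blocks))            # closed form vs block expressions
print(np.allclose(X @ theta_hat, S_zz.T @ z))
```

With the blocks as printed, the fitted vector is recovered as $\mathbf{S}_{\mathit{zz}}^{T}\mathbf{z}$, so the derivative here is laid out with rows indexed by the data and columns by the fitted values.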

Generalized LS regression differs from ordinary LS in that the influence matrix is no longer symmetric. Idempotence still holds: using (4.33), it is easy to show that $\mathbf{S}_{\mathit{zz}}\mathbf{S}_{\mathit{zz}} = \mathbf{S}_{\mathit{zz}}$. Finally,

$$\mathbf{S}_{\mathit{bb}} ={ \mathbf{B}}^{-1}\mathbf{A} = \mathbf{I}_{ n} -{\mathbf{H}}^{T}{\mathbf{R}}^{-1}\mathbf{HA}$$

hence,

$$\mathit{tr}(\mathbf{S}_{\mathit{bb}}) = n -\mathit{tr}({\mathbf{H}}^{T}{\mathbf{R}}^{-1}\mathbf{HA}) = n -\mathit{tr}(\mathbf{S}_{\mathit{ yy}})$$

it follows that

$$\mathit{tr}(\mathbf{S}_{\mathit{zz}}) = \mathit{tr}(\mathbf{S}_{\mathit{yy}}) + \mathit{tr}(\mathbf{S}_{\mathit{bb}}) = n$$

The trace of the influence matrix is still equal to the parameter’s dimension.
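These properties can be seen on a tiny worked case. The sketch below is an illustration under assumed values, not taken from the chapter: a two-variable state observed at its first component only, with unit observation error variance and a correlated background covariance. The helper name `influence_blocks` is hypothetical.

```python
import numpy as np

def influence_blocks(H, R, B):
    """Return (S_zz, S_yy, S_bb) built from the block expressions in the text."""
    Rinv = np.linalg.inv(R)
    Binv = np.linalg.inv(B)
    A = np.linalg.inv(H.T @ Rinv @ H + Binv)
    S_yy = Rinv @ H @ A @ H.T
    S_yb = Rinv @ H @ A
    S_by = Binv @ A @ H.T
    S_bb = Binv @ A
    S_zz = np.block([[S_yy, S_yb], [S_by, S_bb]])
    return S_zz, S_yy, S_bb

# Assumed toy values: n = 2 state variables, m = 1 observation of the first one.
H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])
B = np.array([[2.0, 1.0],
              [1.0, 2.0]])
n = H.shape[1]

S_zz, S_yy, S_bb = influence_blocks(H, R, B)

is_idempotent = np.allclose(S_zz @ S_zz, S_zz)
is_symmetric = np.allclose(S_zz, S_zz.T)
tr_yy = np.trace(S_yy)   # observation share: 2/3 for these values
tr_bb = np.trace(S_bb)   # background share: 4/3 for these values

print(is_idempotent, is_symmetric)
print(tr_yy, tr_bb, tr_yy + tr_bb)
```

For these values the single observation carries 2/3 of a degree of freedom for signal and the background the remaining 4/3, so the two traces add up to n = 2, while the matrix is idempotent but not symmetric.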

About this chapter

Cite this chapter

Cardinali, C. (2013). Observation Influence Diagnostic of a Data Assimilation System. In: Park, S., Xu, L. (eds) Data Assimilation for Atmospheric, Oceanic and Hydrologic Applications (Vol. II). Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35088-7_4

DOI: https://doi.org/10.1007/978-3-642-35088-7_4
Published: 20 February 2013
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35087-0
Online ISBN: 978-3-642-35088-7
eBook Packages: Earth and Environmental Science (R0)
