Abstract
Recovery of missing observations in time series has been studied for a century, giving rise to two broad classes of methods: those that reconstruct the data and those that directly estimate the statistical properties of the data, both developed largely for univariate processes. In this work, we present a data reconstruction technique for multivariate processes. The proposed method is developed in the framework of sparse optimization and adopts a parametric approach based on vector auto-regressive (VAR) models, so that both temporal and spatial correlations can be exploited for efficient data recovery. The primary purpose of recovering the missing data in this work is to develop a directed graphical, or network, representation of the multivariate process under study. Existing methods for data-driven network reconstruction assume that data are available at regular intervals. In this respect, the proposed method offers an effective methodology for reconstructing weighted causal networks from data with missing observations. The scope of this work is restricted to linear, jointly stationary multivariate processes that can be suitably represented by VAR models of finite order, and to randomly missing data. Simulation studies on different data generating processes with varying proportions of missing observations illustrate the efficacy of the proposed method in recovering the multivariate signals and thereby reconstructing weighted causal networks.
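To make the problem setting concrete, the following is a minimal sketch of a bivariate VAR(1) process with randomly missing observations. It is not the sparse-optimization method proposed in the article: it only simulates the kind of data the abstract describes and, as a naive baseline, estimates the VAR coefficient matrix by least squares from time instants where consecutive samples are fully observed. The coefficient values, the 20% missing fraction, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def simulate_var1(A, n, rng, noise_std=1.0):
    """Simulate x[k] = A @ x[k-1] + e[k] with white Gaussian noise e[k]."""
    d = A.shape[0]
    x = np.zeros((n, d))
    for k in range(1, n):
        x[k] = A @ x[k - 1] + noise_std * rng.standard_normal(d)
    return x


def drop_at_random(x, frac, rng):
    """Replace a random fraction of entries with NaN; mask is True where observed."""
    mask = rng.random(x.shape) >= frac
    return np.where(mask, x, np.nan), mask


def fit_var1_complete_pairs(y):
    """Least-squares VAR(1) fit using only fully observed consecutive samples."""
    ok = ~np.isnan(y).any(axis=1)      # instants where every variable is observed
    pairs = ok[1:] & ok[:-1]           # both x[k] and x[k-1] are complete
    past, present = y[:-1][pairs], y[1:][pairs]
    # Solve present ≈ past @ A.T in the least-squares sense.
    B, *_ = np.linalg.lstsq(past, present, rcond=None)
    return B.T, int(pairs.sum())


rng = np.random.default_rng(0)
# Hypothetical stable system: variable 0 drives variable 1, not vice versa.
A_true = np.array([[0.5, 0.0],
                   [0.4, 0.3]])
x = simulate_var1(A_true, 5000, rng)
y, mask = drop_at_random(x, 0.2, rng)
A_hat, n_pairs = fit_var1_complete_pairs(y)

print("complete consecutive pairs used:", n_pairs)
print("estimated coefficient matrix:\n", np.round(A_hat, 2))
```

In this sketch, a nonzero off-diagonal entry of the estimated matrix plays the role of a weighted directed edge in the network. The baseline discards every sample pair that contains a gap, which becomes increasingly wasteful as the missing fraction grows; this is the situation in which reconstructing the missing samples first is attractive.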
Cite this article
Agarwal, P., Tangirala, A.K. Reconstruction of missing data in multivariate processes with applications to causality analysis. Int J Adv Eng Sci Appl Math 9, 196–213 (2017). https://doi.org/10.1007/s12572-017-0198-1
DOI: https://doi.org/10.1007/s12572-017-0198-1