Nearest Neighbour in Least Squares Data Imputation Algorithms for Marketing Data

Wasito, Ito

doi:10.1007/978-1-4939-0742-7_19

Part of the book series: Springer Optimization and Its Applications (SOIA, volume 92)

Abstract

Marketing research works with multivariate data to solve problems such as market segmentation, estimating the purchasing power of a market sector, and modeling attrition. In many cases, the data collected or supplied for these purposes have a number of missing entries. This paper is devoted to an empirical evaluation of methods for imputing missing data within the so-called nearest neighbour least-squares approximation approach, a non-parametric, computationally efficient multidimensional technique. We contribute to each of the two components of the experimental setting: (a) an empirical evaluation of the nearest neighbour in least-squares data imputation algorithm for marketing research, and (b) experimental comparisons with the expectation–maximization (EM) algorithm and multiple imputation (MI) on real marketing data sets. Specifically, we review “global” methods for least-squares data imputation and propose extensions to them based on the nearest neighbours (NN) approach. The results suggest that NN in the least-squares data imputation algorithm almost always outperforms the EM algorithm and is comparable to the multiple imputation approach.
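
As a rough illustration of the idea summarised in the abstract, the following Python sketch contrasts a "global" least-squares imputation with a nearest-neighbour variant. It is not the chapter's algorithm: the function names `ls_impute` and `nn_ls_impute`, the iterative truncated-SVD scheme, the column-mean starting values, the squared Euclidean distance and the defaults for `k` and `rank` are all assumptions made for illustration only.

```python
import numpy as np


def ls_impute(X, rank=1, n_iter=200, tol=1e-8, init=None):
    """Global sketch: iterative low-rank least-squares imputation.

    Missing entries are marked with NaN. Assumes every column has at
    least one observed value unless `init` supplies starting values.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    if init is None:
        # Assumption: start from the column means of the observed entries.
        init = np.broadcast_to(np.nanmean(X, axis=0), X.shape)
    filled = np.where(missing, init, X)
    for _ in range(n_iter):
        # Best rank-`rank` least-squares approximation of the filled matrix.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Only missing cells are overwritten; observed cells stay fixed.
        updated = np.where(missing, approx, X)
        converged = np.max(np.abs(updated - filled)) < tol
        filled = updated
        if converged:
            break
    return filled


def nn_ls_impute(X, k=5, rank=1):
    """Nearest-neighbour sketch: impute each incomplete row from a local
    least-squares model fitted to that row and its k nearest rows."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    k = min(k, X.shape[0] - 1)
    # Assumption: a global fill is used to measure distances and to
    # initialise the local fits.
    start = ls_impute(X, rank=rank)
    out = X.copy()
    for i in np.where(missing.any(axis=1))[0]:
        obs = ~missing[i]
        # Squared distance over the columns observed in the target row.
        d = ((start[:, obs] - X[i, obs]) ** 2).sum(axis=1)
        d[i] = np.inf  # a row is never its own neighbour
        rows = np.concatenate(([i], np.argsort(d)[:k]))
        local = ls_impute(X[rows], rank=rank, init=start[rows])
        out[i, missing[i]] = local[0, missing[i]]
    return out
```

Calling `nn_ls_impute(X, k=5)` on a NumPy array with `NaN` in the missing cells returns a completed copy in which the observed entries are left unchanged. The local fit is restricted to the target row and its neighbours, which is the sense in which the NN variant differs from the global one in this sketch.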

Acknowledgements

The author gratefully acknowledges many comments by reviewers that have been very helpful in improving the presentation.

Author information

Authors and Affiliations

Faculty of Computer Science, University of Indonesia, Kampus UI, Depok, 16424, Indonesia
Ito Wasito

Corresponding author

Correspondence to Ito Wasito.

Editor information

Editors and Affiliations

Department of Higher Mathematics, National Research University Higher School of Economics, Moscow, Russia
Fuad Aleskerov
Department of Operations, University of Groningen, Groningen, The Netherlands
Boris Goldengorin
Department of Industrial and Systems Engineering, University of Florida, Gainesville, Florida, USA
Panos M. Pardalos

About this chapter

Cite this chapter

Wasito, I. (2014). Nearest Neighbour in Least Squares Data Imputation Algorithms for Marketing Data. In: Aleskerov, F., Goldengorin, B., Pardalos, P. (eds) Clusters, Orders, and Trees: Methods and Applications. Springer Optimization and Its Applications, vol 92. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-0742-7_19

DOI: https://doi.org/10.1007/978-1-4939-0742-7_19
Published: 03 May 2014
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-0741-0
Online ISBN: 978-1-4939-0742-7
eBook Packages: Mathematics and Statistics (R0)
