Nearest Neighbour in Least Squares Data Imputation Algorithms for Marketing Data

Clusters, Orders, and Trees: Methods and Applications

Part of the book series: Springer Optimization and Its Applications ((SOIA,volume 92))

Abstract

Marketing research works with multivariate data to solve problems such as market segmentation, estimating the purchasing power of a market sector, and modeling attrition. In many cases, the data collected or supplied for these purposes have missing entries. This paper is devoted to an empirical evaluation of methods for imputing missing data within the so-called nearest neighbour least-squares approximation approach, a non-parametric, computationally efficient multidimensional technique. We contribute to each of the two components of the experimental setting: (a) an empirical evaluation of the nearest neighbour in least-squares data imputation algorithm for marketing research, and (b) experimental comparisons with the expectation–maximization (EM) algorithm and multiple imputation (MI) using real marketing data sets. Specifically, we review “global” methods for least-squares data imputation and propose extensions to them based on the nearest neighbours (NN) approach. In these experiments, NN in the least-squares data imputation algorithm almost always outperforms the EM algorithm and is comparable to the multiple imputation approach.
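As a rough illustration of the idea summarised in the abstract, the sketch below combines a nearest-neighbour selection step with a low-rank least-squares fill. It is a simplified reading of the general approach, not the chapter's exact algorithm: the function names `iterative_ls_fill` and `nn_ls_impute`, the distance measure (mean squared difference over jointly observed columns), the rank-1 default, and the column-mean initialisation are all assumptions made for this example.

```python
import numpy as np


def iterative_ls_fill(A, rank=1, n_iter=100, tol=1e-8):
    """Fill NaNs in A by alternating a rank-`rank` SVD fit with re-filling.

    Hypothetical helper: a "global" least-squares imputation applied to A.
    """
    A = np.asarray(A, dtype=float)
    missing = np.isnan(A)
    # Initialise missing cells with the column mean of observed cells
    # (0.0 if a column has no observed cells at all).
    count = (~missing).sum(axis=0)
    col_sum = np.where(missing, 0.0, A).sum(axis=0)
    col_mean = np.divide(col_sum, count, out=np.zeros_like(col_sum), where=count > 0)
    filled = np.where(missing, col_mean, A)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Observed cells stay fixed; only missing cells take the fitted value.
        new = np.where(missing, approx, A)
        converged = np.max(np.abs(new - filled)) < tol
        filled = new
        if converged:
            break
    return filled


def nn_ls_impute(X, k=5, rank=1):
    """Impute each incomplete row from a low-rank fit to its k nearest rows."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    result = X.copy()
    for i in np.flatnonzero(missing.any(axis=1)):
        # Compare row i with every row on the columns observed in both.
        shared = ~missing & ~missing[i]
        diff = np.where(shared, X - X[i], 0.0)
        n_shared = shared.sum(axis=1)
        dist = np.full(len(X), np.inf)
        ok = n_shared > 0
        dist[ok] = (diff[ok] ** 2).sum(axis=1) / n_shared[ok]
        dist[i] = np.inf  # a row is not its own neighbour
        neighbours = np.argsort(dist)[:k]
        neighbours = neighbours[np.isfinite(dist[neighbours])]
        # Local least-squares fill on the target row plus its neighbours.
        block = np.vstack([X[i], X[neighbours]])
        filled = iterative_ls_fill(block, rank=rank)
        result[i, missing[i]] = filled[0, missing[i]]
    return result
```

Calling `nn_ls_impute(X, k=5)` on an array with `np.nan` in the missing cells returns a copy with those cells filled and the observed cells untouched. Restricting the fit to the nearest rows is what distinguishes this "local" variant from applying `iterative_ls_fill` to the whole matrix.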

Acknowledgements

The author gratefully acknowledges many comments by reviewers that have been very helpful in improving the presentation.

Author information

Correspondence to Ito Wasito.

Copyright information

© 2014 Springer Science+Business Media New York

About this chapter

Cite this chapter

Wasito, I. (2014). Nearest Neighbour in Least Squares Data Imputation Algorithms for Marketing Data. In: Aleskerov, F., Goldengorin, B., Pardalos, P. (eds) Clusters, Orders, and Trees: Methods and Applications. Springer Optimization and Its Applications, vol 92. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-0742-7_19