Bayesian Ying-Yang Harmony Learning for Local Factor Analysis: A Comparative Investigation

Chapter
Oppositional Concepts in Computational Intelligence

Part of the book series: Studies in Computational Intelligence (SCI, volume 155)

Summary

For unsupervised learning by Local Factor Analysis (LFA), it is important to determine both the number of components and the local hidden dimensions appropriately, a typical example of model selection. One conventional approach is a two-phase procedure guided by a model selection criterion such as AIC, CAIC, BIC (MDL), SRM, or CV. Although these criteria all work well given sufficiently large samples, they suffer from two problems: their performance deteriorates greatly on small sample sizes, and the two-phase procedure requires intensive computation. To tackle the second problem, one line of effort in the literature features incremental implementations, e.g., IMoFA and VBMFA. Bayesian Ying-Yang (BYY) harmony learning provides not only a BYY harmony data smoothing criterion (BYY-C) in a two-phase implementation for the first problem, but also an algorithm called automatic BYY harmony learning (BYY-A), which performs automatic model selection during parameter learning and thus reduces the computational expense significantly. The lack of systematic comparisons in the literature motivates this work. Comparative experiments are first conducted on synthetic data, covering not only different settings of noise, dimension, and sample size, but also different evaluations, including model selection accuracy and three other applied performance measures. Comparisons are then made on several real-world classification datasets. In the two-phase implementation, BIC and CAIC generally outperform AIC, SRM, and CV, while BYY-C is the best for small sample sizes. Moreover, for sufficiently large sample sizes, IMoFA, VBMFA, and BYY-A produce similar performances at much reduced computational cost, with BYY-A still providing better or at least comparably good performance.
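To make the two-phase procedure concrete, here is a minimal Python sketch (illustrative only, not from the chapter), using scikit-learn's GaussianMixture as a simplified stand-in for LFA; LFA would additionally search over each component's hidden dimension. Phase one fits one candidate model per component number; phase two picks the candidate minimizing a criterion such as AIC or BIC.

```python
# Two-phase model selection sketch: enumerate candidate models, fit each by
# maximum likelihood (EM), then select with an information criterion.
# GaussianMixture stands in for Local Factor Analysis here; it exposes
# aic()/bic(), where AIC = -2 ln L + 2d and BIC = -2 ln L + d ln N
# (d = number of free parameters, N = sample size).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic data: three well-separated Gaussian clusters in 2-D.
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(200, 2))
               for c in ([0.0, 0.0], [5.0, 0.0], [0.0, 5.0])])

# Phase one: fit a candidate for each component number k.
candidates = {k: GaussianMixture(n_components=k, n_init=3, random_state=0).fit(X)
              for k in range(1, 7)}

# Phase two: score every fitted candidate and keep the minimizer.
for k, gm in candidates.items():
    print(f"k={k}  AIC={gm.aic(X):10.1f}  BIC={gm.bic(X):10.1f}")
best_k = min(candidates, key=lambda k: candidates[k].bic(X))
print("BIC selects k =", best_k)  # 3 expected at this sample size
```

Note how the cost grows with the size of the candidate grid, which for LFA is two-dimensional (component number times hidden dimensions). An automatic approach such as BYY-A avoids this enumeration: model selection happens during a single run of parameter learning, which is where the reported computational savings come from.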


References

  1. Akaike, H.: A new look at the statistical model identification. IEEE Trans. Automatic Control 19(6), 716–723 (1974)

  2. Barron, A.R., Rissanen, J., Yu, B.: The minimum description length principle in coding and modeling. IEEE Trans. Inform. Theory 44(6), 2743–2760 (1998)

  3. Bengio, Y., Grandvalet, Y.: No unbiased estimator of the variance of k-fold cross-validation. J. Mach. Learn. Res. 5, 1089–1105 (2004)

  4. Bowman, A.W., Azzalini, A.: Applied Smoothing Techniques for Data Analysis. Oxford Statistical Science Series. Oxford Science Publications (1997)

  5. Bozdogan, H.: Model selection and Akaike's information criterion (AIC): the general theory and its analytical extensions. Psychometrika 52(3), 345–370 (1987)

  6. Bozdogan, H.: Akaike's information criterion and recent developments in information complexity. J. Math. Psychol. 44(1), 62–91 (2000)

  7. Burnham, K.P., Anderson, D.: Model Selection and Multi-Model Inference. Springer, Heidelberg (2002)

  8. Constantinopoulos, C., Titsias, M.K.: Bayesian feature and model selection for Gaussian mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 28(6), 1013–1018 (2006)

  9. Figueiredo, M.A.F., Jain, A.K.: Unsupervised learning of finite mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 24(3), 381–396 (2002)

  10. Ghahramani, Z., Beal, M.J.: Variational inference for Bayesian mixtures of factor analysers. In: NIPS, pp. 449–455 (1999)

  11. Hinton, G.E., Revow, M., Dayan, P.: Recognizing handwritten digits using mixtures of linear models. In: NIPS, pp. 1015–1022 (1994)

  12. McLachlan, G.J., Krishnan, T.: The EM Algorithm and Extensions. Wiley Series in Probability and Statistics. Wiley-Interscience, Chichester (2007)

  13. Parzen, E.: On the estimation of a probability density function and mode. Annals of Mathematical Statistics 33, 1065–1076 (1962)

  14. Redner, R., Walker, H.: Mixture densities, maximum likelihood and the EM algorithm. SIAM Review 26(2), 195–239 (1984)

  15. Roberts, S.J., Husmeier, D., Penny, W., Rezek, I.: Bayesian approaches to Gaussian mixture modeling. IEEE Trans. Pattern Anal. Mach. Intell. 20(11), 1133–1142 (1998)

  16. Rubin, D., Thayer, D.: EM algorithms for ML factor analysis. Psychometrika 47(1), 69–76 (1982)

  17. Salah, A.A., Alpaydin, E.: Incremental mixtures of factor analysers. In: Proc. ICPR, vol. 1, pp. 276–279 (2004)

  18. Schwarz, G.: Estimating the dimension of a model. Annals of Statistics 6, 461–464 (1978)

  19. Stone, M.: Asymptotics for and against cross-validation. Biometrika 64(1), 29–35 (1977)

  20. Tikhonov, A., Arsenin, V.: Solutions of Ill-posed Problems. Winston and Sons (1977)

  21. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, New York (1995)

  22. Vapnik, V., Sterin, A.: On structural risk minimization or overall risk in a problem of pattern recognition. Automation and Remote Control 10(3), 1495–1503 (1977)

  23. Wand, M., Jones, M.: Kernel Smoothing. Monographs on Statistics and Applied Probability. Chapman and Hall, London (1995)

  24. Wang, L., Feng, J.: Learning Gaussian mixture models by structural risk minimization. In: Proc. Int. Conf. on Machine Learning and Cybernetics, Guangzhou, China, vol. 8, pp. 4858–4863 (2005)

  25. Xu, L.: Bayesian Ying Yang learning. Scholarpedia 2(3), 1809 (2007), http://scholarpedia.org/article/Bayesian_Ying_Yang_Learning

  26. Xu, L.: Bayesian-Kullback coupled Ying-Yang machines: Unified learnings and new results on vector quantization. In: Proc. of ICONIP 1995, Beijing, China, pp. 977–988 (1995)

  27. Xu, L.: Bayesian Ying-Yang system and theory as a unified statistical learning approach (II): From unsupervised learning to supervised learning and temporal modeling. In: Wong, K.W., King, I., Leung, D. (eds.) Theoretical Aspects of Neural Computation: A Multidisciplinary Perspective, pp. 25–42. Springer, Berlin (1997)

  28. Xu, L.: RBF nets, mixture experts, and Bayesian Ying-Yang learning. Neurocomputing 19(1-3), 223–257 (1998)

  29. Xu, L.: Best harmony, unified RPCL and automated model selection for unsupervised and supervised learning on Gaussian mixtures, ME-RBF models and three-layer nets. International Journal of Neural Systems 11(1), 3–69 (2001)

  30. Xu, L.: BYY harmony learning, independent state space, and generalized APT financial analyses. IEEE Trans. Neural Networks 12, 822–849 (2001)

  31. Xu, L.: BYY harmony learning, structural RPCL, and topological self-organizing on mixture models. Neural Networks 15(8-9), 1125–1151 (2002)

  32. Xu, L.: Data smoothing regularization, multi-sets-learning, and problem solving strategies. Neural Networks 16(5-6), 817–825 (2003)

  33. Xu, L.: Advances on BYY harmony learning: information theoretic perspective, generalized projection geometry, and independent factor auto-determination. IEEE Trans. Neural Networks 15(5), 885–902 (2004)

  34. Xu, L.: Bayesian Ying-Yang learning (I): A unified perspective for statistical modeling. In: Zhong, N., Liu, J. (eds.) Intelligent Technologies for Information Analysis, pp. 615–659. Springer, Heidelberg (2004)

  35. Xu, L.: Bayesian Ying-Yang learning (II): A new mechanism for model selection and regularization. In: Zhong, N., Liu, J. (eds.) Intelligent Technologies for Information Analysis, pp. 661–706. Springer, Heidelberg (2004)

  36. Xu, L.: Fundamentals, challenges, and advances of statistical learning for knowledge discovery and problem solving: A BYY harmony perspective. In: Proc. of Int. Conf. on Neural Networks and Brain, Beijing, China, pp. 24–55 (2005)

  37. Xu, L.: Trends on regularization and model selection in statistical learning: A perspective from Bayesian Ying Yang learning. In: Duch, W., Mandziuk, J., Zurada, J.M. (eds.) Studies in Computational Intelligence, pp. 365–406. Springer, Heidelberg (2007)

  38. Xu, L.: A unified perspective and new results on RHT computing, mixture based learning, and multi-learner based problem solving. Pattern Recognition 40(8), 2129–2153 (2007)

Editor information

Hamid R. Tizhoosh, Mario Ventresca

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Shi, L. (2008). Bayesian Ying-Yang Harmony Learning for Local Factor Analysis: A Comparative Investigation. In: Tizhoosh, H.R., Ventresca, M. (eds) Oppositional Concepts in Computational Intelligence. Studies in Computational Intelligence, vol 155. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70829-2_10

  • DOI: https://doi.org/10.1007/978-3-540-70829-2_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70826-1

  • Online ISBN: 978-3-540-70829-2

  • eBook Packages: Engineering, Engineering (R0)
