Skip to main content

Communication-Free Widened Learning of Bayesian Network Classifiers Using Hashed Fiedler Vectors

  • Conference paper
  • First Online:
Advances in Intelligent Data Analysis XVII (IDA 2018)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11191))

Included in the following conference series:

Abstract

Widening is a method where parallel resources are used to find better solutions from greedy algorithms instead of merely trying to find the same solutions more quickly. To date, every example of Widening has used some form of communication between the parallel workers to maintain their distances from one another in the model space. For the first time, we present a communication-free, widened extension to a standard machine learning algorithm. By using Locality Sensitive Hashing on the Bayesian networks’ Fiedler vectors, we demonstrate the ability to learn classifiers superior to those of standard implementations and to those generated with a greedy heuristic alone.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    Isomorphic graphs have similar Fiedler vectors. The converse is not necessarily true.

References

  1. Akbar, Zaenal, Ivanova, Violeta N., Berthold, Michael R.: Parallel data mining revisited. Better, not faster. In: Hollmén, Jaakko, Klawonn, Frank, Tucker, Allan (eds.) IDA 2012. LNCS, vol. 7619, pp. 23–34. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34156-4_4

    Chapter  Google Scholar 

  2. Akl, S.G.: Parallel real-time computation: sometimes quantity means quality. In: Proceedings of International Symposium on Parallel Architectures, Algorithms and Networks, 2000. I-SPAN 2000, pp. 2–11. IEEE (2000)

    Google Scholar 

  3. Bielza, Concha, Larrañaga, Pedro: Discrete Bayesian network classifiers: a survey. ACM Comput. Surv. (CSUR) 47(1), 5 (2014)

    Article  Google Scholar 

  4. Andrei Z. Broder. On the resemblance and containment of documents. In: Proceedings of Compression and Complexity of Sequences 1997, pp. 21–29. IEEE (1997)

    Google Scholar 

  5. Buhler, Jeremy: Efficient large-scale sequence comparison by locality-sensitive hashing. Bioinformatics 17(5), 419–428 (2001)

    Article  Google Scholar 

  6. Charikar, M.S.: Similarity estimation techniques from rounding algorithms. In: Proceedings of the Thiry-Fourth Annual ACM Symposium on Theory of Computing, pp. 380–388. ACM (2002)

    Google Scholar 

  7. Fan-Roon Kim Chung. Spectral Graph Theory. Number 92 in Regional Conference Series in Mathematics. American Mathematical Society, 1997

    Google Scholar 

  8. Coenen, F.: LUCS-KDD DN software (2003)

    Google Scholar 

  9. Datar, M., Immorlica, N., Indyk, P., Mirrokni, V.S.: Locality-sensitive hashing scheme based on p-stable distributions. In: Proceedings of the Twentieth Annual Symposium on Computational Geometry, pp. 253–262. ACM (2004)

    Google Scholar 

  10. Doyle, P.G., Laurie Snell, J.: Random Walks and Electric Networks. Mathematical Association of America (1984)

    Google Scholar 

  11. Fillbrunn, Alexander, Berthold, Michael R.: Diversity-driven widening of hierarchical agglomerative clustering. In: Fromont, Elisa, De Bie, Tijl, van Leeuwen, Matthijs (eds.) IDA 2015. LNCS, vol. 9385, pp. 84–94. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24465-5_8

    Chapter  Google Scholar 

  12. Fillbrunn, Alexander, Wörteler, Leonard, Grossniklaus, Michael, Berthold, Michael R.: Bucket selection: a model-independent diverse selection strategy for widening. In: Adams, Niall, Tucker, Allan, Weston, David (eds.) IDA 2017. LNCS, vol. 10584, pp. 87–98. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68765-0_8

    Chapter  Google Scholar 

  13. Gionis, Aristides, Indyk, Piotr, Motwani, Rajeev: Similarity search in high dimensions via hashing. VLDB 99, 518–529 (1999)

    Google Scholar 

  14. Guo, Krystal, Mohar, Bojan: Hermitian adjacency matrix of digraphs and mixed graphs. J. Graph Theory 85(1), 217–248 (2017)

    Article  MathSciNet  Google Scholar 

  15. Koski, T.J.T., Noble, J.M.: A review of Bayesian networks and structure learning. Mathematica Applicanda 40(1), 53–103 (2012)

    MathSciNet  MATH  Google Scholar 

  16. Kulis, B., Grauman, K.: Kernelized locality-sensitive hashing for scalable image search. In: 12th International Conference on Computer Vision, pp. 2130–7. IEEE (2009)

    Google Scholar 

  17. Larrañaga, Pedro, Karshenas, Hossein, Bielza, Concha, Santana, Roberto: A review on evolutionary algorithms in Bayesian network learning and inference tasks. Inf. Sci. 233, 109–125 (2013)

    Article  MathSciNet  Google Scholar 

  18. Lichman, M.: UCI Machine Learning Repository (2013)

    Google Scholar 

  19. Luo, Bin, Wilson, Richard C., Hancock, Edwin R.: Spectral feature vectors for graph clustering. In: Caelli, Terry, Amin, Adnan, Duin, Robert P.W., de Ridder, Dick, Kamel, Mohamed (eds.) SSPR /SPR 2002. LNCS, vol. 2396, pp. 83–93. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-70659-3_8

    Chapter  Google Scholar 

  20. Marolt, Matija: A mid-level representation for melody-based retrieval in audio collections. IEEE Trans. Multimed. 10(8), 1617–1625 (2008)

    Article  Google Scholar 

  21. Meinl, T.: Maximum-Score Diversity Selection. Ph.D. thesis, University of Konstanz, Konstanz, Germany (2010)

    Google Scholar 

  22. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers Inc., Burlington (1988)

    MATH  Google Scholar 

  23. Qiu, Huaijun, Hancock, Edwin R.: Graph matching and clustering using spectral partitions. Pattern Recognit. 39(1), 22–34 (2006)

    Article  Google Scholar 

  24. Sampson, Oliver, Berthold, Michael R.: Widened KRIMP: better performance through diverse parallelism. In: Blockeel, Hendrik, van Leeuwen, Matthijs, Vinciotti, Veronica (eds.) IDA 2014. LNCS, vol. 8819, pp. 276–285. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-12571-8_24

    Chapter  Google Scholar 

  25. Sampson, Oliver R., Berthold, Michael R.: Widened learning of Bayesian network classifiers. In: Boström, Henrik, Knobbe, Arno, Soares, Carlos, Papapetrou, Panagiotis (eds.) IDA 2016. LNCS, vol. 9897, pp. 215–225. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46349-0_19

    Chapter  Google Scholar 

  26. Satu Elisa Schaeffer: Graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007)

    Article  Google Scholar 

  27. Scutari, Marco: Learning Bayesian networks with the bnlearn R package. J. Stat. Softw. 35(3), 1–22 (2010)

    Article  MathSciNet  Google Scholar 

  28. Terasawa, Kengo, Tanaka, Yuzuru: Spherical LSH for approximate nearest neighbor search on unit hypersphere. In: Dehne, Frank, Sack, Jörg-Rüdiger, Zeh, Norbert (eds.) WADS 2007. LNCS, vol. 4619, pp. 27–38. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73951-7_4

    Chapter  Google Scholar 

  29. Van Dam, E.R., Haemers, W.H.: Which graphs are determined by their spectrum? Linear Algebra Appl. 373, 241–272 (2003)

    Google Scholar 

  30. Vishveshwara, S., Brinda, K.V., Kannan, N.: Protein structure: insights from graph theory. J. Theor. Comput. Chem. 1(01), 187–211 (2002)

    Article  Google Scholar 

  31. Zhang, Boyu, Liu, Xianglong, Lang, Bo: Fast graph similarity search via locality sensitive hashing. In: Ho, Yo-Sung, Sang, Jitao, Ro, Yong Man, Kim, Junmo, Wu, Fei (eds.) PCM 2015. LNCS, vol. 9314, pp. 623–633. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24075-6_60

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Oliver R. Sampson .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Sampson, O.R., Borgelt, C., Berthold, M.R. (2018). Communication-Free Widened Learning of Bayesian Network Classifiers Using Hashed Fiedler Vectors. In: Duivesteijn, W., Siebes, A., Ukkonen, A. (eds) Advances in Intelligent Data Analysis XVII. IDA 2018. Lecture Notes in Computer Science(), vol 11191. Springer, Cham. https://doi.org/10.1007/978-3-030-01768-2_22

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-01768-2_22

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01767-5

  • Online ISBN: 978-3-030-01768-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics