Skip to main content

Machine-Learning Based Objective Function Selection for Community Detection

  • Conference paper
  • First Online:
Cyber Security, Cryptology, and Machine Learning (CSCML 2022)

Abstract

NECTAR, a Node-centric ovErlapping Community deTection AlgoRithm, presented by Cohen et al., chooses dynamically between two objective functions which to optimize, based on the network on which it is invoked. It was shown that this approach outperforms six state-of-the-art algorithms for overlapping community detection. In this work, we present NECTAR-ML, an extension of the NECTAR algorithm that uses a machine-learning based model for automating the selection of the objective function, trained and evaluated on a dataset of 15,755 synthetic and 7 real-world networks. Our analysis shows that in approximately 90% of the cases our model was able to successfully select the correct objective function. We conducted a competitive analysis of NECTAR and NECTAR-ML. NECTAR-ML was shown to significantly outperform NECTAR’s ability to select the best objective function. We also conducted a competitive analysis of NECTAR-ML and two additional state-of-the-art multi-objective evolutionary community detection algorithms. NECTAR-ML outperformed both algorithms in terms of average detection quality. Multi-objective evolutionary algorithms are considered to be the most popular approach to solve multi-objective optimization problems and the fact that NECTAR-ML significantly outperforms them demonstrates the effectiveness of ML-based objective function selection.

This work was supported in part by the Cyber Security Research Center at Ben-Gurion University.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    The dataset was generated using a parallel computing framework we developed, available at https://github.com/asaborn/GenericMultiTasking.

References

  1. Ahn, Y.Y., Bagrow, J.P., Lehmann, S.: Link communities reveal multiscale complexity in networks. Nature 466(7307), 761–764 (2010). https://doi.org/10.1038/nature09182

    Article  Google Scholar 

  2. Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008(10), P10008 (2008). https://doi.org/10.1088/1742-5468/2008/10/p10008

  3. Bornstein, A., Rubin, A., Hendler, D.: Machine-learning based objective function selection for community detection (2022). https://doi.org/10.48550/ARXIV.2203.13495. https://arxiv.org/abs/2203.13495

  4. Brandes, U., et al.: On finding graph clusterings with maximum modularity. In: Brandstädt, A., Kratsch, D., Müller, H. (eds.) WG 2007. LNCS, vol. 4769, pp. 121–132. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74839-7_12

    Chapter  Google Scholar 

  5. Chen, M., Kuzmin, K., Szymanski, B.K.: Extension of modularity density for overlapping community structure. In: 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014), pp. 856–863 (2014). https://doi.org/10.1109/ASONAM.2014.6921686

  6. Clauset, A., Moore, C., Newman, M.E.J.: Hierarchical structure and the prediction of missing links in networks. Nature 453(7191), 98–101 (2008). https://doi.org/10.1038/nature06830

    Article  Google Scholar 

  7. Cohen, Y., Hendler, D., Rubin, A.: Node-centric detection of overlapping communities in social networks. In: Shmueli, E., Barzel, B., Puzis, R. (eds.) NetSci-X 2017. SPC, pp. 1–10. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-55471-6_1

    Chapter  Google Scholar 

  8. Collins, L.M., Dent, C.W.: Omega: a general formulation of the rand index of cluster recovery suitable for non-disjoint solutions. Multivar. Behav. Res. 23(2), 231–242 (1988). https://doi.org/10.1207/s15327906mbr2302_6. pMID: 26764947

  9. Flake, G., Lawrence, S., Giles, C., Coetzee, F.: Self-organization and identification of web communities. Computer 35(3), 66–70 (2002). https://doi.org/10.1109/2.989932

    Article  Google Scholar 

  10. Gao, Y., Zhang, H., Zhang, Y.: Overlapping community detection based on conductance optimization in large-scale networks. Phys. A Stat. Mech. Appl. 522, 69–79 (2019). https://doi.org/10.1016/j.physa.2019.01.142. https://www.sciencedirect.com/science/article/pii/S0378437119301487

  11. Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. 99(12), 7821–7826 (2002). https://doi.org/10.1073/pnas.122653799. https://www.pnas.org/content/99/12/7821

  12. Gong, M., Cai, Q., Chen, X., Ma, L.: Complex network clustering by multiobjective discrete particle swarm optimization based on decomposition. IEEE Trans. Evol. Comput. 18(1), 82–97 (2014). https://doi.org/10.1109/TEVC.2013.2260862

    Article  Google Scholar 

  13. Gong, M., Fu, B., Jiao, L., Du, H.: Memetic algorithm for community detection in networks. Phys. Rev. E 84, 056101 (2011). https://doi.org/10.1103/PhysRevE.84.056101. https://link.aps.org/doi/10.1103/PhysRevE.84.056101

  14. Gong, M., Ma, L., Zhang, Q., Jiao, L.: Community detection in networks by using multiobjective evolutionary algorithm with decomposition. Phys. A Stat. Mech. Appl. 391(15), 4050–4060 (2012). https://doi.org/10.1016/j.physa.2012.03.021. https://www.sciencedirect.com/science/article/pii/S0378437112002579

  15. Gregory, S.: Finding overlapping communities in networks by label propagation. New J. Phys. 12(10), 103018 (2010). https://doi.org/10.1088/1367-2630/12/10/103018

  16. Gregory, S.: Fuzzy overlapping communities in networks. J. Stat. Mech. Theory Exp 2011(02), P02017 (2011). https://doi.org/10.1088/1742-5468/2011/02/p02017

  17. King, A.D., Pržulj, N., Jurisica, I.: Protein complex prediction via cost-based clustering. Bioinformatics 20(17), 3013–3020 (2004). https://doi.org/10.1093/bioinformatics/bth351

  18. Kondor, R.I., Lafferty, J.: Diffusion kernels on graphs and other discrete structures. In: Proceedings of the ICML, pp. 315–322 (2002)

    Google Scholar 

  19. Krogan, N.J., et al.: Global landscape of protein complexes in the yeast saccharomyces cerevisiae. Nature 440(7084), 637–643 (2006)

    Article  Google Scholar 

  20. Lancichinetti, A., Fortunato, S.: Community detection algorithms: a comparative analysis. Phys. Rev. E 80, 056117 (2009). https://doi.org/10.1103/PhysRevE.80.056117. https://link.aps.org/doi/10.1103/PhysRevE.80.056117

  21. Lancichinetti, A., Fortunato, S., Kertész, J.: Detecting the overlapping and hierarchical community structure in complex networks. New J. Phys. 11(3), 033015 (2009). https://doi.org/10.1088/1367-2630/11/3/033015

  22. Lancichinetti, A., Fortunato, S., Radicchi, F.: Benchmark graphs for testing community detection algorithms. Phys. Rev. E 78, 046110 (2008). https://doi.org/10.1103/PhysRevE.78.046110. https://link.aps.org/doi/10.1103/PhysRevE.78.046110

  23. Lee, C., Reid, F., McDaid, A., Hurley, N.: Detecting highly overlapping community structure by greedy clique expansion (2010)

    Google Scholar 

  24. Leskovec, J., Krevl, A.: SNAP Datasets: Stanford large network dataset collection, June 2014. http://snap.stanford.edu/data

  25. Liu, C., Liu, J., Jiang, Z.: A multiobjective evolutionary algorithm based on similarity for community detection from signed social networks. IEEE Trans. Cybernet. 44(12), 2274–2287 (2014). https://doi.org/10.1109/TCYB.2014.2305974

    Article  Google Scholar 

  26. Pizzuti, C.: GA-Net: a genetic algorithm for community detection in social networks. In: Rudolph, G., Jansen, T., Beume, N., Lucas, S., Poloni, C. (eds.) PPSN 2008. LNCS, vol. 5199, pp. 1081–1090. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-87700-4_107

    Chapter  Google Scholar 

  27. Pizzuti, C.: A multiobjective genetic algorithm to find communities in complex networks. IEEE Trans. Evol. Comput. 16(3), 418–430 (2012). https://doi.org/10.1109/TEVC.2011.2161090

    Article  Google Scholar 

  28. Pizzuti, C., Rombo, S.E.: Algorithms and tools for protein-protein interaction networks clustering, with a special focus on population-based stochastic methods. Bioinformatics 30(10), 1343–1352 (2014). https://doi.org/10.1093/bioinformatics/btu034

  29. Prat-Pérez, A., Dominguez-Sal, D., Brunat, J.M., Larriba-Pey, J.L.: Shaping communities out of triangles. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM 2012, pp. 1677–1681. Association for Computing Machinery, New York (2012). https://doi.org/10.1145/2396761.2398496

  30. Shi, C., Yan, Z., Cai, Y., Wu, B.: Multi-objective community detection in complex networks. Appl. Soft Comput. 12(2), 850–859 (2012). https://doi.org/10.1016/j.asoc.2011.10.005. https://www.sciencedirect.com/science/article/pii/S1568494611003991

  31. Šíma, J., Schaeffer, S.E.: On the NP-completeness of some graph cluster measures. In: Wiedermann, J., Tel, G., Pokorný, J., Bieliková, M., Štuller, J. (eds.) SOFSEM 2006. LNCS, vol. 3831, pp. 530–537. Springer, Heidelberg (2006). https://doi.org/10.1007/11611257_51

    Chapter  MATH  Google Scholar 

  32. Tasgin, M., Herdagdelen, A., Bingol, H.: Community detection in complex networks using genetic algorithms (2007)

    Google Scholar 

  33. Tian, Y., Yang, S., Zhang, X.: An evolutionary multiobjective optimization based fuzzy method for overlapping community detection. IEEE Trans. Fuzzy Syst. 28(11), 2841–2855 (2020). https://doi.org/10.1109/TFUZZ.2019.2945241

    Article  Google Scholar 

  34. Viamontes Esquivel, A., Rosvall, M.: Compression of flow can reveal overlapping-module organization in networks. Phys. Rev. X 1, 021025 (2011). https://doi.org/10.1103/PhysRevX.1.021025. https://link.aps.org/doi/10.1103/PhysRevX.1.021025

  35. Wen, X., et al.: A maximal clique based multiobjective evolutionary algorithm for overlapping community detection. IEEE Trans. Evol. Comput. 21(3), 363–377 (2017). https://doi.org/10.1109/TEVC.2016.2605501

    Article  Google Scholar 

  36. Xie, J., Kelley, S., Szymanski, B.K.: Overlapping community detection in networks: the state-of-the-art and comparative study. ACM Comput. Surv. 45(4) (2013). https://doi.org/10.1145/2501654.2501657

  37. Xie, J., Szymanski, B.K.: Towards linear time overlapping community detection in social networks. In: Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J. (eds.) PAKDD 2012. LNCS (LNAI), vol. 7302, pp. 25–36. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-30220-6_3

    Chapter  Google Scholar 

  38. Yang, J., Leskovec, J.: Community-affiliation graph model for overlapping network community detection. In: 2012 IEEE 12th International Conference on Data Mining, pp. 1170–1175 (2012). https://doi.org/10.1109/ICDM.2012.139

  39. Yang, J., Leskovec, J.: Defining and evaluating network communities based on ground-truth. Knowl. Inf. Syst. 42(1), 181–213 (2015). https://doi.org/10.1007/s10115-013-0693-z10.1007/s10115-013-0693-z

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Asa Bornstein .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Bornstein, A., Rubin, A., Hendler, D. (2022). Machine-Learning Based Objective Function Selection for Community Detection. In: Dolev, S., Katz, J., Meisels, A. (eds) Cyber Security, Cryptology, and Machine Learning. CSCML 2022. Lecture Notes in Computer Science, vol 13301. Springer, Cham. https://doi.org/10.1007/978-3-031-07689-3_10

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-07689-3_10

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07688-6

  • Online ISBN: 978-3-031-07689-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics