A Survey of Algorithms for Dense Subgraph Discovery

Managing and Mining Graph Data

Part of the book series: Advances in Database Systems (ADBS, volume 40)

Abstract

In this chapter, we present a survey of algorithms for dense subgraph discovery. The problem of dense subgraph discovery is closely related to clustering, though the two problems also differ in a number of ways. For example, clustering is largely concerned with finding a fixed partition of the data, whereas dense subgraph discovery defines its dense components in a much more flexible way. The problem of dense subgraph discovery may be defined over either a single graph or multiple graphs, and we explore both cases. In the latter case, the problem is also closely related to frequent subgraph discovery. This chapter discusses and organizes the literature on this topic in order to make it more accessible to the reader.

Author information

Corresponding author

Correspondence to Victor E. Lee.

Copyright information

© 2010 Springer-Verlag US

About this chapter

Cite this chapter

Lee, V.E., Ruan, N., Jin, R., Aggarwal, C. (2010). A Survey of Algorithms for Dense Subgraph Discovery. In: Aggarwal, C., Wang, H. (eds) Managing and Mining Graph Data. Advances in Database Systems, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6045-0_10

  • DOI: https://doi.org/10.1007/978-1-4419-6045-0_10

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-6044-3

  • Online ISBN: 978-1-4419-6045-0

  • eBook Packages: Computer Science, Computer Science (R0)
