Large-Scale Social Network Analysis

Large-Scale Data Analytics

Abstract

Social Network Analysis (SNA) is an established discipline for the study of groups of individuals, with applications in several areas such as economics, information science, organizational studies and psychology. In the last fifteen years, the exponential growth of online Social Network Sites (SNSs), such as Facebook, QQ and Twitter, has provided a challenging new application context for SNA methods. Compared with traditional SNA application domains, however, these systems are characterized by very large volumes of data, which has recently led to the development of parallel network analysis algorithms and libraries. In this chapter we provide an overview of the state of the art in large-scale social network analysis; in particular, we focus on parallel algorithms and libraries for the computation of network centrality metrics.

Notes

  1. http://newsroom.fb.com/content/default.aspx?NewsAreaId=22.

  2. http://www.bbc.co.uk/news/business-12889048.

  3. http://blog.linkedin.com/2011/03/22/linkedin-100-million/.

  4. http://blog.flickr.net/en/2010/09/19/5000000000/.

  5. http://www.youtube.com/t/press_statistics.

  6. Preferential attachment means that the probability that a new node A will be connected to an already existing node B is proportional to the number of edges that B already has.

  7. http://www.top500.org/lists/2010/11.

  8. http://openmp.org/.

  9. http://www.cray.com/Products/XMT/Product/Specifications.aspx.

  10. http://www.cineca.it/.
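
The definition of preferential attachment in Note 6 can be made concrete with a small sketch. Assuming the simplest (linear) form of the rule, the probability that a new node attaches to an existing node B is B's degree divided by the sum of the degrees of all existing nodes. The function names and the toy degree table below are illustrative assumptions, not code from the chapter.

```python
import random


def attachment_probabilities(degrees):
    """Return, for each existing node, the probability of receiving the new edge.

    `degrees` maps node -> number of edges it already has (hypothetical input).
    """
    total = sum(degrees.values())
    if total == 0:
        raise ValueError("at least one existing node must have an edge")
    return {node: deg / total for node, deg in degrees.items()}


def pick_target(degrees, rng=random):
    """Sample one existing node with probability proportional to its degree."""
    nodes = list(degrees)
    weights = [degrees[n] for n in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


# Toy network (assumed for illustration): B has 3 edges, C has 1, D has 2.
degrees = {"B": 3, "C": 1, "D": 2}
probs = attachment_probabilities(degrees)
target = pick_target(degrees, random.Random(0))
```

In this toy network B holds half of all edge endpoints, so it is chosen with probability 0.5, which is the "rich get richer" effect the note describes.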

Acknowledgements

This work has been partially funded by the PRIN project “Relazioni sociali e identità in rete: vissuti e narrazioni degli italiani nei siti di social network” and by the FIRB project “Information monitoring, propagation analysis and community detection in Social Network Sites”. This work was done while M. Magnani and C. Paolino were with the Department of Computer Science, University of Bologna.

The authors thank the CINECA supercomputing center for providing access to the IBM pSeries 575 used for part of the tests described in Sect. 6.6.

Author information

Correspondence to Moreno Marzolla.

Copyright information

© 2014 Springer Science+Business Media New York

About this chapter

Cite this chapter

Lambertini, M., Magnani, M., Marzolla, M., Montesi, D., Paolino, C. (2014). Large-Scale Social Network Analysis. In: Gkoulalas-Divanis, A., Labbi, A. (eds) Large-Scale Data Analytics. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9242-9_6

  • DOI: https://doi.org/10.1007/978-1-4614-9242-9_6

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-9241-2

  • Online ISBN: 978-1-4614-9242-9

  • eBook Packages: Computer Science, Computer Science (R0)
