Constructing and Analyzing Uncertain Social Networks from Unstructured Textual Data

  • Chapter
Mining Social Networks and Security Informatics

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

Social network analysis and link diagrams are popular tools among intelligence analysts for analyzing and understanding criminal and terrorist organizations. A bottleneck in the use of such techniques is the manual effort needed to create the network from the available source information. We describe how text mining techniques can be used to extract named entities and the relations among them, enabling automatic construction of networks from unstructured text. Since the text mining techniques used, viz. algorithms for named entity recognition and relation extraction, are not perfect, we also describe a method for incorporating information about uncertainty when constructing the networks and when performing the social network analysis. The presented approach is applied to text documents describing terrorist activities in Indonesia.
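
As a rough illustration of the idea summarized above, the following sketch builds a network whose edges carry an existence probability and then computes an expected degree for each entity. It is a minimal sketch under stated assumptions, not the chapter's method: the `extractions` records, the function names, and the noisy-OR rule for combining repeated mentions are all hypothetical choices made for this example.

```python
from collections import defaultdict

# Hypothetical relation-extraction output: (entity_a, entity_b, confidence).
# The entity names and confidence values are made up for illustration.
extractions = [
    ("Person A", "Person B", 0.5),
    ("Person B", "Person A", 0.5),
    ("Person B", "Person C", 0.25),
]


def build_uncertain_network(records):
    """Combine repeated mentions of a pair into one edge probability.

    Assumption: mentions are treated as independent evidence and merged
    with a noisy-OR rule, p = 1 - prod(1 - confidence_i).
    """
    prob_absent = defaultdict(lambda: 1.0)
    for a, b, confidence in records:
        edge = frozenset((a, b))  # undirected edge
        prob_absent[edge] *= 1.0 - confidence
    return {edge: 1.0 - q for edge, q in prob_absent.items()}


def expected_degree(edge_probs):
    """Expected number of neighbours per node, given edge probabilities."""
    degree = defaultdict(float)
    for edge, p in edge_probs.items():
        for node in edge:
            degree[node] += p
    return dict(degree)


network = build_uncertain_network(extractions)
degrees = expected_degree(network)
```

Here the two mentions of the pair Person A and Person B reinforce each other, so that edge ends up more probable than either mention alone, and the expected degree replaces the usual integer degree count.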

Notes

  1. A downside of closeness centrality is that it is not applicable to networks with several disconnected components. A possible solution is to consider the inverse closeness centrality instead.

  2. Many real-world networks are scale-free, i.e., their degree distribution (the number of edges per node) follows a power-law distribution [15].

  3. A more complete description of the workings of the NER in NLTK can be found in [46].

  4. http://www.crisisgroup.org/en/regions/asia/south-east-asia/indonesia/043-indonesia-backgrounder-how-the-jemaah-islamiyah-terrorist-network-operates.aspx
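
The first note can be made concrete with a small sketch. Classical closeness centrality divides by the sum of shortest-path distances, which is infinite as soon as some node is unreachable. One common workaround, assumed here to correspond to what the note calls inverse closeness, is to sum the reciprocals of the distances instead, so that unreachable nodes simply contribute zero. The graph `adj` and all function names below are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def classical_closeness(adj, node):
    """(n - 1) / sum of distances; None if some node is unreachable."""
    dist = bfs_distances(adj, node)
    if len(dist) < len(adj):
        return None  # infinite distance: the measure is undefined
    total = sum(dist.values())
    return (len(adj) - 1) / total if total else 0.0


def inverse_distance_closeness(adj, node):
    """Sum of 1/d over reachable nodes; unreachable nodes add nothing."""
    dist = bfs_distances(adj, node)
    return sum(1.0 / d for other, d in dist.items() if other != node)


# Hypothetical network with two disconnected components.
adj = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B"],
    "D": ["E"],
    "E": ["D"],
}
```

On this graph the classical measure is undefined for every node, while the reciprocal-distance variant still ranks B above A and both above D.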

References

  1. Raab J, Milward HB (2003) Dark networks as problems. J Public Adm Res Theory 13:413–439

  2. Svenson P, Svensson P, Tullberg H (2006) Social network analysis and information fusion for anti-terrorism. In: Proceedings of the conference on civil and military readiness 2006

  3. Zhu B, Watts S, Chen H (2010) Visualizing social network concepts. Decis Support Syst 49:151–161

  4. Geffre JL, Deckro RF, Knighton SA (2009) Determining critical members of layered operational terrorist networks. J Defense Model Simul, Appl Methodol Technol 6:97–109

  5. Hougham V (2005) Sociological skills used in the capture of Saddam Hussein. http://www.asanet.org/footnotes/julyaugust05/fn3.html

  6. Koelle D, Pfautz J, Farry M, Cox Z, Catto G, Campolongo J (2006) Applications of Bayesian belief networks in social network analysis. In: Proceedings of the 4th Bayesian modeling applications workshop during the 22nd annual conference on uncertainty in artificial intelligence

  7. Fellegi IP, Sunter AB (1969) A theory for record linkage. J Am Stat Assoc 64(328):1183–1210

  8. Dahlin J (2011) Entity matching. Swedish Defence Research Agency, Tech Rep

  9. Frantz TL, Cataldo M, Carley KM (2009) Robustness of centrality measures under uncertainty: examining the role of network topology. Comput Math Organ Theory 303–328

  10. Freeman LC (1979) Centrality in social networks: conceptual clarification. Soc Netw 1(3):215–239

  11. Scott J (2000) Social network analysis, 2nd edn. Sage, Thousand Oaks

  12. Newman MEJ (2001) Scientific collaboration networks. ii. Shortest paths, weighted networks, and centrality. Phys Rev E 64:016132

  13. Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge University Press, Cambridge

  14. de Nooy W, Mrvar A, Batagelj V (2005) Exploratory social network analysis with Pajek. Structural analysis in the social sciences. Cambridge University Press, Cambridge

  15. Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512

  16. Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for exploring and manipulating networks. In: Adar E, Hurst M, Finin T, Glance NS, Nicolov N, Tseng BL (eds) Proceedings of the 3rd international AAAI conference on weblogs and social media

  17. Batagelj V, Mrvar A (2002) Pajek—analysis and visualization of large networks. In: Mutzel P, Jünger M, Leipert S (eds) Graph drawing. Lecture Notes in Computer Science, vol 2265. Springer, Berlin, pp 8–11

  18. Blondel V, Guillaume J, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech, Theory Exp P10008

  19. Adar E, Ré C (2007) Managing uncertainty in social networks. IEEE Data Eng Bull 30(2):23–31

  20. Kossinets G (2006) Effects of missing data in social networks. Soc Netw 28:247–268

  21. Costenbader E, Valente TW (2003) The stability of centrality measures when networks are sampled. Soc Netw 25:283–307

  22. Borgatti SP, Carley KM, Krackhardt D (2004) On the robustness of centrality measures under conditions of imperfect data. Soc Netw 28(2):124–136

  23. Svenson P (2008) Social network analysis of uncertain networks. In: Proceedings of the 2nd Skövde workshop on information fusion topics

  24. Dahlin J, Svenson P (2011) A method for community detection in uncertain networks. In: Proceedings of the European intelligence and security informatics conference, EISIC 2011

  25. Yager RR (2008) Intelligent social network analysis using granular computing. Int J Intell Syst 23:1196–1219

  26. Dahlin J (2011) Community detection in imperfect networks. Master’s thesis, Umeå University

  27. Opsahl T, Agneessens F, Skvoretz J (2010) Node centrality in weighted networks: generalizing degree and shortest paths. Soc Netw 32(3):245–251

  28. Newman MEJ (2004) Analysis of weighted networks. Phys Rev E 70:056131

  29. Barrat A, Barthelemy M, Pastor-Satorras R, Vespignani A (2004) The architecture of complex weighted networks. Proc Natl Acad Sci (PNAS) 101:3747

  30. Brandes U (2001) A faster algorithm for betweenness centrality. J Math Sociol 25:163–177

  31. Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci (PNAS) 99(12):7821–7826

  32. Newman MEJ (2006) Modularity and community structure in networks. Proc Natl Acad Sci USA 103(23):8577–8582

  33. Feldman R, Sanger J (2007) The text mining handbook—advanced approaches in analyzing unstructured data. Cambridge University Press, Cambridge

  34. Nadeau D, Sekine S (2007) A survey of named entity recognition and classification. Linguist Investig 30(1):3–26

  35. Hasegawa T, Sekine S, Grishman R (2004) Discovering relations among named entities from large corpora. In: Proceedings of the 42nd annual meeting on association for computational linguistics

  36. Doddington G, Mitchell A, Przybocki M, Ramshaw L, Strassel S, Weischedel R (2004) The automatic content extraction (ACE) program: tasks, data, and evaluation. In: Proceedings of LREC’04

  37. Banko M, Etzioni O (2008) The tradeoffs between open and traditional relation extraction. In: Proceedings of ACL-08: HLT, pp 28–36

  38. Zelenko D, Aone C, Richardella A (2003) Kernel methods for relation extraction. J Mach Learn Res 3:1083–1106

  39. Mesquita F, Merhav Y, Barbosa D (2010) Extracting information networks from the blogosphere: state-of-the-art and challenges. In: Proceedings of the fourth international conference on weblogs and social media

  40. Banko M, Cafarella MJ, Soderland S, Broadhead M, Etzioni O (2007) Open information extraction from the web. In: Proceedings of the 20th international joint conference on artificial intelligence, pp 2670–2676

  41. Zhu J, Nie Z, Liu X, Zhang B, Wen J-R (2009) Statsnowball: a statistical approach to extracting entity relationships. In: Proceedings of the 18th international conference on world wide web, ser. WWW ’09, pp 101–110

  42. GuoDong Z, Jian S, Jie Z, Min Z (2005) Exploring various knowledge in relation extraction. In: Proceedings of the 43rd annual meeting on association for computational linguistics, pp 427–434

  43. Morris JF, Anthony K, Kennedy KT, Deckro RF (2011) Extraction distractions: a comparison of social network model construction methods. In: Proceedings of the 2011 European intelligence and security informatics conference, EISIC2011

  44. Makrehchi M, Kamel MS (2005) Building social networks from web documents: a text mining approach. In: Proceedings of the 2nd LORNET scientific conference

  45. Elson DK, Dames N, McKeown KR (2010) Extracting social networks from literary fiction. In: Proceedings of the 48th annual meeting of the association for computational linguistics, pp 138–147

  46. Bird S, Klein E, Loper E (2009) Natural language processing with Python: analyzing text with the Natural Language Toolkit. O’Reilly Media

  47. Fang Y, Chang KC-C (2011) Searching patterns for relation extraction over the Web: rediscovering the pattern-relation duality. In: Proceedings of the fourth ACM international conference on Web search and data mining, ser. WSDM ’11, pp 825–834

  48. Shafer G (1976) A mathematical theory of evidence. Princeton University Press, Princeton

Acknowledgements

This work was supported by the R&D programme of the Swedish Armed Forces. We would like to express our thanks to the other members of the FOI Information Fusion and Data Mining group and the VIA project for fruitful discussions and valuable feedback.

Author information

Corresponding author

Correspondence to Fredrik Johansson.

Copyright information

© 2013 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Johansson, F., Svenson, P. (2013). Constructing and Analyzing Uncertain Social Networks from Unstructured Textual Data. In: Özyer, T., Erdem, Z., Rokne, J., Khoury, S. (eds) Mining Social Networks and Security Informatics. Lecture Notes in Social Networks. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6359-3_3

  • DOI: https://doi.org/10.1007/978-94-007-6359-3_3

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-007-6358-6

  • Online ISBN: 978-94-007-6359-3

  • eBook Packages: Computer Science, Computer Science (R0)
