Constructing and Analyzing Uncertain Social Networks from Unstructured Textual Data

Johansson, Fredrik; Svenson, Pontus

doi:10.1007/978-94-007-6359-3_3

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

Social network analysis and link diagrams are popular tools among intelligence analysts for analyzing and understanding criminal and terrorist organizations. A bottleneck in the use of such techniques is the manual effort needed to create the network from the available source information. We describe how text mining techniques can be used to extract named entities and the relations among them, in order to enable automatic construction of networks from unstructured text. Since the text mining techniques used, viz. algorithms for named entity recognition and relation extraction, are not perfect, we also describe a method for incorporating information about uncertainty when constructing the networks and when doing the social network analysis. The presented approach is applied to text documents describing terrorist activities in Indonesia.
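The idea of carrying extraction uncertainty into the network can be illustrated with a small sketch. This is not the chapter's method, only a minimal example under stated assumptions: the input format (entity mentions with a recognizer confidence per sentence), the use of sentence co-occurrence as the relation, the noisy-OR combination of evidence, and all names (`build_uncertain_network`, `expected_degree`, the entities "A", "B", "C") are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_uncertain_network(sentences):
    """Build edge probabilities from per-sentence entity mentions.

    sentences: list of lists of (entity, confidence) pairs (assumed format).
    Returns a dict mapping frozenset({a, b}) to an edge probability.
    """
    # Probability that no sentence so far supports the edge.
    miss = defaultdict(lambda: 1.0)
    for mentions in sentences:
        # Keep the highest confidence per entity within one sentence.
        best = {}
        for name, conf in mentions:
            best[name] = max(conf, best.get(name, 0.0))
        for a, b in combinations(sorted(best), 2):
            # Assumption: a co-occurrence counts only if both
            # recognitions are correct, treated as independent events.
            support = best[a] * best[b]
            miss[frozenset((a, b))] *= 1.0 - support
    # Noisy-OR: the edge exists if at least one sentence supports it.
    return {edge: 1.0 - m for edge, m in miss.items()}


def expected_degree(edges):
    """Expected degree of each node: the sum of its edge probabilities."""
    degree = defaultdict(float)
    for edge, probability in edges.items():
        for node in edge:
            degree[node] += probability
    return dict(degree)


# Hypothetical recognizer output for two sentences.
sentences = [
    [("A", 0.9), ("B", 0.8)],
    [("A", 0.9), ("B", 0.5), ("C", 0.6)],
]
edges = build_uncertain_network(sentences)
degrees = expected_degree(edges)
```

Keeping a probability on each edge, rather than thresholding early, lets later analysis steps such as centrality computation weigh weakly supported links accordingly.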

Notes

1. A downside of closeness centrality is that it is not well defined for networks with several disconnected components, since the distance between nodes in different components is infinite. A possible solution is to consider the inverse closeness centrality instead.
2. Many real-world networks are scale-free, i.e., their degree distribution follows a power law [15].
3. A more complete description of how named entity recognition (NER) works in NLTK can be found in [46].
4. http://www.crisisgroup.org/en/regions/asia/south-east-asia/indonesia/043-indonesia-backgrounder-how-the-jemaah-islamiyah-terrorist-network-operates.aspx
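
The workaround in note 1 can be sketched as follows. This is one common reading of the idea, not necessarily the authors' exact definition: each node's score is the sum of inverse shortest-path distances to the nodes it can reach, so unreachable nodes contribute zero (often called harmonic centrality). The function names and the toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def harmonic_centrality(adj):
    """Sum of 1/d over reachable nodes; unreachable nodes add nothing."""
    scores = {}
    for node in adj:
        dist = bfs_distances(adj, node)
        scores[node] = sum(
            1.0 / d for target, d in dist.items() if target != node
        )
    return scores


# Toy undirected graph with two components: a-b-c and d-e.
adj = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b"],
    "d": ["e"],
    "e": ["d"],
}
scores = harmonic_centrality(adj)
```

Ordinary closeness would divide by a sum containing infinite distances here; the inverse-distance form stays finite and still ranks the middle node "b" highest.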

References

1. Raab J, Milward HB (2003) Dark networks as problems. J Public Adm Res Theory 13:413–439
2. Svenson P, Svensson P, Tullberg H (2006) Social network analysis and information fusion for anti-terrorism. In: Proceedings of the conference on civil and military readiness 2006
3. Zhu B, Watts S, Chen H (2010) Visualizing social network concepts. Decis Support Syst 49:151–161
4. Geffre JL, Deckro RF, Knighton SA (2009) Determining critical members of layered operational terrorist networks. J Defense Model Simul, Appl Methodol Technol 6:97–109
5. Hougham V (2005) Sociological skills used in the capture of Saddam Hussein. http://www.asanet.org/footnotes/julyaugust05/fn3.html
6. Koelle D, Pfautz J, Farry M, Cox Z, Catto G, Campolongo J (2006) Applications of Bayesian belief networks in social network analysis. In: Proceedings of the 4th Bayesian modeling applications workshop during the 22nd annual conference on uncertainty in artificial intelligence
7. Fellegi IP, Sunter AB (1969) A theory for record linkage. J Am Stat Assoc 64(328):1183–1210
8. Dahlin J (2011) Entity matching. Swedish Defence Research Agency, Tech Rep
9. Frantz TL, Cataldo M, Carley KM (2009) Robustness of centrality measures under uncertainty: examining the role of network topology. Comput Math Organ Theory 303–328
10. Freeman LC (1979) Centrality in social networks: conceptual clarification. Soc Netw 1(3):215–239
11. Scott J (2000) Social network analysis, 2nd edn. Sage, Thousand Oaks
12. Newman MEJ (2001) Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 64:016132
13. Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge University Press, Cambridge
14. de Nooy W, Mrvar A, Batagelj V (2005) Exploratory social network analysis with Pajek. Structural analysis in the social sciences. Cambridge University Press, Cambridge
15. Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
16. Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for exploring and manipulating networks. In: Adar E, Hurst M, Finin T, Glance NS, Nicolov N, Tseng BL (eds) Proceedings of the 3rd international AAAI conference on weblogs and social media
17. Batagelj V, Mrvar A (2002) Pajek: analysis and visualization of large networks. In: Mutzel P, Jünger M, Leipert S (eds) Graph drawing. Lecture Notes in Computer Science, vol 2265. Springer, Berlin, pp 8–11
18. Blondel V, Guillaume J, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech, Theory Exp P10008
19. Adar E, Ré C (2007) Managing uncertainty in social networks. IEEE Data Eng Bull 30(2):23–31
20. Kossinets G (2006) Effects of missing data in social networks. Soc Netw 28:247–268
21. Costenbader E, Valente TW (2003) The stability of centrality measures when networks are sampled. Soc Netw 25:283–307
22. Borgatti SP, Carley KM, Krackhardt D (2006) On the robustness of centrality measures under conditions of imperfect data. Soc Netw 28(2):124–136
23. Svenson P (2008) Social network analysis of uncertain networks. In: Proceedings of the 2nd Skövde workshop on information fusion topics
24. Dahlin J, Svenson P (2011) A method for community detection in uncertain networks. In: Proceedings of the European intelligence and security informatics conference, EISIC 2011
25. Yager RR (2008) Intelligent social network analysis using granular computing. Int J Intell Syst 23:1196–1219
26. Dahlin J (2011) Community detection in imperfect networks. Master's thesis, Umeå University
27. Opsahl T, Agneessens F, Skvoretz J (2010) Node centrality in weighted networks: generalizing degree and shortest paths. Soc Netw 32(3):245–251
28. Newman MEJ (2004) Analysis of weighted networks. Phys Rev E 70:056131
29. Barrat A, Barthelemy M, Pastor-Satorras R, Vespignani A (2004) The architecture of complex weighted networks. Proc Natl Acad Sci (PNAS) 101:3747
30. Brandes U (2001) A faster algorithm for betweenness centrality. J Math Sociol 25:163–177
31. Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci (PNAS) 99(12):7821–7826
32. Newman MEJ (2006) Modularity and community structure in networks. Proc Natl Acad Sci USA 103(23):8577–8582
33. Feldman R, Sanger J (2007) The text mining handbook: advanced approaches in analyzing unstructured data. Cambridge University Press, Cambridge
34. Nadeau D, Sekine S (2007) A survey of named entity recognition and classification. Linguist Investig 30(1):3–26
35. Hasegawa T, Sekine S, Grishman R (2004) Discovering relations among named entities from large corpora. In: Proceedings of the 42nd annual meeting on association for computational linguistics
36. Doddington G, Mitchell A, Przybocki M, Ramshaw L, Strassel S, Weischedel R (2004) The automatic content extraction (ACE) program: tasks, data, and evaluation. In: Proceedings of LREC'04
37. Banko M, Etzioni O (2008) The tradeoffs between open and traditional relation extraction. In: Proceedings of ACL-08: HLT, pp 28–36
38. Zelenko D, Aone C, Richardella A (2003) Kernel methods for relation extraction. J Mach Learn Res 3:1083–1106
39. Mesquita F, Merhav Y, Barbosa D (2010) Extracting information networks from the blogosphere: state-of-the-art and challenges. In: Proceedings of the fourth international conference on weblogs and social media
40. Banko M, Cafarella MJ, Soderland S, Broadhead M, Etzioni O (2007) Open information extraction from the web. In: Proceedings of the 20th international joint conference on artificial intelligence, pp 2670–2676
41. Zhu J, Nie Z, Liu X, Zhang B, Wen J-R (2009) StatSnowball: a statistical approach to extracting entity relationships. In: Proceedings of the 18th international conference on world wide web, ser. WWW '09, pp 101–110
42. GuoDong Z, Jian S, Jie Z, Min Z (2005) Exploring various knowledge in relation extraction. In: Proceedings of the 43rd annual meeting on association for computational linguistics, pp 427–434
43. Morris JF, Anthony K, Kennedy KT, Deckro RF (2011) Extraction distractions: a comparison of social network model construction methods. In: Proceedings of the 2011 European intelligence and security informatics conference, EISIC 2011
44. Makrehchi M, Kamel MS (2005) Building social networks from web documents: a text mining approach. In: Proceedings of the 2nd LORNET scientific conference
45. Elson DK, Dames N, McKeown KR (2010) Extracting social networks from literary fiction. In: Proceedings of the 48th annual meeting of the association for computational linguistics, pp 138–147
46. Bird S, Klein E, Loper E (2009) Natural language processing with Python: analyzing text with the natural language toolkit. O'Reilly Media
47. Fang Y, Chang KC-C (2011) Searching patterns for relation extraction over the Web: rediscovering the pattern-relation duality. In: Proceedings of the fourth ACM international conference on Web search and data mining, ser. WSDM '11, pp 825–834
48. Shafer G (1976) A mathematical theory of evidence. Princeton University Press, Princeton

Acknowledgements

This work was supported by the R&D programme of the Swedish Armed Forces. We would like to express our thanks to the other members of the FOI Information Fusion and Data Mining group and the VIA project for fruitful discussions and valuable feedback.

Author information

Authors and Affiliations

Swedish Defence Research Agency (FOI), SE-164 90, Stockholm, Sweden
Fredrik Johansson & Pontus Svenson

Corresponding author

Correspondence to Fredrik Johansson.

Editor information

Editors and Affiliations

Department of Computer Engineering, TOBB University, Sogutozu Cad No. 43, Sogutozu Ankara, Turkey
Tansel Özyer
Information Technologies Institute, TUBITAK BILGEM, Gebze, Kocaeli, 41470, Turkey
Zeki Erdem
Computer Science, University of Calgary, University Dr. NW 2500, Calgary, T2N 1N4, Canada
Jon Rokne
American University of Sharjah, Universities City, Sharjah, United Arab Emirates
Suheil Khoury

Cite this chapter

Johansson, F., Svenson, P. (2013). Constructing and Analyzing Uncertain Social Networks from Unstructured Textual Data. In: Özyer, T., Erdem, Z., Rokne, J., Khoury, S. (eds) Mining Social Networks and Security Informatics. Lecture Notes in Social Networks. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6359-3_3

DOI: https://doi.org/10.1007/978-94-007-6359-3_3
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-6358-6
Online ISBN: 978-94-007-6359-3
eBook Packages: Computer Science, Computer Science (R0)