Abstract
Microblogging sites are important sources of situational information during natural and man-made disasters. Hence, it is important to design and test Information Retrieval (IR) systems that retrieve information from microblogs during disasters. With this perspective, a track focused on microblog retrieval during disaster events was organized at the 8th meeting of the Forum for Information Retrieval Evaluation (FIRE 2016). A collection of about 50,000 microblogs posted during the Nepal Earthquake in April 2015 was released, along with a set of seven pragmatic information needs that arise during a disaster situation. The task was to retrieve microblogs relevant to these information needs. Ten teams participated in the task, and fifteen runs were submitted. Evaluating the performance of the retrieval methodologies submitted by the participants revealed several challenges associated with microblog retrieval. In this chapter, we describe our experience in organizing the FIRE track on microblog retrieval during disaster events. Additionally, we propose two novel methodologies for this task, which perform better than all the methodologies submitted to the FIRE track.
Notes
- 5. Since the different annotators potentially judged different sets of tweets, reporting inter-annotator agreement would not be meaningful under these circumstances.
- 6. Note that the Twitter terms and conditions prohibit direct public sharing of tweets. Hence, only the tweet-ids were distributed among the participants, along with a Python script for downloading the tweets via the Twitter API.
- 10. https://lucene.apache.org/ (2016, August 20).
- 12. The POS tagger included in the Python Natural Language Toolkit was used.
- 13. We also tried retrieval with other parts of speech, and observed that forming the query out of nouns, verbs, and adjectives gives the best retrieval performance.
- 14. The Gensim implementation of word2vec was used: https://radimrehurek.com/gensim/models/word2vec.html.
- 15. We had many ties among the rankings, e.g., the top-ranked tweet for FMT1 and the top-ranked tweet for FMT2 both had the same rank.
Acknowledgement
We thank the FIRE organizing committee for allowing us to run the track, and all the teams for their participation. This research was partially supported by a grant from the Information Technology Research Academy (ITRA), MeITY, Government of India (Ref. No.: ITRA/15 (58)/Mobile/DISARM/05).
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Basu, M., Ghosh, K., Das, S., Bandyopadhyay, S., Ghosh, S. (2018). Microblog Retrieval During Disasters: Comparative Evaluation of IR Methodologies. In: Majumder, P., Mitra, M., Mehta, P., Sankhavara, J. (eds) Text Processing. FIRE 2016. Lecture Notes in Computer Science, vol 10478. Springer, Cham. https://doi.org/10.1007/978-3-319-73606-8_2
DOI: https://doi.org/10.1007/978-3-319-73606-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73605-1
Online ISBN: 978-3-319-73606-8
eBook Packages: Computer Science, Computer Science (R0)