Discovering Links Between Lexical and Surface Features in Questions and Answers

  • Conference paper
Advances in Web Mining and Web Usage Analysis (WebKDD 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3932)

Abstract

Information retrieval systems based on keyword matching are evolving into question answering systems that return short passages or direct answers to questions, rather than URLs pointing to whole pages. Most open-domain question answering systems depend on manually designed hierarchies of question types. A question is first classified into a fixed type, and then hand-engineered rules associated with that type yield keywords and/or predictive annotations that are likely to match indexed answer passages. Here we seek a more data-driven approach, assisted by machine learning. We propose a simple log-linear model over a pair of feature vectors, one derived from the question and the other derived from a candidate passage. Features are extracted using a lexical network and surface context, as in named entity extraction, except that no direct supervision is available in the form of fixed entity types and their examples. Using the log-linear model, we filter candidate passages and see substantial improvement in the mean rank at which the first answer is found. The model parameters distill and reveal linguistic artifacts coupling questions and their answers, which can be used for better annotation and indexing.
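
To make the idea of a log-linear model over a pair of feature vectors concrete, the following is a minimal sketch, not the paper's implementation. It assumes binary features, one weight per (question feature, passage feature) pair, a logistic link, and plain stochastic gradient ascent with an L2 penalty. The feature names (`when`, `how_many`, `has_year`, `has_quantity`), the toy training pairs, and all function names and hyperparameters are hypothetical and chosen only for illustration.

```python
import math
from itertools import product


def pair_features(q_feats, p_feats):
    """Cross each active question feature with each active passage feature."""
    return set(product(q_feats, p_feats))


def score(weights, bias, q_feats, p_feats):
    """Probability that the passage answers the question (logistic link)."""
    z = bias + sum(weights.get(f, 0.0) for f in pair_features(q_feats, p_feats))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.5, l2=0.01):
    """Stochastic gradient ascent on the L2-penalized log-likelihood.

    examples: list of (question_features, passage_features, label),
    with label 1 if the passage contains an answer and 0 otherwise.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for q_feats, p_feats, label in examples:
            err = label - score(weights, bias, q_feats, p_feats)
            bias += lr * err
            for f in pair_features(q_feats, p_feats):
                w = weights.get(f, 0.0)
                weights[f] = w + lr * (err - l2 * w)
    return weights, bias


def rerank(weights, bias, q_feats, passages):
    """Order candidate passages by decreasing model score."""
    return sorted(passages,
                  key=lambda p: score(weights, bias, q_feats, p),
                  reverse=True)


# Hypothetical toy data: the question feature is a surface cue from the
# question; the passage feature is a surface cue from the candidate passage.
EXAMPLES = [
    (("when",), ("has_year",), 1),
    (("when",), ("has_quantity",), 0),
    (("how_many",), ("has_quantity",), 1),
    (("how_many",), ("has_year",), 0),
]

weights, bias = train(EXAMPLES)
ranked = rerank(weights, bias, ("when",), [("has_quantity",), ("has_year",)])
```

In this sketch the learned weight on a pair such as `("when", "has_year")` plays the role the abstract assigns to the model parameters: a large positive value signals a coupling between a question cue and a passage cue, and sorting or thresholding on the score is one way candidate passages could be filtered.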

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chakrabarti, S. (2006). Discovering Links Between Lexical and Surface Features in Questions and Answers. In: Mobasher, B., Nasraoui, O., Liu, B., Masand, B. (eds) Advances in Web Mining and Web Usage Analysis. WebKDD 2004. Lecture Notes in Computer Science, vol 3932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11899402_8

  • DOI: https://doi.org/10.1007/11899402_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-47127-1

  • Online ISBN: 978-3-540-47128-8

  • eBook Packages: Computer Science, Computer Science (R0)
