Skip to main content

Report on the XML Mining Track at INEX 2005 and INEX 2006

Categorization and Clustering of XML Documents

  • Conference paper
Comparative Evaluation of XML Information Retrieval Systems (INEX 2006)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4518))

Abstract

This article is a report concerning the two years of the XML Mining track at INEX (2005 and 2006). We focus here on the classification and clustering of XML documents. We detail these two tasks and the corpus used for this challenge and then present a summary of the different methods proposed by the participants. We last compare the results obtained during the two years of the track.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Maes, F., Denoyer, L., Gallinari, P.: XML structure mapping application to the pascal INEX 2006 XML document mining track. In: Workshop of the INitiative for the Evaluation of XML Retrieval (2006)

    Google Scholar 

  2. Gilleron, R., Jousse, F., Tellier, I., Tommasi, M.: XML document transformation with conditional random fields. In: INEX 2006 (2007)

    Google Scholar 

  3. Fuhr, N., Gövert, N., Kazai, G., Lalmas, M., (eds.): Proceedings of the First Workshop of the INitiative for the Evaluation of XML Retrieval (INEX), Schloss Dagstuhl, Germany, December 9-11, 2002. In: Fuhr, N., Gövert, N., Kazai, G., Lalmas, M., (eds.) Workshop of the INitiative for the Evaluation of XML Retrieval (2002)

    Google Scholar 

  4. Denoyer, L., Gallinari, P.: The Wikipedia XML Corpus. SIGIR Forum (2006)

    Google Scholar 

  5. Vercoustre, A.M., Fegas, M., Gul, S., Lechevallier, Y.: A flexible structured-based representation for XML document mining. In: Workshop of the INitiative for the Evaluation of XML Retrieval, pp. 443–457 (2005)

    Google Scholar 

  6. Garboni, C., Masseglia, F., Trousse, B.: Sequential pattern mining for structure-based XML document classification. In: Workshop of the INitiative for the Evaluation of XML Retrieval, pp. 458–468 (2005)

    Google Scholar 

  7. Candillier, L., Tellier, I., Torre, F.: Transforming XML trees for efficient classification and clustering. In: Workshop of the INitiative for the Evaluation of XML Retrieval, pp. 469–480 (2005)

    Google Scholar 

  8. Hagenbuchner, M., Sperduti, A., Tsoi, A.C., Trentini, F., Scarselli, F., Gori, M.: Clustering XML documents using self-organizing maps for structures. In: Workshop of the INitiative for the Evaluation of XML Retrieval, pp. 481–496 (2005)

    Google Scholar 

  9. Kc, M., Hagenbuchner, M., Tsoi, A., Scarselli, F., Gori, M., Sperduti, A.: XML document mining using contextual self-organizing maps for structures. In: Workshop of the INitiative for the Evaluation of XML Retrieval (2006)

    Google Scholar 

  10. Doucet, A., Lehtonen, M.: Unsupervised classification of text-centric XML document collections. In: Workshop of the INitiative for the Evaluation of XML Retrieval (2006)

    Google Scholar 

  11. Knijf, J.D.: Fat-cat: Frequent attributes tree based classification. In: Workshop of the INitiative for the Evaluation of XML Retrieval (2006)

    Google Scholar 

  12. Tran, T., Nayak, R., Raymond, K.: Clustering XML documents by structural similarity with pcxss. In: Workshop of the INitiative for the Evaluation of XML Retrieval (2006)

    Google Scholar 

  13. Nayak, R., Xu, S.: XML documents clustering by structures. In: Workshop of the INitiative for the Evaluation of XML Retrieval, pp. 432–442 (2005)

    Google Scholar 

  14. Xing, G., Xia, Z.: Classifying XML documents based on structure/content similarity. In: Workshop of the INitiative for the Evaluation of XML Retrieval (2006)

    Google Scholar 

  15. Yong, S.L., Hagenbuchner, M., Tsoi, A., Scarselli, F., Gori, M.: XML document mining using graph neural network. In: Workshop of the INitiative for the Evaluation of XML Retrieval (2006)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Norbert Fuhr Mounia Lalmas Andrew Trotman

Rights and permissions

Reprints and permissions

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Denoyer, L., Gallinari, P., Vercoustre, AM. (2007). Report on the XML Mining Track at INEX 2005 and INEX 2006. In: Fuhr, N., Lalmas, M., Trotman, A. (eds) Comparative Evaluation of XML Information Retrieval Systems. INEX 2006. Lecture Notes in Computer Science, vol 4518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73888-6_41

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-73888-6_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73887-9

  • Online ISBN: 978-3-540-73888-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics