Modeling Navigation Patterns of Visitors of Unstructured Websites

  • Conference paper
Research and Development in Intelligent Systems XXII (SGAI 2005)

Abstract

In this paper we describe a practical approach to modeling the navigation patterns of visitors to unstructured websites. These patterns are derived from web logs that are enriched with three kinds of information: (1) the content type of the visited pages, (2) the visitor type, and (3) the location of the visitor. We developed an intelligent Text Mining system, iTM, which supports the process of classifying web pages into a number of pre-defined categories. With the help of this system we were able to reduce the labeling effort by a factor of 10–20 without affecting the accuracy of the final result too much. Another feature of our approach is the use of a new technique for modeling navigation patterns: navigation trees. They provide a very informative graphical representation of the most frequent sequences of categories of visited pages.
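The abstract does not spell out how navigation trees are built, so the following is only a minimal sketch of one plausible reading: a prefix tree over per-session sequences of page categories, where each node counts how many sessions pass through it. The names `Node`, `build_navigation_tree`, `frequent_paths`, the `min_count` threshold, and the sample category labels are all assumptions made for illustration and do not come from the paper.

```python
from collections import defaultdict


class Node:
    """One node of the (hypothetical) navigation tree."""

    def __init__(self):
        self.count = 0                      # sessions passing through this node
        self.children = defaultdict(Node)   # next category -> child node


def build_navigation_tree(sessions):
    """Insert every category sequence into a prefix tree with visit counts."""
    root = Node()
    for session in sessions:
        node = root
        node.count += 1
        for category in session:
            node = node.children[category]
            node.count += 1
    return root


def frequent_paths(node, min_count, prefix=()):
    """Return (path, count) pairs for prefixes seen at least min_count times."""
    paths = []
    for category, child in sorted(node.children.items()):
        if child.count >= min_count:
            path = prefix + (category,)
            paths.append((path, child.count))
            paths.extend(frequent_paths(child, min_count, path))
    return paths


# Made-up sessions: each list is the sequence of page categories in one visit.
sessions = [
    ["home", "products", "contact"],
    ["home", "products", "pricing"],
    ["home", "news"],
    ["products", "pricing"],
]
tree = build_navigation_tree(sessions)
result = frequent_paths(tree, min_count=2)
for path, count in result:
    print(" > ".join(path), count)
```

Pruning by a minimum count keeps only the branches that many sessions share, which is one simple way to obtain a tree of the most frequent category sequences that is small enough to draw.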

Copyright information

© 2006 Springer-Verlag London Limited

About this paper

Cite this paper

Balog, K., Hofgesang, P., Kowalczyk, W. (2006). Modeling Navigation Patterns of Visitors of Unstructured Websites. In: Bramer, M., Coenen, F., Allen, T. (eds) Research and Development in Intelligent Systems XXII. SGAI 2005. Springer, London. https://doi.org/10.1007/978-1-84628-226-3_10

  • DOI: https://doi.org/10.1007/978-1-84628-226-3_10

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84628-225-6

  • Online ISBN: 978-1-84628-226-3

  • eBook Packages: Computer Science (R0)
