Extracting Extended Web Logs to Identify the Origin of Visits and Search Keywords

  • Conference paper
Intelligent Informatics

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 182)

Abstract

Web usage mining is the extraction of information from web log data. The extended web log file records user traffic and behavior, together with the browser type, its version, and the operating system used. Mining these logs reveals the origin of each visit (the referring website) and the popular keywords used to reach a website. This paper proposes an approach based on the indiscernibility relation of rough set theory to extract information from extended web logs, identifying the origin of visits and the keywords used to reach a website. The results can inform better website design and search engine optimization.
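
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which this page does not show. It only applies the standard rough-set definition that two objects are indiscernible with respect to an attribute set when they agree on every attribute in that set. The sample records, the parameter names in `KEYWORD_PARAMS`, and the helpers `describe` and `indiscernibility_classes` are all assumptions made for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse, parse_qs

# Hypothetical sample records: (visitor IP, referrer URL) pairs such as
# the referrer field of an extended log entry might supply.
SAMPLE_VISITS = [
    ("10.0.0.1", "http://www.google.com/search?q=rough+set+theory"),
    ("10.0.0.2", "http://www.google.com/search?q=web+usage+mining"),
    ("10.0.0.3", "http://search.yahoo.com/search?p=rough+set+theory"),
    ("10.0.0.4", "http://example.org/links.html"),
    ("10.0.0.5", "-"),  # "-" stands for a visit with no referrer
]

# Assumption: these query parameters carry the search phrase.
KEYWORD_PARAMS = ("q", "p", "query")


def describe(referrer):
    """Map a referrer string to the attribute pair (origin, keywords)."""
    if referrer in ("", "-"):
        return ("direct", "")
    parts = urlparse(referrer)
    query = parse_qs(parts.query)  # decodes "+" into spaces
    for name in KEYWORD_PARAMS:
        if name in query:
            return (parts.netloc.lower(), query[name][0].lower())
    return (parts.netloc.lower(), "")


def indiscernibility_classes(visits, attributes):
    """Partition visits into classes that agree on the chosen attributes."""
    classes = defaultdict(list)
    for visitor, referrer in visits:
        row = dict(zip(("origin", "keywords"), describe(referrer)))
        key = tuple(row[a] for a in attributes)
        classes[key].append(visitor)
    return dict(classes)


by_origin = indiscernibility_classes(SAMPLE_VISITS, ("origin",))
by_keywords = indiscernibility_classes(SAMPLE_VISITS, ("keywords",))
```

Each key in `by_origin` is one referring host and each key in `by_keywords` is one search phrase, so the size of a class gives the number of visits sharing that origin or phrase. Passing `("origin", "keywords")` as the attribute set yields the finer partition in which visits must agree on both.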



Author information

Corresponding author

Correspondence to Jeeva Jose.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jose, J., Sojan Lal, P. (2013). Extracting Extended Web Logs to Identify the Origin of Visits and Search Keywords. In: Abraham, A., Thampi, S. (eds) Intelligent Informatics. Advances in Intelligent Systems and Computing, vol 182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32063-7_46

  • DOI: https://doi.org/10.1007/978-3-642-32063-7_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32062-0

  • Online ISBN: 978-3-642-32063-7

  • eBook Packages: Engineering, Engineering (R0)
