Extracting Extended Web Logs to Identify the Origin of Visits and Search Keywords

Jose, Jeeva; Sojan Lal, P.

doi:10.1007/978-3-642-32063-7_46

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 182)

Abstract

Web usage mining is the extraction of information from web log data. The extended web log file contains information about user traffic and behavior, as well as the browser type, its version, and the operating system used. Mining these web logs reveals the origin of each visit (the referring website) and the popular keywords used to reach a website. This paper proposes an approach based on indiscernibility in rough set theory to extract information from extended web logs, identifying the origin of visits and the keywords used to reach a website. The results can lead to better website design and search engine optimization.
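As an illustration of the idea in the abstract, the following minimal sketch parses extended log entries, extracts the referring host and search keywords, and groups entries into indiscernibility classes: sets of records that share the same values on a chosen subset of attributes. It is not the authors' implementation. The log layout (Apache-style Combined Log Format), the sample entries, the `q` query parameter used for keywords, and the names `parse_line` and `indiscernibility_classes` are all assumptions made for this example.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit, parse_qs

# Assumed layout (Combined Log Format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return host, origin, and keywords for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    parts = urlsplit(match.group("referrer"))
    # An empty network location (for example a "-" referrer) is treated as a direct visit.
    origin = parts.netloc or "direct"
    # Assumption: the search engine passes the query in the "q" parameter.
    keywords = " ".join(parse_qs(parts.query).get("q", [""])).lower()
    return {"host": match.group("host"), "origin": origin, "keywords": keywords}

def indiscernibility_classes(records, attributes):
    """Group record indices that agree on every attribute in `attributes`."""
    classes = defaultdict(list)
    for index, record in enumerate(records):
        key = tuple(record[name] for name in attributes)
        classes[key].append(index)
    return dict(classes)

# Hypothetical sample entries, invented for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2011:13:55:36 +0530] "GET /index.html HTTP/1.1" 200 2326 '
    '"http://www.google.com/search?q=rough+set+theory" "Mozilla/5.0"',
    '192.0.2.2 - - [10/Oct/2011:13:57:02 +0530] "GET /index.html HTTP/1.1" 200 2326 '
    '"http://www.google.com/search?q=web+usage+mining" "Mozilla/5.0"',
    '192.0.2.3 - - [10/Oct/2011:14:01:40 +0530] "GET /about.html HTTP/1.1" 200 1045 '
    '"-" "Mozilla/5.0"',
    '192.0.2.4 - - [10/Oct/2011:14:03:11 +0530] "GET /index.html HTTP/1.1" 200 2326 '
    '"http://www.google.com/search?q=rough+set+theory" "Mozilla/5.0"',
]

records = [r for r in map(parse_line, SAMPLE_LOG) if r is not None]
by_origin = indiscernibility_classes(records, ["origin"])
by_origin_and_keywords = indiscernibility_classes(records, ["origin", "keywords"])

for key, members in by_origin_and_keywords.items():
    print(key, members)
```

Grouping on `origin` alone gives one class per referring site, while adding `keywords` refines those classes so that visits arriving through the same search phrase fall together. The size of each class then serves as a simple popularity count for that origin or keyword.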

Author information

Authors and Affiliations

BPC College, Piravom, 686664, Kerala, India
Jeeva Jose
School of Computer Science, Mahatma Gandhi University, Kottayam, 686560, Kerala, India
P. Sojan Lal

Corresponding author

Correspondence to Jeeva Jose.

Editor information

Editors and Affiliations

Machine Intelligence Research Labs (MIR Labs), Scientific Network for Innovation and Research Excellence, MIR Labs Campus, Auburn, 98071, Washington, USA
Ajith Abraham
Indian Institute of Information Technology and Management, Technopark Campus, Trivandrum, 695581, India
Sabu M Thampi

About this paper

Cite this paper

Jose, J., Sojan Lal, P. (2013). Extracting Extended Web Logs to Identify the Origin of Visits and Search Keywords. In: Abraham, A., Thampi, S. (eds) Intelligent Informatics. Advances in Intelligent Systems and Computing, vol 182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32063-7_46

DOI: https://doi.org/10.1007/978-3-642-32063-7_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32062-0
Online ISBN: 978-3-642-32063-7
eBook Packages: Engineering (R0)
