UCYMICRA: Distributed Indexing of the Web Using Migrating Crawlers

  • Conference paper
Advances in Databases and Information Systems (ADBIS 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2798)

Abstract

Because Web documents are growing rapidly in number and change frequently, maintaining an up-to-date index for search engines is becoming a challenge. Traditional crawling methods can no longer keep up with the constantly updating and growing Web. To address this problem, we propose an alternative distributed crawling method that uses mobile agents. Our goal is a scalable crawling scheme that minimizes network utilization, keeps up with document changes, employs time realization, and is easily upgradeable.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Papapetrou, O., Papastavrou, S., Samaras, G. (2003). UCYMICRA: Distributed Indexing of the Web Using Migrating Crawlers. In: Kalinichenko, L., Manthey, R., Thalheim, B., Wloka, U. (eds) Advances in Databases and Information Systems. ADBIS 2003. Lecture Notes in Computer Science, vol 2798. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39403-7_12

  • DOI: https://doi.org/10.1007/978-3-540-39403-7_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20047-5

  • Online ISBN: 978-3-540-39403-7

  • eBook Packages: Springer Book Archive
