Distributed Search and Pattern Matching

  • Chapter
Handbook of Peer-to-Peer Networking

Abstract

Peer-to-peer (P2P) technology has triggered a wide range of distributed applications, including file sharing, distributed XML databases, distributed computing, server-less web publishing, and networked resource/service sharing. Despite the diversity of these applications, the systems share common search requirements that stem from transient node populations and content volatility. In such a dynamic environment, users do not have exact information about available resources, so queries are based on partial information. This requires the search mechanism to be flexible. At the same time, the search mechanism must be bandwidth efficient in order to support large networks. A variety of search techniques have been proposed to reconcile these conflicting requirements of search efficiency and flexibility. This chapter highlights the search requirements in large-scale distributed systems and assesses how well existing distributed search techniques satisfy them. Representative search techniques from three application domains, namely P2P content sharing, service discovery, and distributed XML databases, are considered. An abstract problem formulation called Distributed Pattern Matching (DPM) is presented as well. The DPM framework can be used as a common ground for addressing the search problem in these three application domains.

Author information

Corresponding author

Correspondence to Reaz Ahmed.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Ahmed, R., Boutaba, R. (2010). Distributed Search and Pattern Matching. In: Shen, X., Yu, H., Buford, J., Akon, M. (eds) Handbook of Peer-to-Peer Networking. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09751-0_16

  • DOI: https://doi.org/10.1007/978-0-387-09751-0_16

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-09750-3

  • Online ISBN: 978-0-387-09751-0

  • eBook Packages: Computer Science, Computer Science (R0)
