Partial Replication: Achieving Scalability in Redundant Arrays of Inexpensive Databases

  • Conference paper
Principles of Distributed Systems (OPODIS 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3144)

Abstract

Clusters of workstations are becoming increasingly popular for powering data server applications such as large-scale Web sites and e-Commerce applications. There has been much research on scaling the front tiers (web servers and application servers) using clusters, but databases usually remain on large dedicated SMP machines. In this paper, we focus on scaling the database tier using clusters of commodity hardware. Our approach consists of studying different replication strategies to achieve various degrees of performance and fault tolerance. Redundant Array of Inexpensive Databases (RAIDb) is to databases what RAID is to disks. We concentrate on RAIDb-1, which offers full replication, and RAIDb-2, which introduces partial replication, in which the user can define the degree of replication of each database table. We present a Java implementation of RAIDb called Clustered JDBC, or C-JDBC. C-JDBC achieves both database performance scalability and high availability at the middleware level without changing existing applications. We show, using the TPC-W benchmark, that partial replication (RAIDb-2) can offer better performance scalability (up to 25%) than full replication by allowing fine-grained control over replication. Distributing the replicas of frequently written tables, and restricting them to a small set of backends, reduces I/O usage and improves CPU utilization on each cluster node.
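
To make the idea of per-table replication concrete, the following is a minimal sketch of how a middleware layer might route requests under partial replication. It is not the C-JDBC API: the class `PartialReplicationSketch`, its methods, the backend names (`db1` to `db4`) and the table names are all hypothetical, chosen only for illustration. The sketch assumes that a write must reach every backend holding a replica of the table, and that a read can be served by any backend holding all the tables the query touches.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of RAIDb-2 style routing; not the C-JDBC API. */
class PartialReplicationSketch {
    // Table name -> backends that hold a replica of that table.
    private final Map<String, Set<String>> placement = new LinkedHashMap<>();

    /** Declares which backends replicate the given table. */
    void replicate(String table, String... backends) {
        Set<String> hosts = placement.computeIfAbsent(table, t -> new LinkedHashSet<>());
        Collections.addAll(hosts, backends);
    }

    /** Assumption: a write is sent to every backend that holds the table. */
    List<String> writeTargets(String table) {
        return new ArrayList<>(placement.getOrDefault(table, Collections.emptySet()));
    }

    /** Assumption: a read may run on any backend that holds all tables it touches. */
    List<String> readCandidates(String... tables) {
        if (tables.length == 0) {
            return new ArrayList<>();
        }
        Set<String> candidates =
            new LinkedHashSet<>(placement.getOrDefault(tables[0], Collections.emptySet()));
        for (String table : tables) {
            candidates.retainAll(placement.getOrDefault(table, Collections.emptySet()));
        }
        return new ArrayList<>(candidates);
    }

    public static void main(String[] args) {
        PartialReplicationSketch cluster = new PartialReplicationSketch();
        // Illustrative placement: a read-mostly table everywhere,
        // a frequently written table on only two backends.
        cluster.replicate("item", "db1", "db2", "db3", "db4");
        cluster.replicate("orders", "db1", "db2");

        System.out.println("write orders -> " + cluster.writeTargets("orders"));
        System.out.println("read item    -> " + cluster.readCandidates("item"));
        System.out.println("read join    -> " + cluster.readCandidates("item", "orders"));
    }
}
```

In this placement, a write to the frequently written table touches only two of the four backends, while reads of the read-mostly table can still be spread over all four. That is the trade-off the abstract describes: fewer replicas of write-heavy tables means less write work per node, at the cost of fewer candidates for queries that join against those tables.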

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cecchet, E., Marguerite, J., Zwaenepoel, W. (2004). Partial Replication: Achieving Scalability in Redundant Arrays of Inexpensive Databases. In: Papatriantafilou, M., Hunel, P. (eds) Principles of Distributed Systems. OPODIS 2003. Lecture Notes in Computer Science, vol 3144. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27860-3_8

  • DOI: https://doi.org/10.1007/978-3-540-27860-3_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22667-3

  • Online ISBN: 978-3-540-27860-3

  • eBook Packages: Springer Book Archive
