Performance evaluation of a parallel cascade semijoin algorithm for computing path expressions in object database systems

  • Regular Papers
  • Journal of Computer Science and Technology

Abstract

With the emergence of new applications, especially Web-based ones such as e-commerce, digital libraries, and DNA banks, object database systems offer stronger functionality than other kinds of database systems because of their ability to represent complex semantics and relationships. A distinguishing feature of object database systems is the path expression. Most queries on an object database are based on path expressions because they are the most natural and convenient way to access the database, for example, to navigate the hyperlinks in a Web-based database. Executing a path expression is usually extremely expensive on a very large database, so improving the efficiency of path expression execution is critical to the performance of object databases. This paper explores the parallel processing of path expressions on distributed object databases as an important approach to high-performance query processing. Several algorithms for computing path expressions and optimizing their processing have been proposed for centralized environments, but few approaches have been presented for computing path expressions in parallel. In this paper, a new parallel algorithm for computing path expressions, named Parallel Cascade Semijoin (PCSJ), is proposed. Moreover, a new scheduling strategy called the right-deep zigzag tree is designed to further improve the performance of the PCSJ algorithm. The experiments were carried out in a NOW (network of workstations) distributed and parallel environment. The results show that the PCSJ algorithm outperforms two other parallel algorithms, the parallel version of the forward pointer chasing algorithm (PFPC) and the index splitting parallel algorithm (IndexSplit), when computing path expressions with restrictive predicates, and that the right-deep zigzag tree scheduling strategy performs better than the right-deep tree scheduling strategy.
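To make the notion of a path expression with a restrictive predicate concrete, the following is a minimal single-process sketch in Python. It is not the PCSJ algorithm from the paper, whose details the abstract does not give. The toy schema (Student.advisor, Professor.dept, Department.name), the data, and both function names are hypothetical, and starting the semijoin chain from the predicate end is an assumption made only for illustration.

```python
# Hypothetical toy object store: Student.advisor -> Professor.dept -> Department.name
students = {
    1: {"name": "Ann", "advisor": 10},
    2: {"name": "Bob", "advisor": 11},
    3: {"name": "Cy", "advisor": 10},
}
professors = {10: {"dept": 100}, 11: {"dept": 101}}
departments = {100: {"name": "CS"}, 101: {"name": "Math"}}


def forward_pointer_chasing(pred):
    """Follow the references of every root object, then test the predicate."""
    result = []
    for sid, student in students.items():
        prof = professors[student["advisor"]]
        dept = departments[prof["dept"]]
        if pred(dept):
            result.append(sid)
    return sorted(result)


def cascade_semijoin(pred):
    """Filter the last class first, then semijoin backwards along the path.

    Assumption for illustration only: each step keeps just the object ids
    that reference a surviving id from the step before.
    """
    dept_ids = {d for d, dept in departments.items() if pred(dept)}
    prof_ids = {p for p, prof in professors.items() if prof["dept"] in dept_ids}
    return sorted(s for s, stu in students.items() if stu["advisor"] in prof_ids)


def is_cs(dept):
    """Restrictive predicate on the tail of the path: advisor.dept.name == 'CS'."""
    return dept["name"] == "CS"


by_chasing = forward_pointer_chasing(is_cs)
by_semijoin = cascade_semijoin(is_cs)
```

Both functions answer the same query, the students whose advisor's department is named "CS". The pointer-chasing version dereferences every student, while the semijoin version shrinks the candidate set at each step, which is one plausible reason a restrictive predicate could favour a semijoin-style evaluation.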

Author information

Corresponding author

Correspondence to Wang Guoren.

Additional information

This work is supported by the Teaching and Research Award Programme for Outstanding Young Teachers in Higher Education Institutions of the Ministry of Education, China (TRAPOYT), the Foundation for University Key Teachers of the Ministry of Education, China, and the Cross-Century Excellent Young Teacher Foundation of the Ministry of Education of China.

WANG Guoren is a professor at Northeastern University, China. He received his B.E., M.E. and Ph.D. degrees from Northeastern University in 1988, 1991, and 1996, respectively. He did post-doctoral work at Kyushu University from 1996 to 1997. He is a member of ACM and ACM SIGMOD. His research interests include multi-database systems, object-oriented database systems, document database systems, parallel processing, query optimization, and information integration.

YU Ge is a professor at Northeastern University, China. He received his B.E. and M.E. degrees from Northeastern University in 1982 and 1986, respectively, and his Ph.D. degree from Kyushu University, Japan, in 1996. He is a member of the CIMS Expert Group of the National ‘863’ High-Tech Programme of China, and a member of IPSJ, ACM, and ACM SIGMOD. His research interests include distributed and parallel database systems, object-oriented database systems, multi-database and information integration, data warehousing and data mining, and transactional workflow management.

About this article

Wang, G., Yu, G. Performance evaluation of a parallel cascade semijoin algorithm for computing path expressions in object database systems. J. Comput. Sci. & Technol. 17, 140–151 (2002). https://doi.org/10.1007/BF02962206
