Abstract
Skyline queries retrieve objects that are not dominated by any other object. The result of a skyline query is relatively small, excludes less important objects, and is therefore useful for selecting an object. In this paper, we consider a method for computing skyline queries in the MapReduce framework, a de facto standard in big data analysis. At the same time, data disclosure has become a concern that must be taken into account. We therefore propose a distributed computation method in which each computer uses only a projected database, vertically split from the original database, to compute the skyline query. Because each computer can see only projected values, the proposed method localizes sensitive information in the database while retaining the efficiency of MapReduce. Extensive experiments on synthetic datasets demonstrate the efficiency of the proposed algorithm.
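To make the notion of dominance concrete, the following is a minimal single-machine sketch of a skyline query. It is not the distributed algorithm proposed in the paper. It assumes that smaller values are preferred on every attribute, and the names `dominates`, `skyline`, and the `hotels` data are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b.

    Assumption: smaller is better on every attribute. Under that convention,
    a dominates b when a is no worse on all attributes and strictly better
    on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive quadratic skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: (price, distance to the beach), both to be minimized.
hotels = [(50.0, 8.0), (80.0, 2.0), (60.0, 9.0), (90.0, 1.0), (85.0, 3.0)]
result = skyline(hotels)
print(result)
```

In this example `(60.0, 9.0)` is dominated by `(50.0, 8.0)` and `(85.0, 3.0)` is dominated by `(80.0, 2.0)`, so only the remaining three points form the skyline. The pairwise comparison needs all attribute values of both objects, which is why computing a skyline over vertically split data, where each computer sees only projected values, calls for a dedicated method.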
Acknowledgments
This work is supported by KAKENHI (23500180, 25.03040) Japan.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Siddique, M.A., Tian, H., Morimoto, Y. (2014). Distributed Skyline Computation of Vertically Splitted Databases by Using MapReduce. In: Han, WS., Lee, M., Muliantara, A., Sanjaya, N., Thalheim, B., Zhou, S. (eds) Database Systems for Advanced Applications. DASFAA 2014. Lecture Notes in Computer Science, vol 8505. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43984-5_3
DOI: https://doi.org/10.1007/978-3-662-43984-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43983-8
Online ISBN: 978-3-662-43984-5
eBook Packages: Computer Science (R0)