Abstract
Pattern mining is an important data mining task that involves extracting interesting associations from large transactional databases. In practice, a transactional database D is updated over time through the addition and deletion of transactions. Consequently, some previously discovered patterns may become invalid, while new patterns may emerge. This has motivated significant research in incremental mining, whose goal is to mine patterns efficiently when D is updated with additions and/or deletions of transactions, rather than mining all patterns from scratch. Incremental pattern mining algorithms are under active development for frequent patterns, sequential patterns and utility patterns. Another important type of pattern is the coverage pattern (CP), which has significant applications in areas such as banner advertising, search engine advertising and visibility mining. However, no existing work addresses incremental mining of CPs. The main contributions of this work are twofold. First, we introduce the problem of incremental mining of CPs. Second, we propose an approach, designated Comprehensive Coverage Pattern Mining, for efficiently extracting CPs under the incremental paradigm. Extensive experiments on two real click-stream datasets and one synthetic dataset demonstrate the overall effectiveness of the proposed approach.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
About this article
Cite this article
Kaushik, K., Reddy, P.K., Mondal, A. et al. An incremental framework to extract coverage patterns for dynamic databases. Int J Data Sci Anal 12, 273–291 (2021). https://doi.org/10.1007/s41060-021-00262-4