COSINE: A Vertical Group Difference Approach to Contrast Set Mining

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6657)

Abstract

Contrast sets have been shown to be a useful mechanism for describing differences between groups. A contrast set is a conjunction of attribute-value pairs whose distribution differs significantly across groups. These groups are defined by a selected property that distinguishes one from the other (e.g., customers who default on their mortgage versus those who do not). In this paper, we propose a new search algorithm that uses a vertical approach for mining maximal contrast sets on categorical and quantitative data. We utilize a novel yet simple discretization technique, akin to simple binning, for continuous-valued attributes. Our experiments on real datasets demonstrate that our approach is more efficient than two previously proposed algorithms, and more effective in filtering interesting contrast sets.
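To make the definition concrete, the following is a minimal sketch of how a candidate contrast set could be evaluated in a vertical (row-id set) layout. It is not the COSINE algorithm: the toy mortgage data, the function names (`build_tidsets`, `group_supports`, `chi_square_2x2`), and the use of an uncorrected chi-square statistic on a 2x2 table are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy data: each row is one customer; "default" is the group attribute.
ROWS = [
    {"default": "yes", "employment": "self"},
    {"default": "yes", "employment": "self"},
    {"default": "yes", "employment": "self"},
    {"default": "yes", "employment": "salaried"},
    {"default": "no", "employment": "self"},
    {"default": "no", "employment": "salaried"},
    {"default": "no", "employment": "salaried"},
    {"default": "no", "employment": "salaried"},
]


def build_tidsets(rows):
    """Vertical layout: map each (attribute, value) pair to the set of row ids that contain it."""
    tidsets = defaultdict(set)
    for tid, row in enumerate(rows):
        for attr, value in row.items():
            tidsets[(attr, value)].add(tid)
    return tidsets


def group_supports(contrast_set, tidsets, group_attr, n_rows):
    """Return {group value: (rows in group matching the conjunction, group size)}."""
    covered = set(range(n_rows))
    for pair in contrast_set:
        # A conjunction of pairs is the intersection of their row-id sets.
        covered &= tidsets.get(pair, set())
    supports = {}
    for (attr, value), tids in tidsets.items():
        if attr == group_attr:
            supports[value] = (len(covered & tids), len(tids))
    return supports


def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


tidsets = build_tidsets(ROWS)
candidate = [("employment", "self")]
supports = group_supports(candidate, tidsets, "default", len(ROWS))

yes_hit, yes_size = supports["yes"]
no_hit, no_size = supports["no"]
statistic = chi_square_2x2(yes_hit, yes_size - yes_hit, no_hit, no_size - no_hit)

print(supports)
print(statistic)
```

In this toy example the candidate covers 3 of 4 defaulting customers and 1 of 4 non-defaulting customers; whether such a difference counts as significant depends on the chosen test and threshold, which this sketch leaves open.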

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Simeon, M., Hilderman, R. (2011). COSINE: A Vertical Group Difference Approach to Contrast Set Mining. In: Butz, C., Lingras, P. (eds) Advances in Artificial Intelligence. Canadian AI 2011. Lecture Notes in Computer Science, vol 6657. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21043-3_43

  • DOI: https://doi.org/10.1007/978-3-642-21043-3_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21042-6

  • Online ISBN: 978-3-642-21043-3

  • eBook Packages: Computer Science, Computer Science (R0)
