Abstract
Pattern-based clustering has broad applications in microarray data analysis, customer segmentation, e-business data analysis, etc. However, pattern-based clustering often returns a large number of highly overlapping clusters, which makes it hard for users to identify interesting patterns in the mining results. Moreover, a general model for pattern-based clustering is lacking: different kinds of patterns, or different measures of pattern coherence, may require different algorithms. In this paper, we address these two problems by proposing a general quality-driven approach to mining top-k quality pattern-based clusters. We examine our quality-driven approach using real-world microarray data sets. The experimental results show that our method is general, effective and efficient.
This research is partly supported by NSF grants DBI-0234895 and IIS-0308001, NIH grant 1 P20 GM067650-01A1, the Endowed Research Fellowship and the President Research Grant from Simon Fraser University. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiang, D., Pei, J., Zhang, A. (2005). A General Approach to Mining Quality Pattern-Based Clusters from Microarray Data. In: Zhou, L., Ooi, B.C., Meng, X. (eds) Database Systems for Advanced Applications. DASFAA 2005. Lecture Notes in Computer Science, vol 3453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408079_18
DOI: https://doi.org/10.1007/11408079_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25334-1
Online ISBN: 978-3-540-32005-0
eBook Packages: Computer Science, Computer Science (R0)