A Journey in Pattern Mining

Zaki, Mohammed J.

doi:10.1007/978-3-642-28047-4_16

Mohammed J. Zaki

Abstract

The traditional research paradigm in the sciences was hypothesis-driven. Over the last decade or so, this view has been replaced with a data-driven view of scientific research. In almost all fields of scientific endeavor, large research teams are systematically collecting data on questions of great import. Knowledge and insights are gained through data analysis and mining, feeding this inversion of science: rather than going from hypothesis to data, we use data to generate and validate hypotheses and to build knowledge and understanding. The same can be said for applications in the commercial realm.

Author information

Authors and Affiliations

Rensselaer Polytechnic Institute, Troy, NY, USA
Mohammed J. Zaki

Corresponding author

Correspondence to Mohammed J. Zaki.

Editor information

Editors and Affiliations

School of Computing, University of Portsmouth, Buckingham Building, Lion Terrace, Portsmouth, PO1 3HE, United Kingdom
Mohamed Medhat Gaber

About this chapter

Cite this chapter

Zaki, M.J. (2012). A Journey in Pattern Mining. In: Gaber, M. (eds) Journeys to Data Mining. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28047-4_16

DOI: https://doi.org/10.1007/978-3-642-28047-4_16
Published: 02 April 2012
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28046-7
Online ISBN: 978-3-642-28047-4
eBook Packages: Computer Science, Computer Science (R0)
