Bi-stochastic Matrix Approximation Framework for Data Co-clustering

Labiod, Lazhar; Nadif, Mohamed

doi:10.1007/978-3-319-46349-0_24

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9897)

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

Matrix approximation approaches such as Singular Value Decomposition (SVD) and Non-negative Matrix Tri-Factorization (NMTF) have recently been shown to be useful and effective for tackling the co-clustering problem. In this work, we embed co-clustering in a Bistochastic Matrix Approximation (BMA) framework and derive, from the double k-means objective function, a new formulation of the criterion to optimize. First, we show that double k-means is equivalent to an algebraic BMA problem under suitable constraints. Second, we propose an iterative process that seeks the optimal simultaneous partitions of the rows and columns of the data; the solution is given as the steady state of a Markov chain process. We develop two iterative algorithms: the first learns the row and column similarity matrices, and the second obtains the simultaneous row and column partitions. Numerical experiments on simulated and real datasets demonstrate the interest of our approach, which does not require knowledge of the number of co-clusters.
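
The abstract does not give the algorithms themselves, so the following is only a minimal sketch of what approximating a similarity matrix by a bistochastic one can look like. It uses Sinkhorn-Knopp scaling, a standard technique that alternately normalizes rows and columns; it is swapped in for illustration and is not claimed to be the authors' method. The helper `row_similarity` (a Gram matrix of the rows), the function name `sinkhorn_knopp`, and the toy data are all assumptions.

```python
def sinkhorn_knopp(matrix, n_iter=200, tol=1e-9):
    """Alternately rescale the rows and columns of a strictly positive
    square matrix until every row and column sums to (approximately) 1."""
    n = len(matrix)
    s = [row[:] for row in matrix]  # work on a copy
    for _ in range(n_iter):
        # Row step: divide each row by its sum.
        for i in range(n):
            total = sum(s[i])
            s[i] = [v / total for v in s[i]]
        # Column step: divide each column by its sum.
        col_sums = [sum(s[i][j] for i in range(n)) for j in range(n)]
        s = [[s[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
        # Columns now sum to 1; stop once the rows do as well.
        if max(abs(sum(row) - 1.0) for row in s) < tol:
            break
    return s


def row_similarity(data):
    """Hypothetical similarity choice: inner products between rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in data] for r1 in data]


# Toy data (assumed): rows 0-1 and rows 2-3 form two visible groups.
data = [
    [5.0, 4.0, 0.1, 0.2],
    [4.0, 5.0, 0.2, 0.1],
    [0.1, 0.2, 5.0, 4.0],
    [0.2, 0.1, 4.0, 5.0],
]

S = sinkhorn_knopp(row_similarity(data))
row_sums = [sum(r) for r in S]
col_sums = [sum(S[i][j] for i in range(len(S))) for j in range(len(S))]
```

Because `S` is non-negative with unit row sums, it can be read as the transition matrix of a Markov chain over the rows, which is the kind of object whose steady state the abstract refers to. In this toy case the entries of `S` are larger within each group of rows than across groups.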

Acknowledgments

This work has been funded by AAP Sorbonne Paris Cité.

Author information

Authors and Affiliations

LIPADE, University Paris Descartes, 75006, Paris, France
Lazhar Labiod & Mohamed Nadif

Corresponding author

Correspondence to Lazhar Labiod.

Editor information

Editors and Affiliations

Stockholm University, Stockholm, Sweden
Henrik Boström
Leiden University, Leiden, The Netherlands
Arno Knobbe
University of Porto, Porto, Portugal
Carlos Soares
Stockholm University, Stockholm, Sweden
Panagiotis Papapetrou

About this paper

Cite this paper

Labiod, L., Nadif, M. (2016). Bi-stochastic Matrix Approximation Framework for Data Co-clustering. In: Boström, H., Knobbe, A., Soares, C., Papapetrou, P. (eds) Advances in Intelligent Data Analysis XV. IDA 2016. Lecture Notes in Computer Science, vol 9897. Springer, Cham. https://doi.org/10.1007/978-3-319-46349-0_24

DOI: https://doi.org/10.1007/978-3-319-46349-0_24
Published: 21 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46348-3
Online ISBN: 978-3-319-46349-0
eBook Packages: Computer Science (R0)