Applicability and Interpretability of Ward’s Hierarchical Agglomerative Clustering With or Without Contiguity Constraints

Randriamihamison, Nathanaël; Vialaneix, Nathalie; Neuvial, Pierre

doi:10.1007/s00357-020-09377-y

Published: 30 September 2020

Journal of Classification, volume 38, pages 363–389 (2021)

Nathanaël Randriamihamison¹,² (ORCID: orcid.org/0000-0002-7040-8314),
Nathalie Vialaneix¹ &
Pierre Neuvial²


Abstract

Hierarchical agglomerative clustering (HAC) with Ward’s linkage has been widely used since its introduction by Ward (Journal of the American Statistical Association, 58(301), 236–244, 1963). This article reviews extensions of HAC to various input data and contiguity-constrained HAC, and provides applicability conditions. In addition, different versions of the graphical representation of the results as a dendrogram are presented and their properties are clarified. We clarify and complete the results already available in a heterogeneous literature using a uniform background. In particular, this study reveals an important distinction between a consistency property of the dendrogram and the absence of crossover within it. Finally, a simulation study shows that the constrained version of HAC can sometimes provide more relevant results than its unconstrained version, even though the constraint means that the objective criterion is optimized over a reduced set of solutions at each step. Overall, this article provides comprehensive recommendations, both for the use of HAC and constrained HAC depending on the input data, and for the representation of the results.
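
To make the two procedures compared in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes Euclidean input and the usual form of Ward's linkage (the increase in within-cluster sum of squares), and it models the contiguity constraint by allowing only neighbouring clusters, in index order, to merge. The function names `ward_linkage` and `ward_hac` and the toy data are hypothetical.

```python
import numpy as np


def ward_linkage(a, b):
    """Ward's linkage between two clusters given as (size, dim) arrays."""
    diff = a.mean(axis=0) - b.mean(axis=0)
    return len(a) * len(b) / (len(a) + len(b)) * float(diff @ diff)


def ward_hac(points, contiguous=False):
    """Naive Ward HAC; returns a list of (merged_members, linkage_value)."""
    clusters = [[i] for i in range(len(points))]  # kept in index order
    merges = []
    while len(clusters) > 1:
        k = len(clusters)
        if contiguous:
            # Assumption: only neighbouring clusters (in index order) may merge.
            candidates = [(i, i + 1) for i in range(k - 1)]
        else:
            candidates = [(i, j) for i in range(k) for j in range(i + 1, k)]
        values = [ward_linkage(points[clusters[i]], points[clusters[j]])
                  for i, j in candidates]
        # Ties are broken by taking the first minimal candidate.
        best = int(np.argmin(values))
        i, j = candidates[best]
        merged = clusters[i] + clusters[j]
        merges.append((merged, values[best]))
        clusters[i] = merged  # the merged cluster keeps the left position
        del clusters[j]
    return merges


# Hypothetical toy data: four 1-D points, listed in their contiguity order.
points = np.array([[0.0], [10.0], [1.0], [11.0]])
free_heights = [h for _, h in ward_hac(points)]
constrained_heights = [h for _, h in ward_hac(points, contiguous=True)]
print(free_heights)
print(constrained_heights)
```

On this toy input the unconstrained linkage values are non-decreasing, whereas the constrained run merges the second and third points first and then records a lower linkage value at the next step, which is a reversal of the kind the article analyses. The sketch recomputes every linkage at every step, so it is only suitable for very small inputs.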


Notes

In the rare situation when the minimal linkage is achieved by more than one merger, a choice between these mergers has to be made. Different choices are made by different implementations of HAC.
https://CRAN.R-project.org/package=adjclust
In some cases, similarity measures are also supposed to take non-negative values, but we will not make this assumption in the present article.
The detailed analysis of all examples and counter-examples of this section is provided in Appendix 2.
https://CRAN.R-project.org/package=rioja
The pre-processed and normalized data have been downloaded from the authors’ website at http://chromosome.sdsc.edu/mouse/hi-c/download.html (raw sequence data are also published on the GEO website, accession number GSE35156).

References

Ah-Pine, J., & Wang, X. (2016). Similarity based hierarchical clustering with an application to text collections. In Boström, H., Knobbe, A., Soares, C., & Papapetrou, P. (Eds.) Proceedings of the 15th International Symposium on Intelligent Data Analysis (IDA 2016), Lecture Notes in Computer Science (pp. 320–331). Stockholm.
Ambroise, C., Dehman, A., Neuvial, P., Rigaill, G., Vialaneix, N. (2019). Adjacency-constrained hierarchical clustering of a band similarity matrix with application to genomics. Algorithms for Molecular Biology, 14, 22.
Arlot, S., Brault, V., Baudry, J.-P., Maugis, C., Michel, B. (2016). capushe: CAlibrating Penalities Using Slope HEuristics. R package version 1.1.1.
Arlot, S., Celisse, A., Harchaoui, Z. (2019). A kernel multiple change-point algorithm via model selection. Submitted for publication, arXiv:1202.3878v3. Now published in JMLR; see https://jmlr.org/papers/v20/16-155.html (BibTeX entry: https://jmlr.org/papers/v20/16-155.bib).
Aronszajn, N. (1950). Theory of reproducing kernels. Transactions of the American Mathematical Society, 68(3), 337–404.
Batagelj, V. (1981). Note on ultrametric hierarchical clustering algorithms. Psychometrika, 46(3), 351–352.
Bennett, K.D. (1996). Determination of the number of zones in a biostratigraphical sequence. New Phytologist, 132(1), 155–170.
Chavent, M., Kuentz-Simonet, V., Labenne, A., Saracco, J. (2018). Clustgeo2: an R package for hierarchical clustering with spatial constraints. Computational Statistics, 33(4), 1799–1822.
Chen, J., & Ye, J. (2008). Training SVM with indefinite kernels. In Cohen, W., McCallum, A., & Roweis, S. (Eds.) Proceedings of the 25th International Conference on Machine Learning (ICML 2008) (pp. 136–146). New York: ACM.
Chen, Y., Garcia, E., Gupta, M., Rahimi, A., Cazzanti, L. (2009). Similarity-based classification: concepts and algorithm. Journal of Machine Learning Research, 10, 747–776.
Danon, L., Diaz-Guilera, A., Duch, J., Arenas, A. (2005). Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment, 2005, P09008.
Dehman, A. (2015). Spatial clustering of linkage disequilibrium blocks for genome-wide association studies, PhD thesis, Université Paris Saclay.
Dixon, J., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J., Ren, B. (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature, 485, 376–380.
Ferligoj, A., & Batagelj, V. (1982). Clustering with relational constraint. Psychometrika, 47(4), 413–426.
Fraser, J., Ferrai, C., Chiariello, A.M., Schueler, M., Rito, T., Laudanno, G., Barbieri, M., Moore, B.L., Kraemer, D.C., Aitken, S., Xie, S.Q., Morris, K.J., Itoh, M., Kawaji, H., Jaeger, I., Hayashizaki, Y., Carninci, P., Forrest, A.R., The FANTOM Consortium, Semple, C.A., Dostie, J., Pombo, A., Nicodemi, M. (2015). Hierarchical folding and reorganization of chromosomes are linked to transcriptional changes in cellular differentiation. Molecular Systems Biology, 11, 852.
Gordon, A. (1996). A survey of constrained classification. Computational Statistics & Data Analysis, 21(1), 17–29.
Grimm, E.C. (1987). CONISS: A FORTRAN 77 program for stratigraphically constrained analysis by the method of incremental sum of squares. Computers & Geosciences, 13(1), 13–35.
Haddad, N., Vaillant, C., Jost, D. (2017). IC-Finder: inferring robustly the hierarchical organization of chromatin folding. Nucleic Acids Research, 45(10), e81–e81.
Hartigan, J.A. (1967). Representation of similarity matrices by trees. Journal of the American Statistical Association, 62(320), 1140–1158.
Imakaev, M., Fudenberg, G., McCord, R., Naumova, N., Goloborodko, A., Lajoie, B., Dekker, J., Mirny, L. (2012). Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nature Methods, 9(10), 999–1003.
Johnson, S.C. (1967). Hierarchical clustering schemes. Psychometrika, 32(3), 241–254.
Krislock, N., & Wolkowicz, H. (2012). Handbook on semidefinite, conic and polynomial optimization, volume 166 of International Series in Operations Research & Management Science, chapter Euclidean distance matrices and applications, (pp. 879–914). New York: Springer.
Kruskal, J. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29(1), 1–27.
Lance, G., & Williams, W. (1967). A general theory of classificatory sorting strategies: 1. Hierarchical systems. The Computer Journal, 9(4), 373–380.
Lebart, L. (1978). Programme d’agrégation avec contraintes. Les Cahiers de l’Analyse des Données, 3(3), 275–287.
Miyamoto, S., Abe, R., Endo, Y., Takeshita, J.-I. (2015). Ward method of hierarchical clustering for non-Euclidean similarity measures. In Proceedings of the VIIth International Conference of Soft Computing and Pattern Recognition (SoCPaR 2015). Fukuoka: IEEE.
Murtagh, F., & Legendre, P. (2014). Ward’s hierarchical agglomerative clustering method: which algorithms implement Ward’s criterion. Journal of Classification, 31(3), 274–295.
Qin, J., Lewis, D.P., Noble, W.S. (2003). Kernel hierarchical gene clustering from microarray expression data. Bioinformatics, 19(16), 2097–2104.
Rammal, R., Toulouse, G., Virasoro, M.A. (1986). Ultrametricity for physicists. Reviews of Modern Physics, 58(3), 765–788.
Schleif, F.-M., & Tino, P. (2015). Indefinite proximity learning: a review. Neural Computation, 27(10), 2039–2096.
Schoenberg, I. (1935). Remarks to Maurice Fréchet’s article “Sur la définition axiomatique d’une classe d’espace distanciés vectoriellement applicable sur l’espace de Hilbert”. Annals of Mathematics, 36, 724–732.
Schölkopf, B., & Smola, A.J. (2002). Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT Press.
Steinley, D., & Hubert, L. (2008). Order-constrained solutions in K-means clustering: even better than being globally optimal. Psychometrika, 73(4), 647–664.
Strauss, T., & von Maltitz, M.J. (2017). Generalising Ward’s method for use with Manhattan distances. PLoS ONE, 12, e0168288.
Székely, G.J., & Rizzo, M.L. (2005). Hierarchical clustering via joint between-within distances: extending Ward’s minimum variance method. Journal of Classification, 22(2), 151–183.
Ward, J.H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236–244.
Wickham, H. (2016). ggplot2: elegant graphics for data analysis. New York: Springer.
Wishart, D. (1969). An algorithm for hierarchical classifications. Biometrics, 25(1), 165–170.
Young, G., & Householder, A. (1938). Discussion of a set of points in terms of their mutual distances. Psychometrika, 3, 19–22.
Zufferey, M., Tavernari, D., Oricchio, E., Ciriello, G. (2018). Comparison of computational methods for the identification of topologically associating domains. Genome Biology, 19(1), 217.

Acknowledgments

The authors would like to thank Marie Chavent for numerous instructive discussions on this paper. The authors are grateful to the GenoToul bioinformatics platform (INRAE Toulouse, http://bioinfo.genotoul.fr/) and its staff for providing computing facilities.

Funding

The PhD thesis of N.R. is funded by the INRAE/Inria doctoral program 2018. This work was also supported by the SCALES project funded by CNRS (Mission “Osez l’interdisciplinarité”).

Author information

Authors and Affiliations

INRAE, UR875 Mathématiques et Informatique Appliquées Toulouse, F-31326, Castanet-Tolosan, France
Nathanaël Randriamihamison & Nathalie Vialaneix
Institut de Mathématiques de Toulouse, UMR 5219, Université de Toulouse, CNRS UPS, F-31062, Toulouse Cedex 9, France
Nathanaël Randriamihamison & Pierre Neuvial

Corresponding author

Correspondence to Nathanaël Randriamihamison.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1. Proof of Proposition 2

We begin by noting that by Proposition 1, the only reversals that may occur are crossovers. With the notation of Proposition 2, a crossover at step t + 1 corresponds to the situation where:

$$ \delta(G_{l}, G_{r}) \geq \delta(G_{l} \cup G_{r}, G_{\bar{r}}) \quad \textrm{or} \quad \delta(G_{l}, G_{r}) \geq \delta(G_{l} \cup G_{r}, G_{\bar{l}}). $$

By symmetry, we focus on the first case. With the notation of Proposition 2, and using the Lance-Williams formula (4), the first condition is equivalent to:

$$ \delta(G_{l}, G_{r}) \geq \frac{g_{l\bar{r}} \, \delta(G_{l}, G_{\bar{r}}) + g_{r\bar{r}} \, \delta(G_{r}, G_{\bar{r}})}{g_{l\bar{r}} + g_{r\bar{r}}} $$

while the second one is equivalent to:

$$ \delta(G_{l}, G_{r}) \geq \frac{g_{\bar{l}l} \, \delta(G_{\bar{l}}, G_{l}) + g_{\bar{l}r} \, \delta(G_{\bar{l}}, G_{r})}{g_{\bar{l}l} + g_{\bar{l}r}} $$

hence the result. □
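
As a numerical sanity check of the equivalence used in this proof, the sketch below compares the direct crossover test with the weighted-average form. It assumes the standard Lance-Williams update for Ward's linkage and that the weights satisfy g_{uv} = g_u + g_v, with g_u the size (or weight) of cluster G_u. Neither formula (4) nor the definition of g appears in this excerpt, so both are assumptions, and the function names are hypothetical.

```python
def ward_update(d_uw, d_vw, d_uv, g_u, g_v, g_w):
    """Assumed Lance-Williams update for Ward: delta(G_u U G_v, G_w)."""
    total = g_u + g_v + g_w
    return ((g_u + g_w) * d_uw + (g_v + g_w) * d_vw - g_w * d_uv) / total


def crossover_direct(d_lr, d_l_rbar, d_r_rbar, g_l, g_r, g_rbar):
    """Direct test: delta(G_l, G_r) >= delta(G_l U G_r, G_rbar)."""
    return d_lr >= ward_update(d_l_rbar, d_r_rbar, d_lr, g_l, g_r, g_rbar)


def crossover_weighted(d_lr, d_l_rbar, d_r_rbar, g_l, g_r, g_rbar):
    """Weighted-average form, assuming g_uv = g_u + g_v."""
    g_l_rbar = g_l + g_rbar
    g_r_rbar = g_r + g_rbar
    bound = (g_l_rbar * d_l_rbar + g_r_rbar * d_r_rbar) / (g_l_rbar + g_r_rbar)
    return d_lr >= bound


# Hypothetical linkage values for three singleton clusters (all weights 1).
with_crossover = (40.5, 0.5, 50.0, 1, 1, 1)
without_crossover = (0.5, 50.0, 60.5, 1, 1, 1)
for case in (with_crossover, without_crossover):
    print(crossover_direct(*case), crossover_weighted(*case))
```

Under these assumptions the two tests agree because moving the term in δ(G_l, G_r) from the right-hand side of the update to the left turns the denominator g_l + g_r + g_r̄ into g_l + g_r + 2 g_r̄, which equals g_{l r̄} + g_{r r̄}.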

Appendix 2. Step-by-step Description of the Counter-Examples

In the following tables, bold values signal reversals. Italic values in Table 3 highlight the value of the objective function (ESS_t) for the clustering with 3 clusters.

Table 2 Details of Fig. 1

Table 3 Details of Fig. 2

Table 4 Details of Fig. 3

Table 5 Details of Fig. 4

Table 6 Details of Fig. 11

Appendix 3. Counter-Example of the Monotonicity of $\bar {I}_{t}$ for Standard HAC in the Euclidean Case

About this article

Cite this article

Randriamihamison, N., Vialaneix, N. & Neuvial, P. Applicability and Interpretability of Ward’s Hierarchical Agglomerative Clustering With or Without Contiguity Constraints. J Classif 38, 363–389 (2021). https://doi.org/10.1007/s00357-020-09377-y

Published: 30 September 2020
Issue Date: July 2021
DOI: https://doi.org/10.1007/s00357-020-09377-y
