Abstract
Contrastive representation learning is the state of the art in computer vision, but it requires huge mini-batch sizes, special network designs, or memory banks, making it unappealing for 3D medical imaging. Meanwhile, reconstruction-based self-supervised learning has reached a new height in performance for 3D medical imaging, yet it lacks a mechanism for learning contrastive representations. This paper therefore proposes Parts2Whole, a new framework for self-supervised contrastive learning via reconstruction, so named because it exploits the universal and intrinsic part-whole relationship to learn contrastive representations without a contrastive loss: reconstructing an image (whole) from its own parts compels the model to learn similar latent features for all parts of the same image, while reconstructing different images (wholes) from their respective parts simultaneously pushes parts belonging to different wholes farther apart in the latent space; the trained model is thereby capable of distinguishing images. We have evaluated Parts2Whole on five distinct imaging tasks covering both classification and segmentation, and compared it with four competing publicly available 3D pretrained models. Parts2Whole significantly outperforms them on two of the five tasks and achieves competitive performance on the remaining three. This superior performance is attributable to the contrastive representations learned by Parts2Whole. Code and pretrained models are available at github.com/JLiangLab/Parts2Whole.
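To make the mechanism above concrete, here is a minimal sketch of one Parts2Whole training step. The generic 3D encoder-decoder, the part size, the resizing step, and the mean-squared-error reconstruction loss are all illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn.functional as F

def random_part(volume, part_size=(32, 64, 64)):
    """Crop a random sub-volume ("part") from one 3D image ("whole").

    volume: (C, D, H, W) tensor; part_size is an illustrative choice.
    """
    _, d, h, w = volume.shape
    pd, ph, pw = part_size
    z = torch.randint(0, d - pd + 1, (1,)).item()
    y = torch.randint(0, h - ph + 1, (1,)).item()
    x = torch.randint(0, w - pw + 1, (1,)).item()
    return volume[:, z:z + pd, y:y + ph, x:x + pw]

def parts2whole_step(encoder, decoder, wholes, optimizer):
    """One training step: reconstruct each whole from one random part.

    encoder/decoder are hypothetical 3D conv modules mapping a resized
    part to a full-size reconstruction (stand-ins for the paper's
    3D U-Net); wholes: (B, C, D, H, W).
    """
    parts = torch.stack([random_part(v) for v in wholes])
    # Resize parts to the whole's spatial size so one fixed network
    # handles all crops (an assumption about the input pipeline).
    parts = F.interpolate(parts, size=wholes.shape[2:],
                          mode="trilinear", align_corners=False)
    recon = decoder(encoder(parts))
    loss = F.mse_loss(recon, wholes)  # reconstruction drives learning
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because every part of one image must reconstruct the identical whole while parts of other images reconstruct different targets, similarity and separation in the latent space emerge without any explicit contrastive loss term.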
Notes
- 1.
If we consider each whole image itself as a “label”, the training process of Parts2Whole is equivalent to predicting the correct “label” given a part of one image as input, or discriminating each image from its parts (see the first code sketch after these notes).
- 2.
3D U-Net: github.com/ellisdg/3DUnetCNN.
- 3.
Denote the \(l_2\)-normalized features of a positive pair and a negative pair as \(\{\mathcal{F}_E(p_i), \mathcal{F}_E(p'_i)\}\) and \(\{\mathcal{F}_E(p_i), \mathcal{F}_E(p_j)\}\), respectively. The contrastive loss is calculated as \(-\log \frac{\exp(\mathcal{F}_E(p_i) \cdot \mathcal{F}_E(p'_i) / \tau)}{\exp(\mathcal{F}_E(p_i) \cdot \mathcal{F}_E(p'_i) / \tau) + \sum_{j=1}^{5000} \exp(\mathcal{F}_E(p_i) \cdot \mathcal{F}_E(p_j) / \tau)}\), where \(\tau = 0.7\) (a code sketch of this loss follows these notes).
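As a minimal illustration of Note 1, the sketch below casts Parts2Whole as instance discrimination: each whole image acts as its own class, and the model predicts, from a part, which image it came from. The encoder, classifier, and tensor shapes are assumptions for illustration, not the authors' implementation.

```python
import torch.nn.functional as F

def instance_discrimination_view(encoder, classifier, parts, image_ids):
    """Note 1 as code: each whole image serves as its own "label".

    encoder and classifier are hypothetical modules; given parts of
    shape (B, C, D, H, W), the model predicts the source whole's index.
    """
    logits = classifier(encoder(parts))        # (B, num_whole_images)
    return F.cross_entropy(logits, image_ids)  # image_ids: (B,) int64
```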
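The following sketch computes the contrastive loss of Note 3 on \(l_2\)-normalized features with \(\tau = 0.7\) and 5000 negatives; variable names and shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def note3_contrastive_loss(f_i, f_pos, f_negs, tau=0.7):
    """Contrastive loss of Note 3 over l2-normalized features.

    f_i, f_pos: (D,) features of a positive pair; f_negs: (5000, D)
    features of the negatives. Names and shapes are illustrative.
    """
    f_i = F.normalize(f_i, dim=0)
    f_pos = F.normalize(f_pos, dim=0)
    f_negs = F.normalize(f_negs, dim=1)
    pos = torch.exp(torch.dot(f_i, f_pos) / tau)
    negs = torch.exp(f_negs @ f_i / tau).sum()
    return -torch.log(pos / (pos + negs))
```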
References
Ardila, D., et al.: End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat. Med. 25(6), 954–961 (2019)
Armato III, S.G., McLennan, G., Bidaut, L., et al.: The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans. Med. Phys. 38(2), 915–931 (2011)
Bachman, P., Hjelm, R.D., Buchwalter, W.: Learning representations by maximizing mutual information across views. In: Advances in Neural Information Processing Systems, pp. 15509–15519 (2019)
Bakas, S., Reyes, M., Jakab, A., et al.: Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BraTS challenge. arXiv preprint arXiv:1811.02629 (2018)
Bilic, P., Christ, P.F., Vorontsov, E., Chlebus, G., Chen, H., Dou, Q., et al.: The liver tumor segmentation benchmark (LiTS). arXiv preprint arXiv:1901.04056 (2019)
Bishop, C.M., et al.: Neural Networks for Pattern Recognition. Oxford University Press, Oxford (1995)
Caron, M., Misra, I., Mairal, J., et al.: Unsupervised learning of visual features by contrasting cluster assignments. arXiv preprint arXiv:2006.09882 (2020)
Carreira, J., Zisserman, A.: Quo vadis, action recognition? A new model and the kinetics dataset. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6299–6308 (2017)
Chen, S., Ma, K., Zheng, Y.: Med3D: transfer learning for 3D medical image analysis. arXiv preprint arXiv:1904.00625 (2019)
Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. arXiv preprint arXiv:2002.05709 (2020)
Dosovitskiy, A., Springenberg, J.T., Riedmiller, M., Brox, T.: Discriminative unsupervised feature learning with convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 766–774 (2014)
Gibson, E., Li, W., Sudre, C., et al.: NiftyNet: a deep-learning platform for medical imaging. Comput. Methods Programs Biomed. 158, 113–122 (2018)
He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.: Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9729–9738 (2020)
Misra, I., van der Maaten, L.: Self-supervised learning of pretext-invariant representations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6707–6717 (2020)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Setio, A.A.A., Traverso, A., De Bel, T., et al.: Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the LUNA16 challenge. Med. Image Anal. 42, 1–13 (2017)
Tajbakhsh, N., Gotway, M.B., Liang, J.: Computer-aided pulmonary embolism detection using a novel vessel-aligned multi-planar image representation and convolutional neural networks. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9350, pp. 62–69. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24571-3_8
Tao, X., Li, Y., Zhou, W., Ma, K., Zheng, Y.: Revisiting Rubik’s cube: self-supervised learning with volume-wise transformation for 3D medical image segmentation. arXiv preprint arXiv:2007.08826 (2020)
Wu, Z., Xiong, Y., Yu, S.X., Lin, D.: Unsupervised feature learning via non-parametric instance discrimination. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3733–3742 (2018)
Zhou, Z., et al.: Models Genesis: generic autodidactic models for 3D medical image analysis. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11767, pp. 384–393. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32251-9_42
Acknowledgments
This research has been supported in part by ASU and Mayo Clinic through a Seed Grant and an Innovation Grant, and in part by the NIH under Award Number R01HL128785. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. This work has utilized GPUs provided in part by ASU Research Computing and in part by the Extreme Science and Engineering Discovery Environment (XSEDE), funded by the National Science Foundation (NSF) under grant number ACI-1548562. We would like to thank Jiaxuan Pang, Md Mahfuzur Rahman Siddiquee, and Zuwei Guo for evaluating I3D, NiftyNet, and MedicalNet, respectively. The content of this paper is covered by patents pending.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Feng, R., Zhou, Z., Gotway, M.B., Liang, J. (2020). Parts2Whole: Self-supervised Contrastive Learning via Reconstruction. In: Albarqouni, S., et al. (eds.) Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning. DART 2020, DCL 2020. Lecture Notes in Computer Science, vol. 12444. Springer, Cham. https://doi.org/10.1007/978-3-030-60548-3_9
DOI: https://doi.org/10.1007/978-3-030-60548-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60547-6
Online ISBN: 978-3-030-60548-3
eBook Packages: Computer Science, Computer Science (R0)