
A Generalized Approach to Determine Confident Samples for Deep Neural Networks on Unseen Data

  • Conference paper
Uncertainty for Safe Utilization of Machine Learning in Medical Imaging and Clinical Image-Based Procedures (CLIP 2019, UNSURE 2019)

Abstract

Deep neural network (DNN) models are widely applied in biomedical image studies because they exploit massive data to outperform traditional machine learning models. However, like any other data-driven model, DNN models still face generalization limitations: for example, a model trained on clinical data from one hospital may not perform as well on data from another hospital. In this work, a novel approach is proposed to determine confident samples from unseen data, i.e. samples on which a DNN model will have improved performance. Confident samples are defined as inliers identified by an outlier detector that is fitted on the training data projected onto a standard feature space (e.g., the ImageNet feature space). The hypothesis behind the proposed method is that, in a standard feature space, a DNN model performs better on inlier samples and worse on outliers. When unseen data are projected onto the same feature space, points detected as inliers represent patterns the model has effectively already "seen" in the training dataset, so its performance on them is likely to remain consistent. To validate this hypothesis, experiments were conducted on publicly available digit image datasets and on chest X-ray images from three unseen datasets collected from hospitals in the U.S. and Canada. The experimental results showed consistently improved performance across various DNN models on all confident samples from the unseen datasets.
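The abstract specifies the shape of the pipeline but not its concrete components. The sketch below is a minimal illustration under stated assumptions: a pretrained VGG16 network stands in for the "standard" ImageNet feature space, and an Isolation Forest stands in for the outlier detector; neither choice, nor the names `select_confident_samples` and `contamination`, is taken from the paper.

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from sklearn.ensemble import IsolationForest

# Fixed, pretrained ImageNet backbone; global average pooling yields one
# 512-dimensional feature vector per image (assumed backbone, see above).
_backbone = VGG16(weights="imagenet", include_top=False, pooling="avg")

def _to_feature_space(images: np.ndarray) -> np.ndarray:
    """Project (N, 224, 224, 3) RGB images onto the ImageNet feature space."""
    return _backbone.predict(preprocess_input(images.astype("float32")), verbose=0)

def select_confident_samples(train_images: np.ndarray,
                             unseen_images: np.ndarray,
                             contamination: float = 0.1) -> np.ndarray:
    """Return a boolean mask over `unseen_images` marking the inliers.

    The detector is fitted on the *training* data's features, so unseen
    points it accepts as inliers resemble patterns the DNN has already seen.
    Isolation Forest is an illustrative choice of outlier detector.
    """
    detector = IsolationForest(contamination=contamination, random_state=0)
    detector.fit(_to_feature_space(train_images))
    # IsolationForest.predict returns +1 for inliers and -1 for outliers.
    return detector.predict(_to_feature_space(unseen_images)) == 1
```

A typical call would be `mask = select_confident_samples(x_train, x_unseen)` followed by `x_confident = x_unseen[mask]`; the trained DNN is then evaluated only on `x_confident`, where the hypothesis predicts performance close to that observed on the training distribution.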



Author information

Corresponding author

Correspondence to Min Zhang.


Electronic Supplementary Material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 672 kb)


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, M., Leung, K.H., Ma, Z., Wen, J., Avinash, G. (2019). A Generalized Approach to Determine Confident Samples for Deep Neural Networks on Unseen Data. In: Greenspan, H., et al. (eds.) Uncertainty for Safe Utilization of Machine Learning in Medical Imaging and Clinical Image-Based Procedures. CLIP 2019, UNSURE 2019. Lecture Notes in Computer Science, vol. 11840. Springer, Cham. https://doi.org/10.1007/978-3-030-32689-0_7


  • DOI: https://doi.org/10.1007/978-3-030-32689-0_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32688-3

  • Online ISBN: 978-3-030-32689-0

  • eBook Packages: Computer Science, Computer Science (R0)
