
A Generalized Approach to Determine Confident Samples for Deep Neural Networks on Unseen Data

  • Conference paper
Uncertainty for Safe Utilization of Machine Learning in Medical Imaging and Clinical Image-Based Procedures (CLIP 2019, UNSURE 2019)

Abstract

Deep neural network (DNN) models are widely applied in biomedical image studies because they exploit massive data to outperform traditional machine learning models. However, like any other data-driven model, DNN models still face generalization limitations: for example, a model trained on clinical data from one hospital may not perform as well on data from another hospital. In this work, a novel approach is proposed to determine confident samples from unseen data, i.e. samples on which a DNN model will have improved performance. Confident samples are defined as inliers identified by an outlier detector that is fitted on the training data projected onto a standard feature space (e.g., the ImageNet feature space). The hypothesis behind the proposed method is that, in a standard feature space, a DNN model performs better on inlier samples and worse on outliers. When unseen data are projected onto the same feature space, points detected as inliers represent patterns the model has effectively already "seen" in the training dataset, so its performance on them is likely to remain consistent. To validate this hypothesis, experiments were conducted on publicly available digit image datasets and on chest X-ray images from three unseen datasets collected from hospitals in the U.S. and Canada. The experimental results showed consistently improved performance across various DNN models on all confident samples from the unseen datasets.
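The abstract specifies the shape of the pipeline but not its concrete components. The sketch below is a minimal illustration under stated assumptions: a pretrained VGG16 network stands in for the "standard" ImageNet feature space, and an Isolation Forest stands in for the outlier detector; neither choice, nor the names `select_confident_samples` and `contamination`, is taken from the paper.

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from sklearn.ensemble import IsolationForest

# Fixed, pretrained ImageNet backbone; global average pooling yields one
# 512-dimensional feature vector per image (assumed backbone, see above).
_backbone = VGG16(weights="imagenet", include_top=False, pooling="avg")

def _to_feature_space(images: np.ndarray) -> np.ndarray:
    """Project (N, 224, 224, 3) RGB images onto the ImageNet feature space."""
    return _backbone.predict(preprocess_input(images.astype("float32")), verbose=0)

def select_confident_samples(train_images: np.ndarray,
                             unseen_images: np.ndarray,
                             contamination: float = 0.1) -> np.ndarray:
    """Return a boolean mask over `unseen_images` marking the inliers.

    The detector is fitted on the *training* data's features, so unseen
    points it accepts as inliers resemble patterns the DNN has already seen.
    Isolation Forest is an illustrative choice of outlier detector.
    """
    detector = IsolationForest(contamination=contamination, random_state=0)
    detector.fit(_to_feature_space(train_images))
    # IsolationForest.predict returns +1 for inliers and -1 for outliers.
    return detector.predict(_to_feature_space(unseen_images)) == 1
```

A typical call would be `mask = select_confident_samples(x_train, x_unseen)` followed by `x_confident = x_unseen[mask]`; the trained DNN is then evaluated only on `x_confident`, where the hypothesis predicts performance close to that observed on the training distribution.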



Author information

Corresponding author

Correspondence to Min Zhang.


Electronic Supplementary Material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 672 kb)


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, M., Leung, K.H., Ma, Z., Wen, J., Avinash, G. (2019). A Generalized Approach to Determine Confident Samples for Deep Neural Networks on Unseen Data. In: Greenspan, H., et al. (eds.) Uncertainty for Safe Utilization of Machine Learning in Medical Imaging and Clinical Image-Based Procedures. CLIP 2019, UNSURE 2019. Lecture Notes in Computer Science, vol. 11840. Springer, Cham. https://doi.org/10.1007/978-3-030-32689-0_7


  • DOI: https://doi.org/10.1007/978-3-030-32689-0_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32688-3

  • Online ISBN: 978-3-030-32689-0

  • eBook Packages: Computer Science, Computer Science (R0)
