Abstract
For safety- and mission-critical systems that rely on Convolutional Neural Networks (CNNs), it is crucial to avoid incorrect predictions that can cause accidents or financial losses. This can be achieved by quantifying and interpreting the predictive uncertainty. Current methods for uncertainty quantification rely on Bayesian CNNs that approximate Bayesian inference via dropout sampling. This paper investigates different dropout methods to robustly quantify the predictive uncertainty for misclassification detection. Specifically, the following questions are addressed: In which layers should activations be sampled? Which dropout sampling mask should be used? What dropout probability should be used? How should the number of ensemble members be chosen? How should the ensemble members be combined? How should the classification uncertainty be quantified? To answer these questions, experiments were conducted on three datasets using three different network architectures. The results showed that classification uncertainty is best captured by averaging the predictions of all stochastic CNNs sampled from the Bayesian CNN and by validating the predictions of the Bayesian CNN with three uncertainty measures, namely thresholds on the predictive confidence, the predictive entropy, and the standard deviation. The results further showed that the optimal dropout method, specified by the sampling location, sampling mask, inference dropout probability, and number of stochastic forward passes, depends on both the dataset and the network architecture. Notwithstanding this, I propose to sample the inputs to max-pooling layers with a cascade of a Multiplicative Gaussian Mask (MGM) followed by a Multiplicative Bernoulli Spatial Mask (MBSM), which robustly quantifies the classification uncertainty while keeping the loss in performance low.
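The MGM+MBSM sampling scheme and the three uncertainty measures can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the toy feature maps, the linear classifier head, the global max pooling, and the Gaussian noise variance p/(1-p) (the value used in standard Gaussian dropout) are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mgm_mbsm(x, p=0.1):
    """Apply MGM then MBSM to feature maps x of shape (channels, H, W)."""
    # MGM: element-wise multiplicative Gaussian noise, mean 1, variance p/(1-p)
    gauss = rng.normal(1.0, np.sqrt(p / (1.0 - p)), size=x.shape)
    # MBSM: drop entire spatial positions, shared across channels,
    # rescaled by 1/(1-p) so the expected activation is unchanged
    spatial = rng.binomial(1, 1.0 - p, size=(1,) + x.shape[1:]) / (1.0 - p)
    return x * gauss * spatial

def mc_predict(features, weights, T=30, p=0.1):
    """Run T stochastic forward passes; the masks are sampled before pooling."""
    probs = []
    for _ in range(T):
        noisy = mgm_mbsm(features, p)            # sample inputs to max pooling
        pooled = noisy.max(axis=(1, 2))          # toy global max pooling
        probs.append(softmax(weights @ pooled))  # toy linear classifier head
    return np.array(probs)                       # shape (T, num_classes)

def uncertainty(probs):
    """Average the T stochastic predictions and derive the three measures."""
    mean = probs.mean(axis=0)
    confidence = mean.max()                         # predictive confidence
    entropy = -(mean * np.log(mean + 1e-12)).sum()  # predictive entropy
    std = probs.std(axis=0)[mean.argmax()]          # std of the winning class
    return mean.argmax(), confidence, entropy, std

# Toy example: 4 feature maps of size 8x8, 3 classes
features = rng.normal(size=(4, 8, 8))
weights = rng.normal(size=(3, 4))
probs = mc_predict(features, weights)
label, conf, ent, std = uncertainty(probs)
```

In a real pipeline, a prediction would be accepted only if all three measures pass their thresholds (high confidence, low entropy, low standard deviation); the thresholds themselves are tuned per dataset and architecture.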
© 2020 Springer Nature Switzerland AG
Cite this paper
Njieutcheu Tassi, C.R. (2020). Bayesian Convolutional Neural Network: Robustly Quantify Uncertainty for Misclassifications Detection. In: Djeddi, C., Jamil, A., Siddiqi, I. (eds) Pattern Recognition and Artificial Intelligence. MedPRAI 2019. Communications in Computer and Information Science, vol 1144. Springer, Cham. https://doi.org/10.1007/978-3-030-37548-5_10
Print ISBN: 978-3-030-37547-8
Online ISBN: 978-3-030-37548-5