Abstract
Recent surveys have shown that drowsy driving is one of the main factors in fatal motor vehicle crashes. In this paper, given only visual information about the driver, we propose a Multistage Spatial-Temporal Network (MSTN) to detect driver drowsiness efficiently and accurately. The proposed MSTN consists of a spatial CNN, a temporal LSTM, and a final temporal smoothing stage. First, the spatial CNN extracts drowsiness-related features from the face region detected in each video frame. Then, we model the temporal variation of the drowsiness status by feeding a sequence of frame-level features into a Long Short-Term Memory (LSTM) network. Finally, we apply temporal smoothing to the predicted drowsiness scores to suppress noisy predictions. We evaluate the proposed MSTN on the NTHU Drowsy Driver Detection Video Dataset and achieve 82.61% overall accuracy on the testing set.
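To illustrate the final stage, the following is a minimal sketch of temporal smoothing applied to per-frame drowsiness scores. The abstract does not specify the smoothing method, so the centered moving average, the window size, the 0.5 decision threshold, and the names `smooth_scores` and `to_labels` are all assumptions made for illustration, not the authors' implementation. The input scores are made-up values standing in for LSTM outputs.

```python
from typing import List, Sequence


def smooth_scores(scores: Sequence[float], window: int = 5) -> List[float]:
    """Centered moving average over per-frame drowsiness scores.

    Assumption: a moving average stands in for the paper's smoothing step.
    The window is truncated at the sequence boundaries, so the output has
    the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def to_labels(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map scores to binary labels (1 = drowsy); the threshold is hypothetical."""
    return [1 if s >= threshold else 0 for s in scores]


# Made-up per-frame scores with an isolated spike (index 2) and dip (index 7).
raw = [0.1, 0.2, 0.9, 0.1, 0.2, 0.8, 0.9, 0.2, 0.9, 0.8]
smoothed = smooth_scores(raw, window=3)

print(to_labels(raw))
print(to_labels(smoothed))
```

In this toy sequence, thresholding the raw scores flips the label on single frames, while the smoothed scores yield one contiguous drowsy segment. A larger window suppresses more noise at the cost of reacting more slowly to genuine state changes.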
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Shih, TH., Hsu, CT. (2017). MSTN: Multistage Spatial-Temporal Network for Driver Drowsiness Detection. In: Chen, CS., Lu, J., Ma, KK. (eds) Computer Vision – ACCV 2016 Workshops. ACCV 2016. Lecture Notes in Computer Science, vol 10118. Springer, Cham. https://doi.org/10.1007/978-3-319-54526-4_11
DOI: https://doi.org/10.1007/978-3-319-54526-4_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54525-7
Online ISBN: 978-3-319-54526-4
eBook Packages: Computer Science, Computer Science (R0)