Abstract
Making machines anticipate human action is a complex research problem. Recent studies in computer vision and assistive driving have reported that anticipating a driver’s action a few seconds in advance is challenging. These studies are based on tracking the driver’s head movement, eye gaze, and spatiotemporal interest points. This study addresses an important question: how to anticipate a driver’s action while driving and improve the anticipation time. Its goal is to review the existing deep learning frameworks for assistive driving. This paper differs from the existing solutions in two ways. First, it proposes a simplified framework that uses only video of the vehicle’s interior and develops a driver’s movement tracking (DMT) algorithm; the majority of the existing state of the art relies on both inside and outside features of the vehicle. Second, the proposed work improves image pattern recognition by fusing spatiotemporal interest points (STIPs) for movement tracking with eye cuboids, and then anticipates the action using deep learning. The proposed DMT algorithm tracks the driver’s movement using STIPs from the input video, while a fast eye gaze algorithm tracks eye movements. The features extracted from STIPs and eye gaze are fused and analyzed by a deep recurrent neural network to improve the prediction time, thereby giving a few extra seconds to anticipate the driver’s correct action. The performance of the DMT algorithm was compared with previous algorithms; DMT offers a 30% improvement in anticipating the driver’s action over two recently proposed deep learning algorithms.
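The pipeline described above (per-frame STIP movement features and eye-gaze features, fused and fed to a recurrent network that outputs an anticipated maneuver) can be sketched as follows. This is a minimal illustration, not the paper’s implementation: the feature dimensions, the random placeholder features, and the single Elman-style recurrent layer standing in for the deep recurrent network are all assumptions made for the sake of a self-contained example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the paper):
# 30 frames, 8 STIP movement features and 4 eye-gaze features per
# frame, a 16-unit hidden state, and 5 candidate driver actions.
T, D_STIP, D_GAZE, H, N_ACTIONS = 30, 8, 4, 16, 5

# Placeholder per-frame features; in the real system these would come
# from the DMT (STIP-based) tracker and the fast eye gaze algorithm.
stip_feats = rng.normal(size=(T, D_STIP))
gaze_feats = rng.normal(size=(T, D_GAZE))

# Feature-level fusion: concatenate the two streams frame by frame.
fused = np.concatenate([stip_feats, gaze_feats], axis=1)  # shape (T, 12)

# A single Elman-style recurrent layer standing in for the deep RNN.
Wxh = rng.normal(scale=0.1, size=(D_STIP + D_GAZE, H))
Whh = rng.normal(scale=0.1, size=(H, H))
Why = rng.normal(scale=0.1, size=(H, N_ACTIONS))

h = np.zeros(H)
for x in fused:                      # unroll over the video frames
    h = np.tanh(x @ Wxh + h @ Whh)

# Softmax over the final hidden state gives a probability per action;
# emitting this before the maneuver completes is what buys the extra
# anticipation time.
logits = h @ Why
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs)
```

In practice the recurrent layer would be a trained deep network (e.g. stacked LSTMs) and the probabilities would be re-emitted every frame, so that a confident prediction can be raised as early as possible in the maneuver.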
Gite, S., Agrawal, H. & Kotecha, K. Early anticipation of driver’s maneuver in semiautonomous vehicles using deep learning. Prog Artif Intell 8, 293–305 (2019). https://doi.org/10.1007/s13748-019-00177-z