Graph Convolutional Nets for Tool Presence Detection in Surgical Videos

  • Conference paper
Information Processing in Medical Imaging (IPMI 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11492)

Abstract

Surgical tool presence detection is one of the key problems in automatic surgical video content analysis. Solving it benefits many applications, such as the evaluation of surgical instrument usage and automatic surgical report generation. Because each video is only sparsely labeled at the frame level, meaning that only a small portion of its frames carry labels, existing approaches model the task as an image (frame) classification problem and do not consider the temporal information in surgical videos. In this paper, we propose a deep neural network model that uses both spatial and temporal information from surgical videos for surgical tool presence detection. The proposed model applies Graph Convolutional Networks (GCNs) along the temporal dimension to learn better features by considering the relationship between consecutive video frames. To the best of our knowledge, this is the first work that takes videos as input to solve the surgical tool presence detection problem. Our experiments demonstrate that using temporal information offers a significant improvement on this problem, and that the proposed approach achieves better performance than all state-of-the-art methods.
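To make the idea of a graph convolution along the temporal dimension concrete, the following is a minimal sketch and not the authors' architecture. It assumes a chain graph in which each frame is linked to its immediate temporal neighbours, and it uses the widely known propagation rule ReLU(D^{-1/2} A D^{-1/2} H W), where A includes self-loops. The function names, the toy per-frame features, and the identity weight matrix are all hypothetical choices made for illustration; the abstract does not specify how the graph is built or how the layers are configured.

```python
import math


def temporal_chain_adjacency(num_frames, window=1):
    """Adjacency (with self-loops) linking frames at most `window` steps apart.

    Assumption for illustration: the paper's actual graph may differ.
    """
    return [[1.0 if abs(i - j) <= window else 0.0 for j in range(num_frames)]
            for i in range(num_frames)]


def normalize_adjacency(adj):
    """Symmetric normalisation D^{-1/2} A D^{-1/2}; A already has self-loops."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(a_hat, features, weights):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    mixed = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in mixed]


# Toy input: 4 frames with 2-dimensional per-frame features
# (hypothetical stand-ins for features from an image backbone).
frame_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]  # identity, so only temporal mixing is visible

a_hat = normalize_adjacency(temporal_chain_adjacency(len(frame_features)))
smoothed = gcn_layer(a_hat, frame_features, weights)
```

In this toy run the last frame starts with an all-zero feature vector and ends up with non-zero features borrowed from its neighbour, which is the kind of cross-frame information sharing that a per-frame classifier cannot provide.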

This work was partially supported by US National Science Foundation IIS-1718853 and the NSF CAREER grant IIS-1553687.

Notes

  1. M2CAI Surgical Tool Presence Detection Challenge 2016: http://camma.u-strasbg.fr/m2cai2016/.

Author information

Corresponding author

Correspondence to Junzhou Huang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, S., Xu, Z., Yan, C., Huang, J. (2019). Graph Convolutional Nets for Tool Presence Detection in Surgical Videos. In: Chung, A., Gee, J., Yushkevich, P., Bao, S. (eds) Information Processing in Medical Imaging. IPMI 2019. Lecture Notes in Computer Science, vol 11492. Springer, Cham. https://doi.org/10.1007/978-3-030-20351-1_36

  • DOI: https://doi.org/10.1007/978-3-030-20351-1_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20350-4

  • Online ISBN: 978-3-030-20351-1

  • eBook Packages: Computer Science, Computer Science (R0)
