DCT-DWT-FFT Based Method for Text Detection in Underwater Images

  • Conference paper
Pattern Recognition (ACPR 2021)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13189))

Abstract

Text detection in underwater images is an open challenge because of the distortions caused by refraction, absorption of light, and suspended particles, and because of variations in the depth, color, and nature of the water. Unlike existing methods, which target text detection in natural scene images, we propose a novel method for text detection in underwater images through a new enhancement model. Based on the observation that fine details of text in an image are associated with high energy, spatial resolution, and brightness, we use the Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT), and Fast Fourier Transform (FFT) for image enhancement to highlight the text features. The enhanced image is fed to a modified Character Region Awareness for Text Detection (CRAFT) model to detect text in underwater images. To explore enhancement methods, we evaluate six combinations of these image enhancement techniques, namely DCT-DWT-FFT, DCT-FFT-DWT, DWT-DCT-FFT, DWT-FFT-DCT, FFT-DCT-DWT, and FFT-DWT-DCT. Experimental results on our dataset of underwater images and on benchmark datasets for natural scene text detection, namely MSRA-TD500, ICDAR 2019 MLT, ICDAR 2019 ArT, Total-Text, CTW1500, and COCO Text, show that the proposed method performs well on both underwater and natural scene images and outperforms the existing methods on all the datasets.
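The six combinations named in the abstract are exactly the orderings of the three transforms. The following minimal Python sketch enumerates them; the names `TRANSFORMS` and `enhancement_orderings` are hypothetical and introduced only for illustration. The abstract does not say how the stages are chained or fused, so the sketch lists the orderings and makes no attempt to implement the enhancement itself.

```python
from itertools import permutations

# The three transforms named in the abstract (labels only, no image processing).
TRANSFORMS = ("DCT", "DWT", "FFT")


def enhancement_orderings(transforms=TRANSFORMS):
    """Return every ordering of the transforms as a hyphen-joined label.

    Hypothetical helper: it only names the candidate pipelines; how each
    stage is applied to an image is not specified in the abstract.
    """
    return ["-".join(order) for order in permutations(transforms)]


if __name__ == "__main__":
    for label in enhancement_orderings():
        print(label)
```

Three transforms give 3! = 6 orderings, which is why the evaluation covers six enhancement variants.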

Acknowledgements

The work of Cheng-Lin Liu was supported by the National Key Research and Development Program under Grant No. 2018AAA0100400 and the National Natural Science Foundation of China under Grant No. 61721004. This work was also partially supported by TIH, ISI, Kolkata.

Author information

Corresponding author

Correspondence to Palaiahnakote Shivakumara.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Banerjee, A., Shivakumara, P., Pal, S., Pal, U., Liu, CL. (2022). DCT-DWT-FFT Based Method for Text Detection in Underwater Images. In: Wallraven, C., Liu, Q., Nagahara, H. (eds) Pattern Recognition. ACPR 2021. Lecture Notes in Computer Science, vol 13189. Springer, Cham. https://doi.org/10.1007/978-3-031-02444-3_16

  • DOI: https://doi.org/10.1007/978-3-031-02444-3_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-02443-6

  • Online ISBN: 978-3-031-02444-3

  • eBook Packages: Computer Science, Computer Science (R0)
