Abstract
When a picking robot picks green peaches, detection is hindered by fruit color that closely matches the background, overlapping fruit, small fruit size, uneven illumination, and occlusion by branches and leaves. As a result, the robot cannot detect green peaches quickly. To address these problems, a lightweight object detection network for fast green peach detection is proposed, composed of a backbone network, a feature enhancement network, a Lightweight Self-Attention (LSA) network, and a four-scale prediction network. First, LeanNet, the lightweight detection unit of the backbone network, is designed; it uses depthwise separable convolution to achieve fast detection. Second, the feature enhancement module (P-Enhance) is designed; it applies convolution kernels with different receptive fields to extract complementary perceptual information from the feature map, strengthening the network's ability to extract green peach features. Then, the LSA module is designed to generate a local saliency map based on green peach features, which effectively suppresses irrelevant branch and leaf background regions. Finally, a four-scale prediction network is designed, in which the Four-Scale Pyramid Fusion (FSPF) module generates a four-scale feature pyramid that captures the color and shape of green peaches at different network depths and aids the detection of small fruit. Experimental results show that the precision, recall, and F1 score of our method on the green peach test set reach 97.3%, 99.7%, and 98.5%, respectively. In actual picking scenes, Qualcomm Snapdragon 865 embedded devices running different state-of-the-art methods were compared. Across various scenarios, our method shows a significant improvement over the state of the art in both quantitative results and visual quality, and it meets the real-time object detection needs of picking robots.
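The lightweighting idea the abstract names for LeanNet is depthwise separable convolution: a standard k × k convolution is factored into a per-channel depthwise k × k convolution followed by a 1 × 1 pointwise convolution that mixes channels. A minimal parameter-count sketch makes the saving concrete; the layer sizes below are illustrative assumptions, not the actual LeanNet configuration from the paper:

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Parameters of a depthwise k x k conv plus a 1x1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise

# Illustrative 3x3 layer mapping 128 -> 256 channels.
standard = conv_params(128, 256, 3)                  # 294,912 parameters
separable = depthwise_separable_params(128, 256, 3)  # 1,152 + 32,768 = 33,920
ratio = separable / standard                         # ~= 1/c_out + 1/k^2 ~= 0.115
print(standard, separable, round(ratio, 3))
```

The reduction factor is roughly 1/c_out + 1/k², so for a 3 × 3 kernel the separable form needs about a ninth of the parameters (and multiply-adds), which is what enables fast inference on embedded hardware such as the Snapdragon 865 used in the experiments.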
Acknowledgements
The authors are grateful for collaborative funding support from the Natural Science Foundation of Shandong Province, China (ZR2018MEE008).
Cite this article
Cui, Z., Sun, HM., Yu, JT. et al. Fast detection method of green peach for application of picking robot. Appl Intell 52, 1718–1739 (2022). https://doi.org/10.1007/s10489-021-02456-6