Accurate detection and depth estimation of table grapes and peduncles for robot harvesting, combining monocular depth estimation and CNN methods

Precision agriculture is a growing field in the agricultural industry and it holds great potential in fruit and vegetable harvesting. In this work, we present a robust accurate method for the detection and localization of the peduncle of table grapes, with direct implementation in automatic grape ha...

Descripción completa

Detalles Bibliográficos
Autores: Coll-Ribes, Gabriel, Torres-Rodríguez, Iván J., Grau Saldes, Antoni, Guerra, Edmundo, Sanfeliu, Alberto
Tipo de recurso: artículo
Estado:Versión publicada
Fecha de publicación:2023
País:España
Institución:Consejo Superior de Investigaciones Científicas (CSIC)
Repositorio:DIGITAL.CSIC. Repositorio Institucional del CSIC
OAI Identifier:oai:digital.csic.es:10261/351137
Acceso en línea:http://hdl.handle.net/10261/351137
https://api.elsevier.com/content/abstract/scopus_id/85175342374
Access Level:acceso abierto
Palabra clave:Grape bunch and peduncle depth estimation
Grape bunch and peduncle detection
Image segmentation
Monocular depth
Robot harvesting
Descripción
Sumario:Precision agriculture is a growing field in the agricultural industry and it holds great potential in fruit and vegetable harvesting. In this work, we present a robust accurate method for the detection and localization of the peduncle of table grapes, with direct implementation in automatic grape harvesting with robots. The bunch and peduncle detection methods presented in this work rely on a combination of instance segmentation and monocular depth estimation using Convolutional Neural Networks (CNN). Regarding depth estimation, we propose a combination of different depth techniques that allow precise localization of the peduncle using traditional stereo cameras, even with the particular complexity of grape peduncles. The methods proposed in this work have been tested on the WGISD (Embrapa Wine Grape Instance Segmentation) dataset, improving the results of state-of-the-art techniques. Furthermore, within the context of the EU project CANOPIES, the methods have also been tested on a dataset of 1,326 RGB-D images of table grapes, recorded at the Corsira Agricultural Cooperative Society (Aprilia, Italy), using a Realsense D435i camera located at the arm of a CANOPIES two-manipulator robot developed in the project. The detection results on the WGISD dataset show that the use of RGB-D information (mAP=0.949) leads to superior performance compared to the use of RGB data alone (mAP=0.891). This trend is also evident in the CANOPIES Grape Bunch and Peduncle dataset, where the mAP for RGB-D images (mAP=0.767) outperforms that of RGB data (mAP=0.725). Regarding depth estimation, our method achieves a mean squared error of 2.66 cm within a distance of 1 m in the CANOPIES dataset.