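Abstract: China leads the world in fruit production, but a shrinking and aging labor force means that traditional manual picking can no longer meet the demand for fast, efficient harvesting. Developing automated fruit-picking equipment with integrated computer vision has become key to addressing the labor shortage. Most fruits are roughly spherical and recognition algorithms for them have been studied extensively, so this paper discusses recognition algorithms for globular fruits such as citrus and peach. According to the application scenario, it analyzes the differences and improvements in network structure between traditional globular-fruit recognition algorithms and those based on deep learning, summarizes fruit-picking recognition algorithms and outlines their future development trends. Traditional algorithms are effective in simple scenes but are often limited by their hand-designed features in complex environments, whereas deep learning-based algorithms, being efficient and accurate, are better suited to automated fruit picking. The review summarizes research progress on globular-fruit recognition algorithms: deep learning algorithms are effective and adaptable in complex environments and are better suited to deployment on automated picking equipment. Future research directions are also proposed, namely improving accuracy and adaptability by optimizing algorithm performance, constructing and augmenting datasets, and incorporating multimodal data.
Key words: fruit picking; object detection algorithm; deep learning; convolutional neural network; computer vision
CLC number: S66 Document code: A Article ID: 1009-9980(2025)02-0412-15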
Research progress in globular fruit picking recognition algorithm based on deep learning
LI Hui1, ZHANG Jun2*, YU Shuochen2, LI Zhixin2
(1School of Information Engineering, Huzhou University, Huzhou 313000, Zhejiang, China; 2Food Science Research Institute, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, Zhejiang, China)
Abstract: China is a global leader in fruit production, and fruit picking still relies mainly on manual labor. Manual pickers can select fruit by size and quality, which reduces losses, and can adopt different techniques and tools according to the characteristics and picking needs of each fruit crop. However, the labor force available for picking is shrinking and aging, and traditional manual picking can no longer meet the demand for fast, efficient harvesting. Developing automated fruit-picking equipment with integrated computer vision has therefore become key to addressing the labor shortage, and it can effectively improve picking efficiency and quality. Such equipment usually relies on object detection algorithms, which can be divided into traditional algorithms and deep learning-based algorithms. Traditional algorithms locate a specific object and its bounding box in an image or video, usually through image preprocessing (scaling, grayscale conversion or normalization), feature extraction (hand-designed features or features learned by machine learning), classification or regression (determining object class and location), and non-maximum suppression to further refine and filter the detections. When traditional fruit detection algorithms process images from complex environments, their limited expressive power and robustness make them sensitive to illumination, occlusion and other factors, and recognition accuracy declines. Furthermore, processing speed drops as feature complexity and computational load increase.
When the scene changes, fruit types are added or features are updated, the feature extractor must be redesigned and adjusted, and in some cases the entire system must be retrained. By comparison, deep learning-based fruit detection algorithms can extract and learn rich features from large amounts of data and are more accurate and robust when processing noisy data. When moving to a new environment or adding new categories, they can improve recognition ability and accuracy through transfer learning, data augmentation, model ensembling, feature fusion and multimodal data. Deep learning-based fruit detection algorithms fall into two categories: one-stage and two-stage object detection algorithms. A one-stage algorithm achieves end-to-end detection by using a single convolutional neural network to predict object location and category directly. It treats object detection as a regression problem, localizing and classifying objects in one step, which allows fast detection while maintaining high accuracy. During training and deployment, one-stage detectors can be reduced in size with pruning and quantization, which makes them suitable for mobile devices or embedded systems with limited resources. Two-stage algorithms, also called region-of-interest-based or region-proposal-based detectors, usually run in two stages: 1) a large number of candidate regions are generated by selective search, a region proposal network (RPN) or other methods; 2) a network containing classifiers and bounding box regressors identifies and precisely localizes the objects in the candidate regions.
Traditional algorithms are effective in simple scenarios but are often limited by their hand-designed features in complex environments, whereas deep learning-based algorithms are better suited to automated fruit picking because of their efficiency and accuracy. This paper summarizes the improvements and applications of traditional and deep learning-based detection algorithms for globular fruits, analyzes their advantages and disadvantages in different usage scenarios, and puts forward future development trends. Starting from model optimization and lightweight design, efficient network architectures or model compression can be adopted to reduce computational complexity and model size, increase processing speed and suit mobile automatic picking equipment. Data processing should be strengthened: preprocessing and synthesizing data improves generalization and adaptability in changing environments. Combining spectral, infrared, laser and other sensor data can improve recognition accuracy and robustness. Adaptive adjustment algorithms should be developed so that strategies and parameters respond to real-time feedback and suit different picking operations and environmental conditions. Among deep learning-based fruit-picking recognition algorithms, YOLO predicts the bounding boxes and class probabilities of objects in a single forward pass and achieves near real-time detection, which is important for orchard picking robots that must respond quickly.
The end-to-end design of YOLO simplifies training and inference, reduces complexity, and enables faster deployment in picking-robot systems. In the changeable environment of orchards and groves, YOLO can effectively distinguish fruit from background, which improves detection accuracy. Through continued research by scholars in China, YOLO has been iteratively optimized, and its ability to detect objects of different sizes and shapes has improved significantly, so it can cope with variation in fruit maturity, size and occlusion and performs better in complex environments.
Key words: Fruit picking; Object detection algorithms; Deep learning; Convolutional neural networks; Computer vision
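China ranks first in the world in fruit production [1], and common globular fruits such as citrus, apples and peaches are still picked mainly by hand. Manual pickers can select fruit by ripeness, size and quality to reduce losses, and can adopt different techniques and tools according to the characteristics of each fruit tree and the picking requirements. As the population ages and the labor force shrinks, manual labor alone can no longer complete picking efficiently, so developing automated fruit-picking equipment has become an important way to address the problem.
Automated fruit-picking equipment with integrated computer vision uses picking recognition algorithms to identify fruit in the orchard, assess fruit quality and localize the fruit precisely [2]. Compared with manual picking, such equipment offers fast recognition, high precision and low cost, and improves picking efficiency and quality [3]. Globular fruits appear as nearly circular contours in two-dimensional images, and this stable geometry helps algorithms reduce the false detections caused by complex shapes [4]. Because globular fruits have regular shapes and consistent color distributions, effective data augmentation is also easy to apply, which increases the diversity of the training data, improves generalization, and makes the equipment more versatile and stable across different orchard environments. Automated picking equipment operating in natural environments nevertheless faces many problems and requirements, including illumination changes, occlusion, fruit diversity, scene complexity, background interference and real-time constraints, all of which affect recognition accuracy. To overcome these challenges, the robustness and adaptability of object detection algorithms must be improved continuously; that is, with limited computing resources, the number of parameters should be reduced and the running speed increased so that the algorithms can be better applied in automated picking equipment [5].
In recent years, various algorithms and techniques have been proposed for object detection in globular-fruit picking. They fall into two broad categories: traditional object detection algorithms and deep learning-based object detection algorithms. The latter are further subdivided into one-stage [6] and two-stage [7] object detection algorithms. This paper introduces the object detection algorithms applied to globular-fruit picking and summarizes and analyzes the research results for each category.
1 Traditional object detection algorithms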
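When locating a specific object and its bounding box in an image or video, traditional object detection algorithms [8] usually proceed through the following steps: image preprocessing (e.g., scaling, grayscale conversion or normalization), feature extraction (hand-designed features or features learned by machine learning), classification or regression (determining the object class and location), and further refinement and filtering of the detections with methods such as non-maximum suppression. Liu et al. [9] segmented orchard images into superpixel blocks with the simple linear iterative clustering algorithm, extracted color features from the superpixels to determine candidate regions, and described fruit shape with histograms of oriented gradients in order to detect and localize fruit. Xia et al. [10] proposed a fruit recognition technique based on statistical features of the HSV color space: the RGB fruit image is converted to HSV, the hue distribution is approximated by a Laplace distribution and used as the fruit feature descriptor, the image is segmented with the mean shift algorithm [11], and the fruit class is determined by computing the Mahalanobis distance of the hue data of the input image and comparing it with preset feature Mahalanobis distances. Zou [12] acquired RGB images of citrus with an industrial camera, converted them from RGB to HSV, computed color histograms for the H, S and V channels, and judged citrus maturity from the peak of the H channel and the corresponding hue value; experiments showed a maturity detection accuracy above 90%. Chen et al. [13] proposed a fruit recognition algorithm based on multiple color and texture features, extracting color features from the RGB and HSV color spaces with color moments and non-uniform quantization, extracting texture features with local binary patterns, optimally combining the color and texture feature vectors, and training a BP neural network as the classifier; the results showed that combining multiple features raised classification accuracy above 90.0%, higher than single-feature algorithms. Xu et al. [14] designed a color-based algorithm for recognizing citrus on the tree: images were collected under various weather and lighting conditions, color was extracted, a citrus recognition color space was built from the difference between fruit and foliage in the R-B color index, and a dynamic threshold was used to segment the fruit from the background, which allowed single or multiple citrus fruits on the tree to be recognized. The traditional algorithms above rely mainly on single or combined features such as fruit shape and color. They model the background and fuse features to extract fruit information and segment the fruit, and thereby detect fruit in natural environments.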
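The color-index segmentation described above for citrus on the tree can be illustrated with a short sketch. It is only an illustration under stated assumptions: Otsu's method stands in for the dynamic threshold (the text does not say which thresholding rule was used), the image is a nested list of 8-bit RGB tuples, and the function names `otsu_threshold` and `segment_fruit` are hypothetical.

```python
def otsu_threshold(values):
    """Pick the threshold that maximizes between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    hist = [0] * (hi - lo + 1)
    for v in values:
        hist[v - lo] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = lo, -1.0
    w0 = 0      # number of samples at or below the candidate threshold
    sum0 = 0    # sum of (shifted) values at or below the candidate threshold
    for i, h in enumerate(hist):
        w0 += h
        sum0 += i * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, i + lo
    return best_t


def segment_fruit(image):
    """Return a binary mask: 1 where the R-B index exceeds the threshold."""
    # Assumption: orange fruit has a much larger R-B value than green foliage.
    values = [r - b for row in image for (r, g, b) in row]
    t = otsu_threshold(values)
    return [[1 if (r - b) > t else 0 for (r, g, b) in row] for row in image]


orange, leaf = (230, 140, 30), (40, 120, 50)
mask = segment_fruit([[orange, leaf], [leaf, orange]])
```

On real orchard images the mask would normally be cleaned up (for example with morphological filtering) before fruit regions are counted or localized.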
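Traditional object detection algorithms perform well in specific scenarios, but they depend on hand-designed features and adapt poorly to complex scenes and changing targets. In natural environments their expressive power and robustness are limited, and they are easily affected by illumination changes, occlusion by branches and leaves, and overlapping fruit, which lowers recognition accuracy. When the scene changes, fruit types are added or features are updated, the feature extractor must be redesigned and adjusted, and in some cases the entire system must be retrained. By contrast, deep learning-based object detection algorithms can extract and learn rich features from large amounts of data and are more accurate and robust. When the scene changes or fruit types are added, they can improve recognition ability and robustness through transfer learning, data augmentation, model ensembling, feature fusion and multimodal data.
2 Deep learning-based object detection algorithms
Deep learning-based object detection algorithms fall into two categories: one-stage and two-stage object detection algorithms. A one-stage algorithm uses a single convolutional neural network (CNN) to predict object location and category directly, achieving end-to-end detection. It treats object detection as a regression problem and localizes and classifies objects in one step, which allows fast detection while maintaining high accuracy. During training and deployment, one-stage detectors can be reduced in size with techniques such as pruning and quantization, which makes them suitable for mobile devices or embedded systems with limited resources. Two-stage algorithms, also called region-of-interest-based or region-proposal-based detectors, usually run in two stages: 1) a large number of candidate regions are generated by selective search, a region proposal network (RPN) [15] or other methods; 2) a network containing a classifier and a bounding box regressor identifies and precisely localizes the objects in the candidate regions.
2.1 One-stage object detection algorithms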
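One-stage object detection algorithms skip candidate-region generation and produce class probabilities and location coordinates directly on the feature map, followed by classification and regression. Commonly used one-stage approaches include YOLO and SSD [16], together with networks such as MobileNet [17], ShuffleNet [18] and Swin Transformer [19], which typically serve as backbones within such detectors; the YOLO series is the most widely applied. Redmon et al. [20] proposed YOLOv1 to address the slow detection speed and repeated feature-region extraction of two-stage algorithms; it casts object detection as a regression problem and predicts bounding boxes from global features. By dividing the image into a uniform grid, YOLOv1 avoids repeated computation and increases detection speed, which suits automated picking equipment with strict real-time requirements, but its accuracy is relatively low. YOLOv2, proposed by Redmon et al. [21], improved accuracy by introducing anchor boxes and joint training, but still produced false detections in complex scenes and for small objects. To strengthen multi-scale feature extraction, Redmon et al. [22] proposed YOLOv3, which adopts the deeper DarkNet-53 backbone and fuses features with a feature pyramid network (FPN) [23], markedly improving detection of objects at different scales. YOLOv4, proposed by Bochkovskiy et al. [24], introduced Mosaic data augmentation, the CSPDarkNet-53 backbone and the SPP module, improving accuracy under complex backgrounds and occlusion while remaining real-time. Building on YOLOv4, YOLOv5 [25] was released; with adaptive anchor boxes, the Focus module and a lightweight design, it is better suited to resource-constrained devices. YOLOv6, proposed by Li et al. [26], removed anchor boxes and adopted the EfficientRep and Rep-PAN modules; although detection accuracy improved, its complex structure is less suited to deployment on resource-constrained mobile picking equipment. Wang et al. [27] proposed YOLOv7, which optimizes feature extraction with BConv, E-ELAN and MPConv layers, does not rely on anchor boxes, and adapts better to different hardware. Ultralytics released YOLOv8 [28], which adopts the C2f module and a decoupled head, further improving detection speed and accuracy and adapting to a range of scenarios. Throughout its evolution, the core optimization of the YOLO series has centered on balancing speed and accuracy. YOLO targets real-time operation and, while maintaining high detection accuracy, meets the need for fast response in many application scenarios (Table 1 lists the improvements and goals of each YOLO version), which is exactly what research on automated fruit picking requires. YOLO has therefore become the most widely used object detection algorithm for fruit recognition. As the technology develops, many YOLO-based optimized models continue to appear, offering further effective ways to improve recognition accuracy and real-time performance.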
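To make "detection as regression on a uniform grid" concrete, the sketch below shows how a ground-truth box could be turned into a YOLOv1-style regression target. It is a simplified illustration rather than the authors' code: the grid size S = 7 follows the original YOLOv1 paper, the box format (x_min, y_min, x_max, y_max) in pixels is an assumption, and the function name `encode_yolo_v1` is hypothetical.

```python
def encode_yolo_v1(box, img_w, img_h, S=7):
    """Map a pixel-space box to (row, col, (x_off, y_off, w, h)) on an S x S grid."""
    x_min, y_min, x_max, y_max = box
    # Box center and size, normalized to [0, 1] by the image dimensions.
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    # The grid cell containing the center is responsible for this object.
    col = min(int(cx * S), S - 1)
    row = min(int(cy * S), S - 1)
    # Offsets of the center inside that cell, each in [0, 1).
    x_off = cx * S - col
    y_off = cy * S - row
    return row, col, (x_off, y_off, w, h)


# A 100 x 100 pixel fruit centered at (150, 250) in a 448 x 448 image.
row, col, target = encode_yolo_v1((100, 200, 200, 300), 448, 448)
```

Because every object is assigned to exactly one cell, two fruits whose centers fall in the same cell compete for it, which is one reason the early versions struggled with dense, small targets.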
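In complex natural environments, early-stage citrus fruit is similar in color to the surrounding branches and leaves, so traditional algorithms struggle to identify it accurately and often mistake green foliage for fruit or miss fruit altogether. To address such problems, Song et al. [29] proposed D-YOLOv3, which adopts densely connected convolutional networks (DenseNet) [30] to strengthen feature propagation and reuse features. When building the dataset, Song et al. collected citrus images under different weather conditions and applied Gaussian blur, color balancing and other processing, which increased dataset diversity and improved the generalization and robustness of the model; experiments showed that D-YOLOv3 reached a precision of 83.0% on early-stage citrus fruit. Lü et al. [31] proposed YOLO-GC, a detection algorithm for early-stage citrus fruit based on an improved YOLOv5s. To address low accuracy and large model size, the backbone was replaced with the lightweight GhostNet and a global attention mechanism (GAM) [32] was embedded to improve fruit feature extraction. To reduce missed detections caused by occlusion and overlap, YOLO-GC adopts the GIoU loss function together with the soft non-maximum suppression (Soft-NMS) [33] algorithm to optimize bounding box regression. Experiments showed that, compared with YOLOv5s, the weight file of YOLO-GC was 53.9% smaller at only 6.69 MB, and average precision rose by 1.2% to 97.8%. Inference on an edge device for green citrus detection took only 108 ms. Tie et al. [34] proposed YOLOv5-SC, a citrus recognition method that improves YOLOv5 with a hybrid attention mechanism. SE [35] and CA [36] attention are embedded in the backbone so that the network captures channel information as well as direction and position information, which helps the model extract and localize citrus image features. YOLOv5-SC introduces Varifocal Loss [37] as the loss function to better balance the losses of positive and negative samples. Experiments showed that YOLOv5-SC reached an average precision of 95.1% and reduced the misdetection of green background as green citrus fruit.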
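The GIoU measure used as a regression loss above can be written out directly from its definition: GIoU = IoU - (|C| - |A ∪ B|) / |C|, where C is the smallest box enclosing both A and B, and the loss is 1 - GIoU. The following is a minimal sketch for axis-aligned boxes in (x_min, y_min, x_max, y_max) form; the function names are illustrative and it assumes boxes with positive area.

```python
def giou(a, b):
    """Generalized IoU of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Area of the smallest box enclosing both inputs.
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (c_area - union) / c_area


def giou_loss(pred, target):
    """Loss in [0, 2): 0 for a perfect match, larger for worse predictions."""
    return 1.0 - giou(pred, target)


overlapping = giou((0, 0, 2, 2), (1, 1, 3, 3))   # 1/7 - 2/9
disjoint = giou((0, 0, 1, 1), (2, 0, 3, 1))      # 0 - 1/3
```

Unlike plain IoU, the value stays informative for non-overlapping boxes: the disjoint pair above scores below zero, so the loss still tells the regressor which direction reduces the gap.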
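In natural environments, globular fruits hang on the tree in a variety of poses. When recognizing long-distance, wide-field-of-view orchard scenes, leaf occlusion, small fruit targets and densely distributed fruit can all cause missed or false detections. To address this, Ma et al. [38] proposed a pear recognition algorithm based on an improved YOLOv4 in which max pooling in the SPP module is replaced by average pooling to retain more target information and reduce missed and false detections. In addition, the convolutions before and after the SPP module, some convolutions in PANet and the convolutions in the output section are replaced with depthwise separable convolutions, which reduces model size while largely preserving the effect of the convolutions. When the trained model was tested on newly acquired image samples, the weight file of the improved model was 136 MB and average precision reached 90.2%. Liu et al. [39] improved YOLOv5 and proposed a recognition algorithm for oranges. Some C3 modules in the backbone are replaced with RepVGG modules to strengthen feature extraction, and ordinary convolutions in the neck are replaced with ghost shuffle convolution (GConv) [40], which reduces the number of parameters while maintaining accuracy. To improve localization accuracy, efficient channel attention (ECA) [41] is added before the prediction head. Experiments showed that the improved algorithm reached an average precision of 90.1% on oranges and effectively reduced false and missed detections; the detection results are shown in Fig. 1. The algorithm detected well in unoccluded, complex-lighting, foliage-occluded and dense small-target scenes, showing strong robustness and generalization.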
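The saving from swapping an ordinary convolution for a depthwise separable one is easy to check by counting weights. A standard k × k convolution has k·k·C_in·C_out weights, while the depthwise (k·k·C_in) plus pointwise (C_in·C_out) pair has far fewer. The sketch below ignores biases, and the layer size is a made-up example, not one taken from the cited models.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


# Hypothetical layer: 3 x 3 kernel, 256 input channels, 512 output channels.
standard = conv_params(3, 256, 512)                    # 1179648
separable = depthwise_separable_params(3, 256, 512)    # 133376
ratio = separable / standard                           # equals 1/512 + 1/9
```

For this layer the separable version keeps roughly 11% of the weights, which is consistent with the general ratio 1/C_out + 1/k².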
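He et al. [42] designed an improved algorithm based on YOLOv5s that effectively raised the recognition accuracy for plums. The downsampling convolutions in the backbone are replaced with an FM module so that feature information of heavily occluded and small targets is not lost during downsampling, and a weighted combination of focal loss and cross-entropy is used as the classification loss to improve recognition of dense targets. In testing, average precision rose by 2.8% to 97.6%, and average precision for small targets reached 92.0%. For accurate recognition of citrus fruit, Huang et al. [43] proposed a method based on an improved YOLOv5 model. It introduces the convolutional block attention module (CBAM) [44] to improve feature extraction and reduce missed detections of occluded and small targets, and replaces the GIoU loss with the Alpha-IoU [45] loss for bounding box regression to improve localization accuracy. The model reached an average precision of 91.3%, with a detection time of 16.7 ms per citrus image. Yuan et al. [46] proposed YOLOv4-Tiny-Peach, a real-time peach recognition algorithm for orchard environments based on an improved YOLOv4-Tiny. It introduces CBAM into the backbone, adds a large-scale shallow feature layer to the neck to improve small-target accuracy, and fuses features of different scales with a bidirectional feature pyramid network (BiFPN) [47]. After training, YOLOv4-Tiny-Peach reached an average precision of 87.9%; compared with YOLOv4-Tiny, the improvement was most evident in wide-field-of-view scenes and for early-stage peaches. To improve the visual detection capability of all-weather automated picking equipment at night, Xiong et al. [48] proposed Des-YOLOv3, which draws on ResNet [49] and DenseNet to reuse and fuse multi-layer features and makes recognition of small, overlapping and occluded fruit more robust at night; the detection results are shown in Fig. 2, and experiments showed an average precision of 97.7%. Xiong et al. [50] later returned to night-time picking and proposed BI-YOLOv5s, a citrus recognition algorithm that combines an improved YOLOv5s with an active light source: BiFPN performs multi-scale cross-connection and weighted feature fusion, CA attention strengthens the extraction of localization information, and the C3TR module reduces computation and extracts global information. Experiments showed that, under the light-source color conditions used, the model reached a night-time citrus recognition accuracy of 95.3%, enabling all-weather automated picking. Yu et al. [51] replaced some ordinary convolutions in the YOLOv8 family with omni-dimensional dynamic convolution to improve robustness, and replaced the loss function with MPDIoU [52] to address the degeneration of the original CIoU loss. In experiments, the average precision of the improved YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l and YOLOv8x models rose to 88.3%, 89.3%, 89.6%, 89.9% and 90.1%, respectively. Yue et al. [53] designed Rep-YOLOv8, a new feature fusion network based on YOLOv8 that fuses high-level semantic and low-level spatial features. An EMA attention module is integrated into YOLOv8 to suppress general feature information such as background and foliage occlusion so that the model focuses on the fruit region. Finally, the C2f module is replaced with a three-branch DWR module to improve small-target detection through multi-scale feature fusion, and the Inner-SIoU [54] loss is used to improve accuracy. Apples in an orchard were used as the detection target in comparative experiments with different fruit counts and maturity levels. The algorithm reached an average precision of 94.0%, and in wide-field-of-view scenes with ripe fruit all metrics improved markedly, providing effective support for fruit recognition.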
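The idea of combining focal loss with cross-entropy as a classification loss can be sketched for a single binary prediction. This is a hedged illustration only: the focal loss follows its standard form FL = -α_t (1 - p_t)^γ log(p_t) with the commonly used defaults γ = 2 and α = 0.25, while the mixing weight `w` and the function `combined_loss` are hypothetical, since the weighting used in the cited work is not given here.

```python
import math

EPS = 1e-7  # clamp probabilities away from 0 and 1 to keep log() finite


def cross_entropy(p, y):
    """Binary cross-entropy for predicted probability p and label y in {0, 1}."""
    p = min(max(p, EPS), 1 - EPS)
    return -math.log(p if y == 1 else 1 - p)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: down-weights easy examples by (1 - p_t) ** gamma."""
    p = min(max(p, EPS), 1 - EPS)
    p_t = p if y == 1 else 1 - p
    alpha_t = alpha if y == 1 else 1 - alpha
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)


def combined_loss(p, y, w=0.5):
    """Hypothetical weighted sum of focal loss and cross-entropy."""
    return w * focal_loss(p, y) + (1 - w) * cross_entropy(p, y)


easy = focal_loss(0.9, 1)   # confident, correct prediction: very small loss
hard = focal_loss(0.1, 1)   # confident, wrong prediction: much larger loss
```

The gap between `easy` and `hard` is far wider than under plain cross-entropy, which is why the focal term helps when a few hard, densely packed fruits are mixed with many easy background samples.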
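Object detection algorithms usually have a large number of parameters and complex network structures, and when such models are deployed on embedded platforms the limited computing resources severely restrict real-time response. To address this, Lü et al. [55] proposed YOLO-LITE, a lightweight citrus recognition method based on an improved YOLOv3 that uses MobileNet-v2 as the backbone for easier deployment on mobile terminals and introduces the GIoU [56] bounding box regression loss. Experiments showed that YOLO-LITE detected citrus at up to 246 frames/s with a weight file of 28 MB. Wang et al. [57] proposed YOLOv4-CA, a lightweight real-time apple detection algorithm based on YOLOv4 that uses the lightweight MobileNet-v3 as the feature extraction network and integrates the SE attention module as a basic block of the neck to increase sensitivity to feature channels and strengthen feature extraction. To compress parameters and computation, Wang et al. replaced all ordinary convolutions in the feature fusion network with depthwise separable convolutions. The algorithm reached an average detection precision of 92.2%, ran at 15.11 frames/s on an embedded platform and occupied 54.1 MB of memory, meeting the real-time needs of picking robots while maintaining accuracy. Zeng et al. [58] proposed YOLO-Faster for fast real-time peach detection: starting from YOLOv5s, the backbone is replaced with FasterNet [59] and partial convolution (PConv) [60] is introduced to reduce redundant computation and memory access, making the model faster and lighter. Between the backbone and the neck, a convolutional attention module and a regular convolution module are added in series to strengthen feature fusion and extraction and improve detection accuracy. SIoU [61] is used as the loss function to better measure the match between predicted and ground-truth boxes and improve detection quality. After training on a self-built dataset and deployment on a Jetson Nano embedded device, the algorithm reached an average precision of 88.6% with an 8.3 MB weight file, 1% higher than YOLOv5s. Zhao et al. [62] proposed an apple recognition algorithm based on an improved YOLOv3 that combines the residual modules of DarkNet-53 with the CSP module [63] to reduce computation, adds an SPP module to fuse global and local features and raise small-target recall, and uses Soft-NMS to improve recognition of overlapping and occluded fruit. The improved algorithm reached an average precision of 96.3%, 3.8% higher than YOLOv3, meeting the accuracy and real-time requirements of automatic apple picking. However, accuracy may suffer when lighting is poor or the surface texture of the fruit is indistinct. Yan et al. [64] optimized YOLOv5s to improve its expressive power and its handling of spatial information loss, making it better suited to embedded devices. First, the convolution layer on the bridge branch of the BottleneckCSP module in the backbone is removed, and the input feature map of the module is concatenated directly along the depth dimension with the output feature map of the other branch, which reduces the number of parameters. Second, SE attention is embedded in the network so that a new feature recalibration strategy is learned automatically, improving expressive power. Finally, the output of the lower feature extraction layer with a larger receptive field is fused with the output of the feature extraction layer before the medium-size object detection layer, to compensate for the spatial information lost through the low resolution of high-level features; the detection results are shown in Fig. 3. Wang [65] aimed at accurate and efficient citrus recognition for picking and built LT-YOLOv7, a lightweight object detection model for picking robots, to address the high storage requirements of YOLOv7 and its poor fit for mobile terminals. RepVGG [66] is used as the backbone, and its multi-scale feature maps are concatenated with the YOLOv7 neck to retain global features and reduce overall computation. Depthwise separable convolution is introduced in the neck to reduce parameters, save memory and improve accuracy. In addition, ECA is introduced to strengthen feature representation, improve target discrimination and reduce interference from leaves and branches. At prediction time the model filters predicted boxes with the soft DIoU_NMS algorithm to improve recognition of overlapping objects. The optimized LT-YOLOv7 reached an average precision of 97.0% on overlapping and occluded citrus fruit; as shown in Fig. 4, it still detects well even when fruit is occluded.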
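Soft-NMS, used above to keep overlapping fruit that hard NMS would discard, decays the score of a neighboring box instead of deleting it. The sketch below implements the Gaussian variant in plain Python under a few assumptions: boxes are (x_min, y_min, x_max, y_max) tuples with positive area, `sigma` and `score_thresh` are illustrative defaults, and the function names are hypothetical rather than taken from any cited implementation.

```python
import math


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: returns (box, score) pairs in selection order."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the highest-scoring remaining box.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        box, score = candidates.pop(best)
        kept.append((box, score))
        # Decay the remaining scores according to their overlap with it.
        decayed = []
        for other, s in candidates:
            s *= math.exp(-(iou(box, other) ** 2) / sigma)
            if s >= score_thresh:
                decayed.append((other, s))
        candidates = decayed
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
result = soft_nms(boxes, [0.9, 0.8, 0.7])
```

In this example the second box overlaps the first heavily, so its score is reduced but it survives, whereas hard NMS with a typical IoU threshold of 0.5 would have removed it. That behavior is what helps when two fruits genuinely overlap in the image.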
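Yang et al. [67] proposed MobileOne-YOLOv7 to address high fruit density, overlap and network parameterization in apple detection. MobileOne-YOLOv7 uses multi-scale feature extraction to build a feature pyramid as model input. Multi-scale training improves robustness and avoids excessive computation during multi-scale feature extraction. The last ELAN module of the backbone is replaced with a MobileOne module to strengthen nonlinearity and representational capacity. The SPPCSPC module is changed to the SPPFCSPC module, turning serial channels into parallel channels, which speeds up feature fusion while keeping the receptive field unchanged. In addition, a prediction head is added to the neck to improve detection accuracy for objects at different scales. Re-parameterizable branches increase model capacity during training and simplify the structure at inference, lowering memory access cost. Zhang et al. [68] proposed a lightweight apple recognition algorithm based on an improved YOLOv7, replacing some ordinary convolutions in the multi-branch stacking module with PConv to reduce parameters and computation. ECA is added to address false and missed detections of occluded targets and keep accuracy balanced. During training, a learning rate optimization strategy based on the sparrow search algorithm (SSA) [69] was used, which significantly improved detection accuracy. Experiments showed an average precision of 97.0%, with parameters and computation reduced by 22.9% and 27.4%, respectively, making the model suitable for embedded devices.
2.2 Two-stage object detection algorithms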
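Girshick et al. [70] proposed R-CNN, which uses selective search to extract about 2000 bottom-up, category-independent region proposals. A large CNN computes a fixed-length feature for each region, and linear support vector machines (SVM) [71] classify these features to decide whether each region contains a given object class. SPP-Net, proposed by He et al. [72], improved on R-CNN by handling images of arbitrary size and aspect ratio. Through spatial pyramid pooling, SPP-Net extracts and integrates features at different scales and produces a fixed-length output. Unlike R-CNN, SPP-Net does not need to run the CNN on every candidate region: the whole image is passed through once to obtain a feature map, from which the features of the regions of interest are extracted directly, reducing redundant computation and increasing speed. Building on SPP-Net, Girshick [73] proposed Fast R-CNN, which takes the whole image and a set of object proposals as input and generates a feature map through several convolution and max pooling layers, so that a single convolutional pass removes the redundancy of repeated convolution. Fast R-CNN uses a region of interest (ROI) pooling layer to obtain a fixed-length feature vector from the feature map, processes it with fully connected layers, and finally branches into two output layers for classification and regression.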
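The reason pyramid pooling yields a fixed-length output for any input size can be seen in a small sketch. It max-pools a single-channel feature map over 1 × 1, 2 × 2 and 4 × 4 grids of bins and concatenates the results, so the output always has 1 + 4 + 16 = 21 values. The pyramid levels, the bin-boundary rule and the function name `spatial_pyramid_pool` are illustrative choices, not the exact configuration of SPP-Net; ROI pooling in Fast R-CNN can be viewed as the single-level case of the same idea.

```python
import math


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map into a fixed-length vector of sum(n * n) values."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range of bin i; bins may overlap slightly to cover every row.
            r0 = (i * height) // n
            r1 = max(r0 + 1, math.ceil((i + 1) * height / n))
            for j in range(n):
                c0 = (j * width) // n
                c1 = max(c0 + 1, math.ceil((j + 1) * width / n))
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


small = [[r * 7 + c for c in range(7)] for r in range(5)]   # 5 x 7 map
large = [[r * 8 + c for c in range(8)] for r in range(8)]   # 8 x 8 map
vec_small = spatial_pyramid_pool(small)
vec_large = spatial_pyramid_pool(large)
```

Both vectors have the same length even though the inputs differ in size, which is what allows fully connected layers to follow a convolutional feature map of arbitrary resolution.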
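Faster R-CNN, proposed by Ren et al. [74], abandoned the traditional selective search algorithm and introduced the RPN. The RPN slides a window over the feature map to generate anchor boxes of different sizes, labels them as positive or negative according to set thresholds, and outputs candidate bounding boxes with their probabilities. After ROI pooling, the candidate regions are mapped to fixed-size feature maps, and fully connected layers then classify the object and refine its location. Dai et al. [75] proposed the region-based R-FCN, built from a shared fully convolutional architecture. R-FCN outputs position-sensitive score maps that encode relative spatial position, and its ROI pooling layer extracts information from these score maps. Cai et al. [76] proposed Cascade R-CNN, which consists of a proposal sub-network and an ROI detection sub-network. Cascade R-CNN decomposes the regression task with cascaded bounding box regression, using a specialized regressor at each step, and uses the cascade as a resampling mechanism to address an initial hypothesis distribution that is heavily skewed toward low quality. Among these, Faster R-CNN was the first deep learning-based object detection algorithm to be end-to-end.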
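Anchor generation and positive/negative labeling in an RPN can be sketched as follows. The three scales and three aspect ratios (nine anchors per location) and the IoU thresholds of 0.7 and 0.3 are the values reported in the Faster R-CNN paper; the function names are hypothetical, and the sketch omits the paper's extra rule that the anchor with the highest IoU for each ground-truth box is also treated as positive.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Anchors (x_min, y_min, x_max, y_max) centered at (cx, cy).

    Each anchor has area scale ** 2 and height/width equal to ratio.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return 1 (positive), 0 (negative) or -1 (ignored during training)."""
    best = max((iou(anchor, g) for g in gt_boxes), default=0.0)
    if best >= pos_thresh:
        return 1
    if best < neg_thresh:
        return 0
    return -1


anchors = make_anchors(8, 8)   # nine anchors at one sliding-window position
```

Anchors that fall between the two thresholds are ignored, so the RPN is trained only on clear positives and clear negatives.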
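Building on two-stage object detection algorithms, researchers have optimized detection models and proposed efficient, accurate globular-fruit detection algorithms for automated picking. Ren et al. [77] used citrus images collected in orchards to compare traditional detection algorithms with Faster R-CNN; with enhanced preprocessing and unoccluded fruit, the traditional algorithms outperformed Faster R-CNN, but when fruit overlapped or was occluded, Faster R-CNN performed better. Wan et al. [78] proposed a multi-class fruit detection framework based on an improved Faster R-CNN. The backbone is VGG-16, comprising 13 convolution layers, 13 ReLU layers and 4 pooling layers. To avoid overfitting from a small sample and to balance model complexity against data volume, the algorithm applies weight decay to high-order parameters through regularization, adds two loss functions to optimize the convolution and pooling layer parameters, and adjusts automatically according to shooting angle to keep the size and kernel parameters of each convolution layer reasonable, improving detection accuracy. Liu et al. [79] proposed a fruit recognition and localization algorithm based on an improved R-FCN, consisting of an RPN and an FCN: the RPN generates candidate region boxes, the FCN extracts pixel-level features, and detection results are visualized through deconvolution. To address low accuracy on occluded and overlapping citrus fruit, Huang et al. [80] proposed a deep learning-based algorithm for segmenting overlapping citrus and restoring their shape. Mask R-CNN with a PointRend branch recognizes the citrus and produces instance segmentation with refined edges; a coarse shape-restoration model is built on U-Net with an encoder-decoder structure, using a local penalty loss and an intersection-over-union shape loss; ROIs are extracted from the coarse result with machine vision methods; and a fine shape-restoration model based on PConv completes the restoration. The method reached an average precision of 93.7% for citrus recognition and a segmentation precision of 96.3%. For apple yield estimation, Jing et al. [81] proposed a method for recognizing fruit on the side of apple trees with different feature networks. Side-view images of apple trees were collected in the orchard to test different feature extraction networks combined with Faster R-CNN. VGG-16 and ResNet-50 were each used as the feature extraction network, and the two Faster R-CNN models were trained. Although both used the same learning parameters, the Faster R-CNN with VGG-16 outperformed the ResNet-50 version on all metrics, with a recognition precision of 91.0% and an inference time of 1.4 s per image. Jia et al. [82] collected RGB images of different fruits in natural environments and added a likelihood function and a regularization function to Faster R-CNN to keep the size and kernel parameters of the convolution layers within a reasonable range; in recognition tests on different fruits, overall accuracy reached 99.7%, while accuracy for oranges was 77.3%. Lu et al. [83] collected images of green citrus fruit in orchards and used a deep-shallow feature fusion strategy to enrich the features extracted at each stage of the Mask R-CNN backbone; by introducing composite connection blocks between backbones, the number of channels was reduced and accuracy improved. The improved Mask R-CNN reached an average precision of 95.4% on green citrus, 1.4% higher than the original model. Min et al. [84] designed the multi-scale attention network (MSANet) to aggregate attention features from different levels of a CNN. MSANet introduces a hybrid attention mechanism that aggregates spatial-channel attention and multiple attention features from different layers into a final unified representation, making that representation more robust and comprehensive.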
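3 Summary and future development trends
3.1 Summary
This paper reviews detection algorithms that perform well in fruit picking, focusing on traditional object detection algorithms and deep learning-based object detection algorithms. Traditional algorithms for globular-fruit recognition rely on hand-designed features extracted by explicit rules, which makes every step of the algorithm highly interpretable. They need little data, requiring only a small amount of labeled data for tuning and optimization, and involve no complex deep neural network computation, so their computational complexity and resource requirements are relatively low. In natural environments, however, traditional fruit detection algorithms often fail to extract useful information accurately in complex scenes with overlapping fruit, changing illumination and occlusion by branches. Switching to a different fruit may require manual changes to the algorithm, so they generalize poorly.
By contrast, deep learning-based object detection algorithms use multi-level neural network architectures, and researchers can adjust modules within the network to strengthen feature representation, reduce the number of parameters and speed up image inference. The advantage of deep neural architectures is that, by learning from large-scale data, they extract complex, abstract, multi-level features on their own and, through hierarchical feature learning, progressively capture semantic information from low level to high level, which improves detection accuracy and robustness for complex targets and changing environments.
In globular-fruit recognition, researchers have introduced lightweight feature extraction networks (such as MobileNet and GhostNet) and various attention modules (such as SE, CA and CBAM) to lower computational cost and memory use, making the algorithms better suited to resource-limited embedded devices. These improvements also make the model more sensitive to key features: the network captures the key features of the target more precisely and suppresses general feature information, which raises detection accuracy, improves performance in complex environments and reduces the false and missed detections caused by complex backgrounds. Researchers have also applied a range of advanced loss functions (such as SIoU and Alpha-IoU) to balance the influence of positive and negative samples and improve bounding box regression accuracy.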
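3.2 Development trends
In recent years, object detection algorithms have been widely applied to globular-fruit recognition and have performed well in complex detection tasks such as occluded and overlapping fruit, yield prediction, fruit classification and grading, and surface defect detection. However, because orchard conditions are complex and variable, the general applicability of existing globular-fruit detection algorithms still needs to improve. Based on current trends, future research could focus on the following directions:
(1) Model optimization: Deep learning-based object detection algorithms need continual improvement to match fruit recognition requirements. Introducing attention mechanisms, changing the structure of the feature extraction network, optimizing the loss function and adjusting network depth and width can raise recognition accuracy, speed up recognition and lower the rates of missed and false detections. Large models from other fields are also worth examining to see whether well-performing models can be adapted to fruit recognition to further improve accuracy and efficiency.
(2) Dataset construction and augmentation: According to the needs of different picking tasks, images of globular fruits at every growth stage and of different varieties should be collected to build a dataset covering different weather, lighting conditions (front lighting, backlighting), degrees of fruit overlap and occlusion. The data can be augmented with image processing methods (such as rotation, flipping, cropping, scaling, noise addition and color transformation) or with image generation based on generative adversarial networks (GAN) [85]. Training on diverse datasets strengthens the generalization and robustness of the model [86].
(3) Multimodal data fusion: To further improve the accuracy and adaptability of globular-fruit recognition, future research could incorporate three-dimensional information from lidar and depth cameras [87] to capture fruit shape and position more completely. Multimodal data help make the model more robust, particularly when fruit is heavily occluded or lighting is very poor.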
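As a small illustration of the image-processing augmentations listed in direction (2), the sketch below applies a horizontal flip, a 90-degree rotation and additive Gaussian noise to a grayscale image stored as a nested list, and shows how a bounding box label must be flipped along with the image. It is a toy example under stated assumptions: the function names, the noise level `sigma` and the fixed `seed` are all illustrative, and a real pipeline would normally use an image library instead.

```python
import random


def hflip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def hflip_box(box, width):
    """Mirror an (x_min, y_min, x_max, y_max) box inside an image of given width."""
    x_min, y_min, x_max, y_max = box
    return (width - x_max, y_min, width - x_min, y_max)


def rot90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def add_gaussian_noise(image, sigma=5.0, seed=0):
    """Add zero-mean Gaussian noise and clip the result to the 8-bit range."""
    rng = random.Random(seed)
    return [[min(255, max(0, round(v + rng.gauss(0, sigma)))) for v in row]
            for row in image]


image = [[1, 2, 3],
         [4, 5, 6]]
flipped = hflip(image)                      # [[3, 2, 1], [6, 5, 4]]
rotated = rot90_clockwise(image)            # [[4, 1], [5, 2], [6, 3]]
box = hflip_box((0, 0, 1, 2), width=3)      # (2, 0, 3, 2)
noisy = add_gaussian_noise(image)
```

Geometric transforms change the labels as well as the pixels, so any augmentation applied to detection data has to update the bounding boxes in the same way.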
References:
[1] 劉袁,黃彪,陳昌銀,楊文達,張華東,楊濤.水果采摘機器人采摘裝置機研究現狀[J].農業(yè)科學,2021,11(2):129-132.
LIU Yuan,HUANG Biao,CHEN Changyin,YANG Wenda,ZHANG Huadong,YANG Tao. Research status of fruit picking robot picking device[J]. Journal of Agricultural Sciences,2021,11(2):129-132.
[2] 戴軍. 機器視覺技術在瓜菜檢測應用中的研究進展[J]. 中國瓜菜,2023,36(11):1-9.
DAI Jun. Research progress of machine vision technology in the detection of cucurbits and vegetables[J]. China Cucurbits and Vegetables,2023,36(11):1-9.
[3] 吳劍橋,范圣哲,貢亮,苑進,周強,劉成良. 果蔬采摘機器手系統(tǒng)設計與控制技術研究現狀和發(fā)展趨勢[J]. 智慧農業(yè)(中英文),2020,2(4):17-40.
WU Jianqiao,FAN Shengzhe,GONG Liang,YUAN Jin,ZHOU Qiang,LIU Chengliang. Research status and development direction of design and control technology of fruit and vegetable picking robot system[J]. Smart Agriculture,2020,2(4):17-40.
[4] 初廣麗,張偉,王延杰,丁南南,劉艷瀅. 基于機器視覺的水果采摘機器人目標識別方法[J]. 中國農機化學報,2018,39(2):83-88.
CHU Guangli,ZHANG Wei,WANG Yanjie,DING Nannan,LIU Yanying. A method of fruit picking robot target identification based on machine vision[J]. Journal of Chinese Agricultural Mechanization,2018,39(2):83-88.
[5] 楊健,楊嘯治,熊串,劉力. 基于改進YOLOv5的番茄果實識別估產方法[J]. 中國瓜菜,2024,37(6):61-68.
YANG Jian,YANG Xiaozhi,XIONG Chuan,LIU Li. An improved YOLOv5-based method for tomato fruit identification and yield estimation[J]. China Cucurbits and Vegetables,2024,37(6):61-68.
[6] DENG J,XUAN X J,WANG W F,LI Z,YAO H W,WANG Z Q. A review of research on object detection based on deep learning[J]. Journal of Physics:Conference Series,2020,1684(1):012028.
[7] DU L X,ZHANG R Y,WANG X T. Overview of two-stage object detection algorithms[C]//Journal of Physics:Conference Series. IOP Publishing,2020,1544(1):012033.
[8] 蔣煥煜,彭永石,申川,應義斌. 基于雙目立體視覺技術的成熟番茄識別與定位[J]. 農業(yè)工程學報,2008,24(8):279-283.
JIANG Huanyu,PENG Yongshi,SHEN Chuan,YING Yibin. Recognizing and locating ripe tomatoes based on binocular stereovision technology[J]. Transactions of the Chinese Society of Agricultural Engineering,2008,24(8):279-283.
[9] LIU X Y,ZHAO D A,JIA W K,JI W,SUN Y P. A detection method for apple fruits based on color and shape features[J]. IEEE Access,2019,7:67923-67933.
[10] 夏康利,何強. 基于顏色統(tǒng)計的水果采摘機器人水果識別的研究[J]. 南方農機,2022,53(24):11-16.
XIA Kangli,HE Qiang. Research on fruit recognition for fruit-picking robots based on color statistics[J]. China Southern Agricultural Machinery,2022,53(24):11-16.
[11] COMANICIU D,MEER P. Mean shift:A robust approach toward feature space analysis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2002,24(5):603-619.
[12] 鄒偉. 基于機器視覺技術的柑橘果實成熟度分選研究[J]. 農業(yè)與技術,2023,43(17):41-44.
ZOU Wei. Research on citrus fruit maturity sorting based on machine vision technology[J]. Agriculture and Technology,2023,43(17):41-44.
[13] 陳雪鑫,卜慶凱. 基于多顏色和局部紋理的水果識別算法研究[J]. 青島大學學報(工程技術版),2019,34(3):52-58.
CHEN Xuexin,BU Qingkai. Research on fruit recognition algorithm based on multi-color and local texture[J]. Journal of Qingdao University (Engineering & Technology Edition),2019,34(3):52-58.
[14] 徐惠榮,葉尊忠,應義斌. 基于彩色信息的樹上柑橘識別研究[J]. 農業(yè)工程學報,2005,21(5):98-101.
XU Huirong,YE Zunzhong,YING Yibin. Identification of citrus fruit in a tree canopy using color information[J]. Transactions of the Chinese Society of Agricultural Engineering,2005,21(5):98-101.
[15] FAN Q,ZHUO W,TANG C K,TAI Y W. Few-shot object detection with attention-RPN and multi-relation detector[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle,WA,USA:IEEE,2020:4012-4021.
[16] LIU W,ANGUELOV D,ERHAN D,SZEGEDY C,REED S,FU C Y,BERG A C. SSD:Single shot MultiBox detector[C]//Computer Vision-ECCV 2016:14th European Conference. Amsterdam,The Netherlands:Springer International Publishing,2016:21-37.
[17] HOWARD A G,ZHU M L,CHEN B,KALENICHENKO D,WANG W J,WEYAND T,ANDREETTO M,ADAM H,HEATON J. MobileNets:Efficient convolutional neural networks for mobile vision applications[EB/OL]. 2017:1704.04861. https://arxiv.org/abs/1704.04861v1.
[18] ZHANG X Y,ZHOU X Y,LIN M X,SUN J. ShuffleNet:An extremely efficient convolutional neural network for mobile devices[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City,UT,USA:IEEE,2018:6848-6856.
[19] LIU Z,LIN Y T,CAO Y,HU H,WEI Y X,ZHANG Z,LIN S,GUO B N. Swin transformer:Hierarchical vision transformer using shifted windows[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal,QC,Canada:IEEE,2021:9992-10002.
[20] REDMON J,DIVVALA S,GIRSHICK R,FARHADI A. You only look once:Unified,real-time object detection[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas,NV,USA:IEEE,2016:779-788.
[21] REDMON J,FARHADI A. YOLO9000:Better,faster,stronger[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu,HI,USA:IEEE,2017:6517-6525.
[22] REDMON J,FARHADI A. YOLOv3:An incremental improvement[EB/OL]. 2018:1804.02767. https://arxiv.org/abs/1804.02767v1.
[23] LIN T Y,DOLLáR P,GIRSHICK R,HE K M,HARIHARAN B,BELONGIE S. Feature pyramid networks for object detection[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). July 21-26,2017,Honolulu,HI,USA. IEEE,2017:936-944.
[24] BOCHKOVSKIY A,WANG C Y,LIAO H Y M. YOLOv4:Optimal speed and accuracy of object detection[EB/OL]. 2020:2004.10934. https://arxiv.org/abs/2004.10934v1
[25] JOCHER G. YOLOv5 by Ultralytics (Version 7.0)[CP]. 2020. https://doi.org/10.5281/zenodo.3908559.
[26] LI C Y,LI L L,JIANG H L,WENG K H,GENG Y F,LI L,KE Z D,LI Q Y,CHENG M,NIE W Q,LI Y D,ZHANG B,LIANG Y F,ZHOU L Y,XU X M,CHU X X,WEI X M,WEI X L. YOLOv6:A single-stage object detection framework for industrial applications[EB/OL]. 2022:2209.02976. https://arxiv.org/abs/2209.02976.
[27] WANG C Y,BOCHKOVSKIY A,LIAO H Y M. YOLOv7:Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver,BC,Canada:IEEE,2023:7464-7475.
[28] VARGHESE R,M S. YOLOv8:A novel object detection algorithm with enhanced performance and robustness[C]//2024 International Conference on Advances in Data Engineering and Intelligent Computing Systems (ADICS). Chennai,India:IEEE,2024:1-6.
[29] 宋中山,劉越,鄭祿,帖軍,汪進. 基于改進YOLOV3的自然環(huán)境下綠色柑橘的識別算法[J]. 中國農機化學報,2021,42(11):159-165.
SONG Zhongshan,LIU Yue,ZHENG Lu,TIE Jun,WANG Jin. Identification of green citrus based on improved YOLOV3 in natural environment[J]. Journal of Chinese Agricultural Mechanization,2021,42(11):159-165.
[30] HUANG G,LIU Z,VAN DER MAATEN L,WEINBERGER K Q. Densely connected convolutional networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu,HI,USA:IEEE,2017:2261-2269.
[31] 呂強,林剛,蔣杰,王明之,張皓楊,易時來. 基于改進YOLOv5s模型的自然場景中綠色柑橘果實檢測[J]. 農業(yè)工程學報,2024,40(18):147-154.
Lü Qiang,LIN Gang,JIANG Jie,WANG Mingzhi,ZHANG Haoyang,YI Shilai. Detecting green citrus fruit in natural scenes using improved YOLOv5s model[J]. Transactions of the Chinese Society of Agricultural Engineering,2024,40(18):147-154.
[32] LIU Y C,SHAO Z R,HOFFMANN N. Global attention mechanism:Retain information to enhance channel-spatial interactions[EB/OL]. 2021:2112.05561. https://arxiv.org/abs/2112.05561v1
[33] BODLA N,SINGH B,CHELLAPPA R,DAVIS L S. Soft-NMS-improving object detection with one line of code[C]//2017 IEEE International Conference on Computer Vision (ICCV). Venice:IEEE,2017:5561-5569.
[34] 帖軍,趙捷,鄭祿,吳立鋒,洪博文. 改進YOLOv5模型在自然環(huán)境下柑橘識別的應用[J]. 中國農業(yè)科技導報,2024,26(7):111-120.
TIE Jun,ZHAO Jie,ZHENG Lu,WU Lifeng,HONG Bowen. Application of improved YOLOv5 model in citrus recognition in natural environment[J]. Journal of Agricultural Science and Technology,2024,26(7):111-120.
[35] HU J,SHEN L,SUN G. Squeeze-and-excitation networks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City,UT,USA:IEEE,2018:7132-7141.
[36] HOU Q B,ZHOU D Q,FENG J S. Coordinate attention for efficient mobile network design[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville,TN,USA:IEEE,2021:13708-13717.
[37] ZHANG H Y,WANG Y,DAYOUB F,SüNDERHAUF N. VarifocalNet:An IoU-aware dense object detector[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville,TN,USA:IEEE,2021:8510-8519.
[38] 馬帥,張艷,周桂紅,劉博. 基于改進YOLOv4模型的自然環(huán)境下梨果實識別[J]. 河北農業(yè)大學學報,2022,45(3):105-111.
MA Shuai,ZHANG Yan,ZHOU Guihong,LIU Bo. Recognition of pear fruit under natural environment using an improved YOLOv4 model[J]. Journal of Hebei Agricultural University,2022,45(3):105-111.
[39] 劉忠意,魏登峰,李萌,周紹發(fā),魯力,董雨雪. 基于改進YOLOv5的橙子果實識別方法[J]. 江蘇農業(yè)科學,2023,51(19):173-181.
LIU Zhongyi,WEI Dengfeng,LI Meng,ZHOU Shaofa,LU Li,DONG Yuxue. Orange fruit recognition method based on improved YOLOv5[J]. Jiangsu Agricultural Sciences,2023,51(19):173-181.
[40] HAN K,WANG Y H,TIAN Q,GUO J Y,XU C J,XU C. GhostNet:More features from cheap operations[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle,WA,USA:IEEE,2020:1577-1586.
[41] WANG Q L,WU B G,ZHU P F,LI P H,ZUO W M,HU Q H. ECA-net:efficient channel attention for deep convolutional neural networks[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle,WA,USA:IEEE,2020:11531-11539.
[42] 賀英豪,唐德釗,倪銘,蔡起起. 基于改進YOLOv5對果園環(huán)境中李的識別[J]. 華中農業(yè)大學學報,2024,43(5):31-40.
HE Yinghao,TANG Dezhao,NI Ming,CAI Qiqi. Recognizing plums in orchard environment based on improved YOLOv5[J]. Journal of Huazhong Agricultural University,2024,43(5):31-40.
[43] 黃彤鑌,黃河清,李震,呂石磊,薛秀云,代秋芳,溫威. 基于YOLOv5改進模型的柑橘果實識別方法[J]. 華中農業(yè)大學學報,2022,41(4):170-177.
HUANG Tongbin,HUANG Heqing,LI Zhen,Lü Shilei,XUE Xiuyun,DAI Qiufang,WEN Wei. Citrus fruit recognition method based on the improved model of YOLOv5[J]. Journal of Huazhong Agricultural University,2022,41(4):170-177.
[44] WOO S,PARK J,LEE J Y,KWEON I S. CBAM:Convolutional block attention module[EB/OL]. 2018:1807.06521. https://arxiv.org/abs/1807.06521v2
[45] HE J B,ERFANI S,MA X J,BAILEY J,CHI Y,HUA X S. Alpha-IoU:A family of power intersection over union losses for bounding box regression[EB/OL]. 2021:2110.13675. https://arxiv.org/abs/2110.13675v2
[46] 苑迎春,張傲,何振學,張若晨,雷浩. 基于改進YOLOv4-tiny的果園復雜環(huán)境下桃果實實時識別[J]. 中國農機化學報,2024,45(8):254-261.
YUAN Yingchun,ZHANG Ao,HE Zhenxue,ZHANG Ruochen,LEI Hao. Peach fruit real-time recognition in complex orchard environment based on improved YOLOv4-tiny[J]. Journal of Chinese Agricultural Mechanization,2024,45(8):254-261.
[47] ZHU L,DENG Z J,HU X W,FU C W,XU X M,QIN J,HENG P A. Bidirectional feature pyramid network with recurrent attention residual modules for shadow detection[C]//Proceedings of the European Conference on Computer Vision (ECCV). Cham:Springer International Publishing,2018:122-136.
[48] 熊俊濤,鄭鎮(zhèn)輝,梁嘉恩,鐘灼,劉柏林,孫寶霞. 基于改進YOLOv3網絡的夜間環(huán)境柑橘識別方法[J]. 農業(yè)機械學報,2020,51(4):199-206.
XIONG Juntao,ZHENG Zhenhui,LIANG Jia’en,ZHONG Zhuo,LIU Bolin,SUN Baoxia. Citrus detection method in night environment based on improved YOLOv3 network[J]. Transactions of the Chinese Society for Agricultural Machinery,2020,51(4):199-206.
[49] HE K M,ZHANG X Y,REN S Q,SUN J. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas,NV,USA:IEEE,2016:770-778.
[50] 熊俊濤,霍釗威,黃啟寅,陳浩然,楊振剛,黃煜華,蘇穎苗. 結合主動光源和改進YOLOv5s模型的夜間柑橘檢測方法[J]. 華南農業(yè)大學學報,2024,45(1):97-107.
XIONG Juntao,HUO Zhaowei,HUANG Qiyin,CHEN Haoran,YANG Zhengang,HUANG Yuhua,SU Yingmiao. Detection method of citrus in nighttime environment combined with active light source and improved YOLOv5s model[J]. Journal of South China Agricultural University,2024,45(1):97-107.
[51] 余圣新,韋瑩瑩,方輝,李敏,柴秀娟,曾志康,覃澤林. 基于改進YOLOv8的自然環(huán)境下柑橘果實識別[J]. 湖北農業(yè)科學,2024,63(8):23-27.
YU Shengxin,WEI Yingying,FANG Hui,LI Min,CHAI Xiujuan,ZENG Zhikang,QIN Zelin. Citrus fruit recognition in natural environment based on improved YOLOv8[J]. Hubei Agricultural Sciences,2024,63(8):23-27.
[52] MA S L,XU Y. MPDIoU:A loss for efficient and accurate bounding box regression[EB/OL]. 2023:2307.07662. https://arxiv.org/abs/2307.07662v1.
[53] 岳有軍,漆瀟,趙輝,王紅君. 基于改進YOLOv8的果園復雜環(huán)境下蘋果檢測模型研究[J/OL]. 南京信息工程大學學報,2024:1-13(2024-07-15). https://doi.org/10.13878/j.cnki.jnuist.20240410002.
YUE Youjun,QI Xiao,ZHAO Hui,WANG Hongjun. Research on apple detection model in complex orchard environments based on improved YOLOv8[J/OL]. Journal of Nanjing University of Information Science amp; Technology,2024:1-13(2024-07-15). https://doi.org/10.13878/j.cnki.jnuist.20240410002.
[54] ZHANG H,XU C,ZHANG S J. Inner-IoU:More effective intersection over union loss with auxiliary bounding box[EB/OL]. 2023:2311.02877. https://arxiv.org/abs/2311.02877v4.
[55] 呂石磊,盧思華,李震,洪添勝,薛月菊,吳奔雷. 基于改進YOLOv3-LITE輕量級神經網絡的柑橘識別方法[J]. 農業(yè)工程學報,2019,35(17):205-214.
Lü Shilei,LU Sihua,LI Zhen,HONG Tiansheng,XUE Yueju,WU Benlei. Orange recognition method using improved YOLOv3-LITE lightweight neural network[J]. Transactions of the Chinese Society of Agricultural Engineering,2019,35(17):205-214.
[56] REZATOFIGHI H,TSOI N,GWAK J,SADEGHIAN A,REID I,SAVARESE S. Generalized intersection over union:A metric and a loss for bounding box regression[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach,CA,USA:IEEE,2019:658-666.
[57] 王卓,王健,王梟雄,時佳,白曉平,趙泳嘉. 基于改進YOLOv4的自然環(huán)境蘋果輕量級檢測方法[J]. 農業(yè)機械學報,2022,53(8):294-302.
WANG Zhuo,WANG Jian,WANG Xiaoxiong,SHI Jia,BAI Xiaoping,ZHAO Yongjia. Lightweight real-time apple detection method based on improved YOLOv4[J]. Transactions of the Chinese Society for Agricultural Machinery,2022,53(8):294-302.
[58] 曾俊,陳仁凡,鄒騰躍. 基于改進YOLO的自然環(huán)境下桃子成熟度快速檢測模型[J]. 南方農機,2023,54(24):24-27.
ZENG Jun,CHEN Renfan,ZOU Tengyue. Rapid maturity detection model for peaches in natural environment based on improved YOLO[J]. China Southern Agricultural Machinery,2023,54(24):24-27.
[59] CHEN J R,KAO S H,HE H,ZHUO W P,WEN S,LEE C H,CHAN S H G. Run,Don’t walk:Chasing higher FLOPS for faster neural networks[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver,BC,Canada:IEEE,2023:12021-12031.
[60] LIU G L,REDA F A,SHIH K J,WANG T C,TAO A,CATANZARO B. Image inpainting for irregular holes using partial convolutions[C]//Proceedings of the European conference on computer vision (ECCV). Cham:Springer International Publishing,2018:85-100.
[61] GEVORGYAN Z. SIoU loss:More powerful learning for bounding box regression[EB/OL]. 2022:2205.12740. https://arxiv.org/abs/2205.12740v1
[62] 趙輝,喬艷軍,王紅君,岳有軍. 基于改進YOLOv3的果園復雜環(huán)境下蘋果果實識別[J]. 農業(yè)工程學報,2021,37(16):127-135.
ZHAO Hui,QIAO Yanjun,WANG Hongjun,YUE Youjun. Apple fruit recognition in complex orchard environment based on improved YOLOv3[J]. Transactions of the Chinese Society of Agricultural Engineering,2021,37(16):127-135.
[63] WANG C Y,MARK LIAO H Y,WU Y H,CHEN P Y,HSIEH J W,YEH I H. CSPNet:a new backbone that can enhance learning capability of CNN[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Seattle,WA,USA:IEEE,2020:1571-1580.
[64] YAN B,FAN P,LEI X Y,LIU Z J,YANG F Z. A real-time apple targets detection method for picking robot based on improved YOLOv5[J]. Remote Sensing,2021,13(9):1619.
[65] 王乙涵. 基于改進YOLOv7的自然環(huán)境下柑橘果實識別與定位方法研究[D]. 雅安:四川農業(yè)大學,2023.
WANG Yihan. Research on detection and localization of citrus in natural environment based on improved YOLOv7[D]. Ya’an:Sichuan agricultural university,2023.
[66] DING X H,ZHANG X Y,MA N N,HAN J G,DING G G,SUN J. RepVGG:Making VGG-style ConvNets great again[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville,TN,USA:IEEE,2021:13728-13737.
[67] YANG H W,LIU Y Z,WANG S W,QU H X,LI N,WU J,YAN Y F,ZHANG H J,WANG J X,QIU J F. Improved apple fruit target recognition method based on YOLOv7 model[J]. Agriculture,2023,13(7):1278.
[68] 張震,周俊,江自真,韓宏琪. 基于改進YOLOv7輕量化模型的自然果園環(huán)境下蘋果識別方法[J]. 農業(yè)機械學報,2024,55(3):231-242.
ZHANG Zhen,ZHOU Jun,JIANG Zizhen,HAN Hongqi. Lightweight apple recognition method in natural orchard environment based on improved YOLOv7 model[J]. Transactions of the Chinese Society for Agricultural Machinery,2024,55(3):231-242.
[69] XUE J K,SHEN B. A novel swarm intelligence optimization approach:Sparrow search algorithm[J]. Systems Science amp; Control Engineering,2020,8(1):22-34.
[70] GIRSHICK R,DONAHUE J,DARRELL T,MALIK J. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus,OH,USA:IEEE,2014:580-587.
[71] HEARST M A,DUMAIS S T,OSUNA E,PLATT J,SCHO-LKOPF B. Support vector machines[J]. IEEE Intelligent Systems and Their Applications,1998,13(4):18-28.
[72] HE K M,ZHANG X Y,REN S Q,SUN J. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2015,37(9):1904-1916.
[73] GIRSHICK R. Fast R-CNN[C]//2015 IEEE International Conference on Computer Vision (ICCV). Santiago,Chile:IEEE,2015:1440-1448.
[74] REN S Q,HE K M,GIRSHICK R,SUN J. Faster R-CNN:Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence,2017,39(6):1137-1149.
[75] DAI J F,LI Y,HE K M,SUN J. R-FCN:Object detection via region-based fully convolutional networks[C]. Proceedings of the 30th International Conference on Neural Information Processing Systems,2016:379-387.
[76] CAI Z W,VASCONCELOS N. Cascade R-CNN:Delving into high quality object detection[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City,UT,USA:IEEE,2018:6154-6162.
[77] 任會,朱洪前. 基于深度學習的目標橘子識別方法研究[J]. 計算機時代,2021(1):57-60.
REN Hui,ZHU Hongqian. Research on the method of identifying target orange with deep learning[J]. Computer Era,2021(1):57-60.
[78] WAN S H,GOUDOS S. Faster R-CNN for multi-class fruit detection using a robotic vision system[J]. Computer Networks,2020,168:107036.
[79] LIU J,ZHAO M R,GUO X F. A fruit detection algorithm based on R-FCN in natural scene[C]//2020 Chinese Control and Decision Conference (CCDC). Hefei,China:IEEE,2020:487-492.
[80] 黃磊磊,苗玉彬. 基于深度學習的重疊柑橘分割與形態(tài)復原[J]. 農機化研究,2023,45(10):70-75.
HUANG Leilei,MIAO Yubin. Overlapping citrus segmentation and morphological restoration based on deep learning[J]. Journal of Agricultural Mechanization Research,2023,45(10):70-75.
[81] 荊偉斌,李存軍,競霞,趙葉,程成. 基于深度學習的蘋果樹側視圖果實識別[J]. 中國農業(yè)信息,2019,31(5):75-83.
JING Weibin,LI Cunjun,JING Xia,ZHAO Ye,CHENG Cheng. Fruit identification with apple tree side view based on deep learning[J]. China Agricultural Informatics,2019,31(5):75-83.
[82] 賈艷平,桑妍麗,李月茹. 基于改進Faster R-CNN模型的水果分類識別[J]. 食品與機械,2023,39(8):129-135.
JIA Yanping,SANG Yanli,LI Yueru. Fruit identification using improved Faster R-CNN model[J]. Food amp; Machinery,2023,39(8):129-135.
[83] LU J Q,YANG R F,YU C R,LIN J H,CHEN W D,WU H W,CHEN X,LAN Y B,WANG W X. Citrus green fruit detection via improved feature network extraction[J]. Frontiers in Plant Science,2022,13:946154.
[84] MIN W Q,WANG Z L,YANG J H,LIU C L,JIANG S Q. Vision-based fruit recognition via multi-scale attention CNN[J]. Computers and Electronics in Agriculture,2023,210:107911.
[85] GOODFELLOW I J,POUGET-ABADIE J,MIRZA M,XU B,WARDE-FARLEY D,OZAIR S,COURVILLE A,BENGIO Y. Generative adversarial networks[J]. Communications of the ACM,2020,63(11):139-144.
[86] 戈明輝,張俊,陸慧娟. 基于機器視覺的食品外包裝缺陷檢測算法研究進展[J]. 食品與機械,2023,39(9):95-102.
GE Minghui,ZHANG Jun,LU Huijuan. Research progress of food packaging defect detection based on machine vision[J]. Food amp; Machinery,2023,39(9):95-102.
[87] 任磊,張俊,陸勝民. 脫囊衣橘片自動分揀機器視覺算法研究[J]. 浙江農業(yè)學報,2015,27(12):2212-2217.
REN Lei,ZHANG Jun,LU Shengmin. Research on machine vision algorithm for automatic sorting of membrane-removed mandarin segments[J]. Acta Agriculturae Zhejiangensis,2015,27(12):2212-2217.
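Received: 2024-06-18  Accepted: 2024-12-06
Funding: National Citrus Industry Technology System (CARS-26-29)
About the author: LI Hui, male, master's degree, mainly engaged in research on detection algorithms for picking robots based on 3D vision. E-mail: 3023763876@qq.com
*Author for correspondence. E-mail: hunterzju@163.com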