Design and Implementation of Intelligent Retail Cabinet Based on Double Neural Network Model
ZENG Min1, WU Sheng-jian2, LI Fang1, CHEN Zhi1
(1. Dept. of Communication and Information Engineering, Shanghai Technical Institute of Electronics & Information, Shanghai 201411, China; 2. FinVolution Group, Shanghai 201203, China)
Abstract: In recent years, image recognition based on deep learning models has become the main solution for intelligent retail cabinets. This paper presents a new intelligent retail cabinet system based on a dual neural network model. Compared with a single-model design, the system not only significantly improves detection recall and classification accuracy, but also greatly reduces the model retraining time incurred when new product varieties are added. First, a Faster RCNN model performs detection and coarse classification of products (by packaging type) to improve detection recall; second, a ResNet50 model performs fine classification of products (by variety) to improve classification accuracy. In addition, ablation experiments with several data augmentation methods were carried out on the hardest-to-classify variety set, in order to improve classification accuracy on the coarse-category dataset to which that set belongs.
Key words: deep learning; image detection; image classification; intelligent retail cabinets; neural network model
CLC number: TP181    Document code: A
Article ID: 1009-3044(2021)26-0009-05
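In recent years, unmanned retail, a convenient new retail format, has developed considerably in many Chinese cities. According to the forecast in the "In-depth Research Report on Business Model Innovation and Investment Opportunities in China's New Retail Industry" published by the Qianzhan Industry Research Institute, unmanned retail will reach 245 million users in 2022, with transaction volume exceeding 1.8 trillion yuan [1]. The rapid growth of unmanned retail has benefited from the development and convergence of multiple technologies, in particular the spread of mobile payment and the practical deployment of advanced technologies such as artificial intelligence and cloud computing [2].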
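At present, unattended retail cabinets in China are implemented with four kinds of technology [3,5]: (1) Mechanical vending machines, represented by Ubox (友寶). These appeared early, are technically simple and mature as products, but are relatively expensive to manufacture, and the purchasing process is comparatively cumbersome. (2) RFID (Radio Frequency Identification) retail cabinets, represented by MissFresh (每日優鮮). The technology is mature and the market share is high, but RFID tags are also costly to produce. (3) Gravity-sensing retail cabinets, represented by JD Daojia (京東到家). These rely on weight sensing to identify the category and price of goods; goods can be placed freely and space utilization is high, but the load sensors must be highly sensitive. (4) Visual recognition retail cabinets, represented by DeepBlue (深蘭) and Gouya (購呀). These rely mainly on image recognition, can adapt to complex and varied consumption scenarios, and represent the future direction of intelligent retail [6]. Visual recognition cabinets are further divided into dynamic and static types. DeepBlue is known for 3D dynamic vision technology, and its TakeGo is similar to Amazon Go: beyond using a larger neural network model, raising the recognition rate also requires error-correction algorithms to reduce recognition errors from behaviors such as a user taking several items with one hand. Its equipment cost and computational load are both higher than those of static recognition, which makes it hard to scale up in the market. Gouya currently focuses on static recognition cabinets, whose equipment is simple, low in cost, and easy to deploy at scale [3-4]. The technical difficulty of such low-cost unattended cabinets, however, is how to raise the detection recall and classification accuracy for the goods on sale. To this end, this paper designs a new intelligent retail cabinet system based on a dual neural network model. Its sales flow is shown in Fig. 1: the customer scans a code with a mobile phone to open the door and takes goods unassisted; once the door is closed, the system recognizes the goods automatically and settles the charge. With limited hardware support, the system aims to use the dual neural network model to bring detection recall and classification accuracy up to the level required for commercial deployment.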
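The dual-model flow described above (coarse detection by packaging type, then fine classification by variety, then settlement after the door closes) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The detector and classifiers are toy stand-ins for the Faster RCNN and ResNet50 models. The table of one fine classifier per packaging class, and the use of a before/after count difference for settlement, are assumptions. All names (`Detection`, `recognize`, `settle`, the toy products and prices) are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2)
Image = List[List[int]]           # toy stand-in for a real image array


@dataclass
class Detection:
    box: Box
    coarse_label: str   # packaging type, e.g. "bottle" (stage 1 output)
    score: float


def crop(image: Image, box: Box) -> Image:
    """Cut the detected region out of the shelf image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def recognize(image: Image,
              detector: Callable[[Image], List[Detection]],
              classifiers: Dict[str, Callable[[Image], str]],
              min_score: float = 0.5) -> Counter:
    """Stage 1 finds items and their packaging class; stage 2 names the variety."""
    counts: Counter = Counter()
    for det in detector(image):
        if det.score < min_score:
            continue
        # Assumption: one fine classifier per packaging class.
        variety = classifiers[det.coarse_label](crop(image, det.box))
        counts[variety] += 1
    return counts


def settle(before: Counter, after: Counter, prices: Dict[str, int]):
    """Assumed settlement rule: charge for items present before but not after."""
    taken = before - after   # Counter subtraction drops non-positive counts
    total = sum(prices[name] * n for name, n in taken.items())
    return taken, total


# --- Toy stand-ins (hypothetical): each grid cell holds an item code, 0 = empty.
def toy_detector(image: Image) -> List[Detection]:
    dets = []
    for y, row in enumerate(image):
        for x, code in enumerate(row):
            if code:
                coarse = "bottle" if code in (1, 2) else "bag"
                dets.append(Detection((x, y, x + 1, y + 1), coarse, 0.9))
    return dets


toy_classifiers = {
    "bottle": lambda c: {1: "cola", 2: "water"}[c[0][0]],
    "bag": lambda c: {3: "chips"}[c[0][0]],
}

if __name__ == "__main__":
    prices = {"cola": 300, "water": 200, "chips": 500}   # prices in fen
    shelf_before = [[1, 2, 0], [3, 1, 0]]
    shelf_after = [[1, 0, 0], [3, 0, 0]]
    before = recognize(shelf_before, toy_detector, toy_classifiers)
    after = recognize(shelf_after, toy_detector, toy_classifiers)
    taken, total = settle(before, after, prices)
    print(dict(taken), total)
```

In this sketch, a new variety only touches the small classifier for its packaging class and leaves the detector unchanged, which is one way to read the reduced retraining time claimed for the dual-model design. The paper's actual arrangement may differ.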