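Removing the Effect of Soil Moisture on Prediction of Soil Organic Matter with Hyperspectral Reflectance Using External Parameter Orthogonalization*
HONG Yongsheng1,2 YU Lei1,2† ZHU Yaxing1,2 WU Hongxia1,2 NIE Yan1,2 ZHOU Yong1,2 QI Feng3 XIA Tian1,2
(1 Key Laboratory for Geographical Process Analysis & Simulation, Hubei Province, Central China Normal University, Wuhan 430079, China)
(2 College of Urban & Environmental Science, Central China Normal University, Wuhan 430079, China)
(3 School of Environmental and Sustainability Sciences, Kean University, NJ 07083, USA)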
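Rapid spectral prediction of soil organic matter in the field must account for the influence of soil moisture content. In a laboratory rewetting experiment, soil spectra were acquired at nine soil moisture content levels (0 to 32%, in 4% steps), and the effect of changing soil moisture content on the spectra was analyzed. External parameter orthogonalization (EPO) was then used to correct the wet-soil spectra, and soil organic matter prediction models were built with partial least squares regression and support vector machine regression. The results show that soil spectral reflectance decreases nonlinearly as soil moisture content increases; without correction, the partial least squares regression model has a ratio of prediction to deviation (RPD) of 1.16 and is unusable. After EPO correction, the spectral differences among moisture levels are reduced and soil organic matter can be estimated effectively across moisture levels, with the RPD of the partial least squares regression and support vector machine regression models rising to 1.76 and 2.15, respectively. The results provide a reference for rapid field prediction of soil organic matter.
Soil spectra; Organic matter; Moisture content; External parameter orthogonalization; Support vector machine regression; Jianghan Plain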
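Soil organic matter is an important indicator of soil physicochemical properties and a key basis for guiding scientific fertilization in agriculture [1]. Acquiring soil organic matter information over large areas rapidly, accurately, and at low cost is an important foundation for modern agricultural management. In recent years, proximal soil sensing, typified by visible and near-infrared (Vis-NIR) spectroscopy, has offered a new way to obtain soil organic matter information thanks to its distinctive advantages (rapid, convenient, environmentally friendly, non-destructive), and it is becoming an ideal means of dynamically monitoring soil organic matter [2-3].
Compared with laboratory spectral measurement, in situ field monitoring can improve the efficiency of data analysis and save the time spent on collecting and preparing soil samples (air-drying, grinding, sieving). However, field spectral measurements are easily disturbed by external environmental factors (such as soil moisture, ambient temperature, and surface roughness), and models built on such data are usually less accurate than those obtained under relatively stable laboratory conditions [4]. Changes in soil moisture content in particular affect both the spectral expression and the quantitative prediction of soil organic matter [5]. Overcoming the influence of soil moisture content on soil spectra is therefore essential.
External parameter orthogonalization (EPO) was first proposed by Roger et al. [6], who applied it to correct for the effect of temperature on the spectral prediction of sugar content in fruit. Minasny et al. [7] adapted the method to soil, and their results showed that EPO combined with partial least squares regression can accurately estimate soil organic carbon from wet-soil spectra. Ge et al. [8], Ji et al. [9], Ackerson et al. [10], and Liu et al. [11] further examined how well EPO removes the moisture effect in the spectral estimation of various soil physical and chemical properties. To date, spectra transformed by EPO have mostly been modeled with a linear method, partial least squares regression, and combinations with nonlinear methods are uncommon. In addition, the key EPO parameter, the dimension g, strongly influences the spectral transformation, so a quantitative analysis of its effect is important.
Accordingly, this study used a laboratory rewetting experiment on soil samples to simulate changes in soil moisture content, in order to 1) qualitatively examine the effect of changing soil moisture content on soil spectra, and 2) quantitatively evaluate the accuracy of soil organic matter prediction when EPO is combined with linear and nonlinear models.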
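From 2010 to 2015, topsoil samples (0 to 20 cm depth) were collected in Gong'an County and Qianjiang City on the Jianghan Plain. After foreign material such as small gravel, surface litter, and animal remains had been removed, the samples were taken to the laboratory, air-dried, ground, and passed through a 2 mm sieve, and a sample library was established (234 samples). Each sample was thoroughly mixed and divided into two portions, one for soil chemical analysis and one for spectral measurement. Soil organic matter content was determined by the potassium dichromate volumetric method with external heating [12].
From the sample library, 122 samples were randomly selected to form set S0. A further 95 fluvo-aquic soil samples from Zhugentan Town, Qianjiang City were then selected; their parent material is calcareous recent river alluvium, and the surface texture is mainly sandy loam. These samples were sorted by soil organic matter content in ascending order, and 60 were picked at intervals of 2 or 3 samples to form S1; the remaining 35 formed S2.
Gravimetric soil water content was used as the soil moisture content, and nine moisture levels were designed (0, 4%, 8%, 12%, 16%, 20%, 24%, 28%, 32%). For each sample, 150 g of oven-dried soil (soil moisture content ≈ 0) was placed in a cylindrical black box 7.5 cm in diameter and 5 cm deep for spectral measurement, giving the dry-soil spectrum (dry sample). Each sample was then wetted in 4% increments of soil moisture content (about 6 g of water sprayed evenly with a sprayer) and immediately sealed to prevent evaporation. After standing for 8 h so that the water was fully absorbed, the wet-soil spectrum was measured and the sample was weighed to calculate the actual soil moisture content. The wetting step was repeated until all moisture levels had been measured.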
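The arithmetic behind the wetting schedule can be checked with a short sketch. It assumes gravimetric moisture content is defined as water mass divided by oven-dry soil mass; the function names `wetting_plan` and `actual_moisture` are hypothetical and only illustrate the calculation.

```python
def wetting_plan(dry_mass_g, levels):
    """Return (level, cumulative water in g, water to add in g) for each target level."""
    plan = []
    previous = 0.0
    for level in levels:
        cumulative = dry_mass_g * level          # water mass = level x dry soil mass
        plan.append((level, cumulative, cumulative - previous))
        previous = cumulative
    return plan


def actual_moisture(wet_mass_g, dry_mass_g):
    """Gravimetric moisture content computed from the weighed sample."""
    return (wet_mass_g - dry_mass_g) / dry_mass_g


levels = [i * 0.04 for i in range(9)]            # 0, 4%, ..., 32%
plan = wetting_plan(150.0, levels)
for level, total, added in plan:
    print(f"{level:.0%}: total water {total:.1f} g, add {added:.1f} g")
```

For 150 g of oven-dried soil each 4% step corresponds to about 6 g of water, which matches the amount stated above.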
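Vis-NIR reflectance was measured with an ASD FieldSpec3 spectroradiometer (Analytical Spectral Devices, USA). The instrument covers 350 to 2 500 nm, with a sampling interval of 1.4 nm (350 to 1 000 nm) and 2 nm (1 000 to 2 500 nm). It was equipped with a 50 W halogen lamp and a fiber-optic probe with a ?° field of view. The distance from the light source to the soil surface (L) was 50 cm, the incidence angle of the light source (A) was 30°, and the distance from the probe to the soil surface (H) was 15 cm [13]. Measurements were made in a dark room. Before measurement the instrument was calibrated against a standard white reference panel. Each sample was measured in four directions (rotated three times, 90° each time), five spectra were saved per direction, and the resulting 20 spectra were arithmetically averaged to give the reflectance spectrum of the sample.
S0 contains 122 spectra (122 samples), S1 contains 540 spectra (60 samples × 9 moisture levels), and S2 contains 315 spectra (35 samples × 9 moisture levels).
The raw reflectance spectra of the three sample sets were smoothed and denoised with a second-order, 11-point Savitzky-Golay (SG) filter [14]. For each sample, the 400 to 2 400 nm range was retained for analysis, and every 10 adjacent bands of the full spectrum were averaged.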
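The two preprocessing operations can be sketched with NumPy alone. This is an illustration under assumptions, not the software used in the study: the helper names are hypothetical, the first and last five points are left unsmoothed for simplicity, and an incomplete trailing block is dropped during averaging.

```python
import numpy as np


def savgol_weights(window=11, order=2):
    """Least-squares weights that give the fitted polynomial value at the window centre."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    design = np.vander(offsets, order + 1, increasing=True)   # columns: 1, x, x^2
    return np.linalg.pinv(design)[0]


def sg_smooth(spectrum, window=11, order=2):
    """Savitzky-Golay smoothing; edge points are copied unchanged (a simplification)."""
    spectrum = np.asarray(spectrum, dtype=float)
    half = window // 2
    weights = savgol_weights(window, order)
    smoothed = spectrum.copy()
    # np.convolve flips its second argument, so reverse the weights first.
    smoothed[half:-half] = np.convolve(spectrum, weights[::-1], mode="valid")
    return smoothed


def block_average(spectrum, block=10):
    """Average every `block` adjacent bands, dropping an incomplete trailing block."""
    spectrum = np.asarray(spectrum, dtype=float)
    n_blocks = spectrum.size // block
    return spectrum[: n_blocks * block].reshape(n_blocks, block).mean(axis=1)


# A quadratic is reproduced exactly by a second-order filter.
x = np.arange(50.0)
quadratic = 0.01 * x**2 - 0.3 * x + 2.0
smoothed = sg_smooth(quadratic)
reduced = block_average(np.arange(20.0))
```

In practice a library routine such as `scipy.signal.savgol_filter` would normally be preferred, since it also handles the edges.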
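Four methods were used to analyze the effect of soil moisture content on reflectance: difference spectra, continuum removal, two-dimensional correlation spectroscopy, and principal component analysis. Two-dimensional correlation spectroscopy reflects intermolecular interactions, provides more structural information than one-dimensional spectra, and clearly resolves overlapping peaks [15]; synchronous two-dimensional correlation spectra were used in this study. Principal component analysis transforms the data so as to retain components with large variance and much information and discard those with little information; it was computed with the nonlinear iterative partial least squares (NIPALS) algorithm, which is simple and fast [16].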
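A synchronous two-dimensional correlation spectrum can be computed as follows. The sketch follows the generalized formulation of Noda [15] and assumes the mean spectrum over all moisture levels is used as the reference; the function name and the toy data are hypothetical.

```python
import numpy as np


def synchronous_2d_correlation(spectra):
    """spectra: (n_levels, n_bands) array, one spectrum per perturbation (moisture) level."""
    spectra = np.asarray(spectra, dtype=float)
    dynamic = spectra - spectra.mean(axis=0)      # dynamic spectra relative to the mean
    return dynamic.T @ dynamic / (spectra.shape[0] - 1)


# Toy data: nine moisture levels, five bands, only band 3 responds to moisture.
levels = np.arange(9) * 0.04
spectra = np.full((9, 5), 0.4)
spectra[:, 3] -= levels
sync = synchronous_2d_correlation(spectra)
```

The diagonal of `sync` holds the autocorrelation peaks, so the band that changes most with moisture shows the largest diagonal value.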
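In general, wet-soil spectra affected by soil moisture content can be written in matrix form as:
$X = XP + XQ + R$
where P is the projection matrix onto the part of X (the wet-soil spectral matrix) that carries useful information for soil organic matter; Q is the projection matrix onto the part that is not useful (affected by soil moisture content); and R is the residual matrix. The idea of EPO is to extract from X the projection matrix P of the useful spectral information. P was developed from S1 as follows: (1) for the dry samples (n×m), compute the mean dry spectrum (1×m); (2) for the wet samples (N×m), compute the mean spectrum at each moisture level (h×m); (3) compute the difference D (h×m) between the dry and wet mean spectra; (4) run a principal component analysis on $D^{\mathrm{T}}D$ to obtain the matrix V (m×m); (5) choose the EPO dimension g and take a subset $V_s$ (m×g) of V; (6) compute $Q = V_s V_s^{\mathrm{T}}$; (7) compute the projection matrix $P = I - Q$, where I is the identity matrix. Here n = 60 (number of samples in S1), h = 8 (number of wet moisture levels), N = n×h = 480 (total number of wet samples), m = 201 (number of wavelength variables), and g is the EPO dimension.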
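The seven steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the function name `epo_projection`, its arguments, and the synthetic data are hypothetical; the eigenvectors of D^T D are obtained as the right singular vectors of D; and Vs is taken to be the first g of them, which is the usual choice.

```python
import numpy as np


def epo_projection(dry, wet, levels, g):
    """Build the EPO projection matrix P.

    dry    : (n, m) dry-soil spectra
    wet    : (N, m) wet-soil spectra
    levels : (N,) moisture level of each wet spectrum
    g      : EPO dimension
    """
    dry = np.asarray(dry, dtype=float)
    wet = np.asarray(wet, dtype=float)
    levels = np.asarray(levels)

    dry_mean = dry.mean(axis=0)                               # step 1: (m,)
    wet_means = np.vstack([wet[levels == lv].mean(axis=0)
                           for lv in np.unique(levels)])      # step 2: (h, m)
    D = wet_means - dry_mean                                  # step 3: (h, m)

    # Step 4: eigenvectors of D^T D are the right singular vectors of D,
    # returned in order of decreasing singular value.
    _, _, vt = np.linalg.svd(D, full_matrices=True)           # vt: (m, m)
    Vs = vt[:g].T                                             # step 5: (m, g)
    Q = Vs @ Vs.T                                             # step 6
    return np.eye(D.shape[1]) - Q                             # step 7: P = I - Q


# Synthetic check: moisture subtracts a fixed spectral signature w.
rng = np.random.default_rng(0)
n, m = 6, 20
dry = rng.random((n, m))
w = np.linspace(0.1, 1.0, m)
level_values = np.array([0.04, 0.08, 0.12])
wet = np.vstack([dry - lv * w for lv in level_values])
levels = np.repeat(level_values, n)

P = epo_projection(dry, wet, levels, g=1)
corrected_wet = wet @ P      # transformed wet spectra
corrected_dry = dry @ P      # dry spectra are transformed with the same P
```

Because D has only h rows, its rank is at most h, so eigenvectors beyond the h-th span directions in which the mean wet-minus-dry differences have no component.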
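Partial least squares regression is a multivariate statistical method and one of the most common approaches for estimating soil properties from spectra; it is a linear modeling technique. In this study the optimal number of factors in the regression model was determined by leave-one-out cross-validation to prevent overfitting [17].
Support vector machine regression maps the low-dimensional nonlinear input of the original sample space into a high-dimensional feature space and performs linear regression there; it is a nonlinear modeling technique. This study used ε-support vector machine regression with a radial basis function (RBF) kernel. ε-support vector machine regression normally has two parameters to optimize, the penalty parameter C and the RBF kernel parameter γ; here γ was fixed so that C could be optimized together with the EPO dimension g [18].
The coefficient of determination (R²), the root mean squared error (RMSE), and the ratio of prediction to deviation (RPD) were used to evaluate model accuracy. The closer R² is to 1, the more stable the model and the better the fit; the closer RMSE is to 0, the better the predictive ability. A model with RPD < 1.4 is unusable; with 1.4 ≤ RPD < 2 it is fair and suitable only for rough estimation; with RPD ≥ 2 it has good predictive ability [19].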
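The three indicators and the RPD classes can be written out as below. The formulas are the commonly used ones and are an assumption here, since the text does not state them: R² as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the observed values divided by the RMSE. The function names and the example numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def rmse(observed, predicted):
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def r_squared(observed, predicted):
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    return stdev(observed) / rmse(observed, predicted)


def rpd_class(value):
    """Classes as given in the text: <1.4 unusable, 1.4 to <2 fair, >=2 good."""
    if value < 1.4:
        return "unusable"
    if value < 2.0:
        return "fair"
    return "good"


observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
print(r_squared(observed, predicted), rmse(observed, predicted), rpd(observed, predicted))
```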
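Step 1: A partial least squares regression model was built on S0 (spectra of the 122 S0 samples as independent variables, soil organic matter as the dependent variable), and the 315 spectra of the 35 S2 samples were fed to this model to predict soil organic matter.
Step 2: The EPO transformation matrix P was constructed from S1.
Step 3: P was applied to the spectra of every sample in S0 and S2, giving the transformed set S0* from S0 and the transformed set S2* from S2.
Step 4: EPO-partial least squares regression and EPO-support vector machine regression models were built on S0* (spectra of the 122 S0* samples as independent variables, soil organic matter as the dependent variable), and the 315 spectra of the 35 S2* samples were fed to each model to predict soil organic matter.
Step 5: The prediction accuracy of the partial least squares regression and EPO-partial least squares regression models was compared to test how well the effect of soil moisture content had been removed, and the accuracy of the EPO-partial least squares regression and EPO-support vector machine regression models was compared to select the best estimation model.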
Descriptive statistics of soil organic matter in the three sample sets (S0, S1, and S2) are given in Table 1; no soil sample appears in more than one set. Compared with S1 and S2, S0 has a larger standard deviation and coefficient of variation, which makes it suitable for building the inversion model and makes the resulting model more representative.
Table 1 Descriptive statistics of soil organic matter in the three subsets (S0, S1, and S2)
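The mean reflectance of S1 and S2 (95 samples in total; the same below) at the nine moisture levels is shown in Fig. 1a. Across the whole spectral range (400 to 2 400 nm), reflectance decreases overall as soil moisture content increases, mainly because soil water increases the refraction of light between soil particles. When soil moisture content is < 16%, changes in moisture have a relatively large effect on reflectance; when it is ≥ 16%, the effect weakens and the rate of decrease slows, and reflectance in the 400 to 800 nm range in particular almost overlaps [20-21].
The difference in mean reflectance between the wet samples at each moisture level and the dry samples of S1 and S2 is shown in Fig. 1b. The absorption valley near 1 940 nm is the largest in the whole range, and soil moisture content affects near-infrared reflectance much more strongly than visible reflectance.
The continuum-removed mean reflectance of the wet and dry samples of S1 and S2 at each moisture level is shown in Fig. 1c. Distinct absorption valleys occur near 1 450, 1 940, and 2 200 nm. Near 1 450 nm the absorption band shifts to the right (toward longer wavelengths) as soil moisture content increases, whereas this shift is not evident near 1 940 nm.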
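Continuum removal divides a spectrum by its upper convex hull so that absorption features are expressed relative to a common baseline. A minimal sketch follows; it assumes ascending wavelengths and positive reflectance, and the function name and the synthetic spectrum are hypothetical rather than taken from the study.

```python
import numpy as np


def continuum_removed(wavelengths, reflectance):
    """Divide a spectrum by its upper convex hull (the continuum)."""
    x = np.asarray(wavelengths, dtype=float)
    y = np.asarray(reflectance, dtype=float)
    hull = []                                    # indices of hull vertices, left to right
    for i in range(x.size):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
            if cross >= 0:                       # b lies on or below the line a -> i
                hull.pop()
            else:
                break
        hull.append(i)
    continuum = np.interp(x, x[hull], y[hull])   # straight segments between hull vertices
    return y / continuum


# Synthetic spectrum: sloping baseline with one absorption feature at 1 940 nm.
wl = np.linspace(400.0, 2400.0, 201)
refl = 0.3 + 1e-4 * (wl - 400.0) - 0.1 * np.exp(-((wl - 1940.0) / 50.0) ** 2)
cr = continuum_removed(wl, refl)
```

Values of `cr` equal 1 on the continuum and fall below 1 inside absorption features, so band depth can be read as `1 - cr`.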
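Two-dimensional correlation analysis was performed on the continuum-removed spectra (Fig. 1d; for ease of comparison, values near 0 on the colorbar are shown in white). Autocorrelation peaks appear near 1 450 nm and 1 940 nm, and the peak near 1 940 nm is the stronger of the two. The H2O band near 1 940 nm, dominated by interlayer water, is sensitive to the moisture perturbation, whereas the band near 1 450 nm, dominated by hydroxyl (-OH), is comparatively insensitive [22].
Previous studies have shown that bands near 1 450, 1 940, and 2 200 nm respond to organic matter [23-24]. However, the analysis above shows that soil moisture content has a marked effect on reflectance; in particular, the strong water absorption valley near 1 940 nm partly masks the absorption valley near 2 200 nm and may therefore affect the prediction of soil organic matter. A dedicated moisture-removal algorithm is thus needed to correct wet-soil spectra and reduce the influence of soil moisture content on organic matter estimation.
The projection matrix P was examined for dimensions g = 2, 4, 6, 8, and 10 (Fig. 2). For g = 2 and 4, P varies evenly and smoothly; for g = 6, 8, and 10, P is irregular, with many spikes and obvious noise. This is mainly because a larger EPO dimension retains more of the eigenvectors produced by decomposing $D^{\mathrm{T}}D$, so the subspace matrix Q brings in spectral noise and the projection matrix P (P = I − Q) in turn carries more noise. The EPO dimension g therefore has an important effect on the accuracy of models built on the transformed spectra, and choosing an appropriate g is essential.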
Fig. 1 Four spectral analysis methods
Fig. 2 Visualization of the EPO projection matrix P with two (a), four (b), six (c), eight (d), and ten (e) EPO dimensions
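The EPO-partial least squares regression model has two parameters to optimize: the EPO dimension g and the number of partial least squares factors k. The combination giving the smallest cross-validated RMSE (RMSEcv) was taken as optimal. g was varied from 1 to 10 and k from 1 to 20, and the partial least squares regression estimation model for soil organic matter was built on S1. For k ≥ 14, RMSEcv changes only slowly, and the smallest RMSEcv of 1.48 is obtained at g = 4 (Fig. 3a). The combination g = 4, k = 14 was therefore used in subsequent modeling.
Likewise, the EPO-support vector machine regression model requires optimization of the EPO dimension g and the penalty parameter C. g was varied from 1 to 10, the RBF kernel parameter γ was fixed, and C was varied from 8 to 64 (in steps of 8); the (g, C) combination with the smallest RMSEcv was taken as optimal. The smallest RMSEcv of 1.17 is obtained at g = 2, C = 64 (Fig. 3b), so this combination was used in subsequent modeling.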
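Selecting a parameter pair by minimum RMSEcv amounts to an exhaustive search over a two-dimensional grid. The sketch below shows only the search itself; `best_combination` and `toy_rmse_cv` are hypothetical, and the toy error surface is an arbitrary stand-in for an actual cross-validation run, not a reproduction of Fig. 3.

```python
from itertools import product


def best_combination(first_grid, second_grid, rmse_cv):
    """Return (rmse, first, second) for the grid point with the lowest RMSEcv."""
    return min((rmse_cv(a, b), a, b) for a, b in product(first_grid, second_grid))


def toy_rmse_cv(g, c):
    """Arbitrary stand-in for a cross-validated error; replace with a real model fit."""
    return (g - 3) ** 2 + (c - 40) ** 2 / 100


g_grid = range(1, 11)          # EPO dimension, 1 to 10
c_grid = range(8, 65, 8)       # penalty parameter, 8 to 64 in steps of 8
rmse_min, g_best, c_best = best_combination(g_grid, c_grid, toy_rmse_cv)
print(rmse_min, g_best, c_best)
```

In a real run `rmse_cv` would build P for the given g, transform the calibration spectra, and return the cross-validated RMSE of the regression model.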
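The S2 spectra before and after EPO correction (g = 4 for the EPO-partial least squares regression model, g = 2 for the EPO-support vector machine regression model) were mean-centered, and the scores of each sample on the first two principal components were computed (Fig. 4). Before correction (Fig. 4a), the sample points are widely scattered overall: points at the same moisture level cluster tightly, wet sample points are well separated from dry sample points, and wet sample points at different moisture levels overlap very little, reflecting clear differences among moisture levels. This shows that changes in soil moisture content markedly affect the spectra. After correction (Fig. 4b, Fig. 4c), the wet sample points occupy essentially the same positions in score space as the dry sample points, indicating that the spectral differences caused by soil moisture content have largely been removed.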
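Scores on the first two principal components of mean-centered spectra can be computed as below. The study used the NIPALS algorithm; this sketch swaps in a singular value decomposition, which yields the same components up to sign. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def pca_scores(spectra, n_components=2):
    """Scores of mean-centred spectra on the leading principal components."""
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)                          # mean-centre each wavelength
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]    # scores = U scaled by singular values


# Synthetic example: two groups of spectra offset along one spectral direction.
rng = np.random.default_rng(1)
base = rng.normal(scale=0.01, size=(10, 30))
shift = np.linspace(0.0, 1.0, 30)
spectra = np.vstack([base[:5], base[5:] + shift])
scores = pca_scores(spectra)
```

Plotting `scores[:, 0]` against `scores[:, 1]` for spectra before and after multiplication by P gives score plots of the kind described above.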
Fig. 3 RMSEcv values produced by different parameter combinations in the soil organic matter prediction models built on the EPO training subset S1: EPO-partial least squares regression (a) and EPO-support vector machine regression (b)
Fig. 4 Scores on the first two principal components of the spectra of validation subset S2 before (a) and after EPO correction (EPO-partial least squares regression (b), EPO-support vector machine regression (c))
Before EPO correction (Table 2), the partial least squares regression model has a calibration R² (R²c) of 0.75, while the validation R² (R²p) and RPD for S2 are 0.28 and 1.16; the model is inaccurate and cannot estimate soil organic matter effectively across moisture levels (Fig. 5). After EPO correction (Table 2), the EPO-partial least squares regression model has an R²c of 0.80, and R²p and RPD for S2* are 0.68 and 1.76; accuracy is fair, allowing rough estimation of soil organic matter across moisture levels (Fig. 5). Compared with the uncorrected model, the validation R²p for S2* increased by 0.40, RMSEp decreased by 0.88, and RPD rose from 1.16 to 1.76, moving the model from the unusable class to the usable class. EPO is therefore a feasible way to remove the effect of soil moisture content and improve the accuracy of soil organic matter estimation.
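The EPO-support vector machine regression model has an R²p of 0.78 and an RPD of 2.15 (Table 2), so it can accurately estimate soil organic matter across moisture levels. The 95% prediction confidence interval computed at the 5% significance level (Fig. 5) indicates that the model is accurate, has little bias, and has strong explanatory power. Compared with the EPO-partial least squares regression model, R²p and RPD for S2* increased by 0.10 and 0.39, moving the model from the usable class to the good class. Combining EPO with the nonlinear support vector machine regression method can thus handle the nonlinear factors across moisture levels and improve the accuracy of organic matter estimation.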
Table 2 Soil organic matter predictions for S2 from the dry-soil model (calibrated on S0) before and after EPO correction
Fig. 5 Scatter plots of predicted soil organic matter for the validation subset S2 before (a) and after EPO correction (EPO-partial least squares regression (b), EPO-support vector machine regression (c))
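This study used fluvo-aquic soil as the test material and set soil moisture levels artificially in the laboratory to simulate the influence of soil moisture content in the field, obtaining soil spectra at nine moisture levels (0 to 32%). Nocita et al. [25] designed a moisture range of 0 to 25% (in 5% steps) and showed by principal component analysis that spectra become less sensitive to changes in soil moisture content above 15%. This agrees well with the present results (obtained with 4% steps): increasing soil moisture content lowers reflectance overall; below 16% the effect of moisture changes on reflectance is relatively large, whereas at or above 16% the effect weakens and the rate of decrease slows, and reflectance gradually stabilizes as soil moisture content is raised further to 32% (below the maximum field water-holding capacity). The 0 to 32% range used here therefore provides a basis for revealing how the spectra of soil samples respond under different moisture conditions.
Comparing the principal component score plots of spectra at different moisture levels before and after EPO correction shows that EPO effectively removes the influence of soil moisture content on soil spectra, consistent with Ge et al. [8], Ackerson et al. [10], and Liu et al. [11]. The EPO dimension g, however, strongly affects the transformed spectra (Fig. 2), so an appropriate g should be chosen in practice. Moreover, because the moisture-removal transformation matrix P has already been obtained in the laboratory, the soil moisture content need not be known in the field (provided it is below the maximum field water-holding capacity), and application is simple: the wet-soil spectra only need to be multiplied by P.
Morgan et al. [26] predicted soil organic carbon directly from field-moist spectra and obtained an RPD of 1.45, lower than that of models based on spectra of dried, ground, and sieved soil measured in the laboratory (RPD = 1.71), showing that in situ field spectra give lower prediction accuracy. Stenberg [27] showed that rewetting samples can improve soil organic carbon models, raising the validation R² of Vis-NIR wet-soil models by up to 0.15, but the effect of soil moisture content on prediction is reduced only under a specific moisture state (calibration and validation samples must have similar soil moisture content). Therefore, provided that differences in other external disturbances (ambient light, gravel, plant residues) are minimized, a dedicated moisture-removal algorithm combined with nonlinear modeling is needed to further optimize model performance. In this study, compared with partial least squares regression, combining EPO with support vector machine regression improved model accuracy while removing the moisture effect, which is similar to the application of bagging-LS-SVM to in situ spectral monitoring by Li et al. [28].
In recent years, soil spectral libraries have been built and refined at global, continental, national, and regional scales, accumulating a large stock of soil data and forming a valuable body of prior knowledge. This provides the data basis for using prior information in existing soil spectral libraries to remove the effect of dynamically changing field moisture on the retrieval of soil properties [29-30]. When a spectral library is used to predict from field-moist spectra, correcting the wet-soil spectra is therefore essential. In this study, previously sampled dry soils served as the calibration set after EPO correction, and soil organic matter was then predicted at different moisture levels; the EPO-support vector machine regression model achieved an RPD of 2.15, indicating stable and reliable accuracy. This approach offers a theoretical basis and technical reference for future field monitoring of soil properties.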
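Soil spectral reflectance decreases nonlinearly as soil moisture content increases; soil hyperspectra are strongly affected by soil moisture content, which masks the absorption features of soil organic matter. For EPO dimensions g = 2 and 4, the projection matrix P varies evenly, whereas for g = 6, 8, and 10 it is irregular and noisy, so selecting the best g is essential for the accuracy of models built on the transformed spectra. After EPO correction, the principal component scores of the wet-soil spectra fall in essentially the same region as those of the dry soils, the dry and wet spectra are highly similar, and EPO effectively removes the influence of soil moisture content on soil spectra. The EPO-support vector machine regression model, with a validation RPD of 2.15, better handles the nonlinear relationship between soil spectra and soil organic matter and achieves effective estimation of soil organic matter.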
[1] Fang S W, Yang M H, Zhao X M, et al. Spectral characteristics and quantitative estimation of SOM in red soil typical of Ji'an County, Jiangxi Province (In Chinese). Acta Pedologica Sinica, 2014, 51(5): 1003–1010
[2] Xu S X, Shi X Z, Wang M Y, et al. Effects of subsetting by parent materials on prediction of soil organic matter content in a hilly area using Vis-NIR spectroscopy. PLoS ONE, 2016, 11(3): e0151536
[3] Shi Z, Ji W J, Viscarra Rossel R A, et al. Prediction of soil organic matter using a spatially constrained local partial least squares regression and the Chinese vis-NIR spectral library. European Journal of Soil Science, 2015, 66(4): 679–687
[4] Ji W J, Li S, Chen S C, et al. Prediction of soil attributes using the Chinese soil spectral library and standardized spectra recorded at field conditions. Soil & Tillage Research, 2016, 155(SI): 492–500
[5] Ji W J, Viscarra Rossel R A, Shi Z. Improved estimates of organic carbon using proximally sensed vis-NIR spectra corrected by piecewise direct standardization. European Journal of Soil Science, 2015, 66(4): 670–678
[6] Roger J M, Chauchard F, Bellon-Maurel V. EPO-PLS external parameter orthogonalisation of PLS application to temperature-independent measurement of sugar content of intact fruits. Chemometrics and Intelligent Laboratory Systems, 2003, 66(2): 191–204
[7] Minasny B, McBratney A B, Bellon-Maurel V, et al. Removing the effect of soil moisture from NIR diffuse reflectance spectra for the prediction of soil organic carbon. Geoderma, 2011, 167/168: 118–124
[8] Ge Y F, Morgan C L S, Ackerson J P. VisNIR spectra of dried ground soils predict properties of soils scanned moist and intact. Geoderma, 2014, 221: 61–69
[9] Ji W J, Viscarra Rossel R A, Shi Z. Accounting for the effects of water and the environment on proximally sensed vis-NIR soil spectra and their calibrations. European Journal of Soil Science, 2015, 66(3): 555–565
[10] Ackerson J P, Demattê J A M, Morgan C L S. Predicting clay content on field-moist intact tropical soils using a dried, ground VisNIR library with external parameter orthogonalization. Geoderma, 2015, 259: 196–204
[11] Liu Y, Pan X Z, Wang C K, et al. Predicting soil salinity with Vis-NIR spectra after removing the effects of soil moisture using external parameter orthogonalization. PLoS ONE, 2015, 10(10): e0140688
[12] Zhang G L, Gong Z T. Soil survey laboratory methods (In Chinese). Beijing: Science Press, 2012: 1–54
[13] Hong Y S, Yu L, Geng L, et al. Using direct standardization algorithm to eliminate the effect of laboratory geometric parameters on soil hyperspectral data fluctuate characteristic (In Chinese). Journal of Central China Normal University (Natural Sciences), 2016, 50(2): 303–308
[14] Savitzky A, Golay M J E. Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, 1964, 36(8): 1627–1639
[15] Noda I. Generalized two-dimensional correlation method applicable to infrared, Raman, and other types of spectroscopy. Applied Spectroscopy, 1993, 47(9): 1329–1336
[16] Chu X L. Molecular spectroscopy analytical technology combined with chemometrics and its applications (In Chinese). Beijing: Chemical Industry Press, 2011
[17] Wold S, Sjöström M, Eriksson L. PLS-regression: A basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems, 2001, 58(1): 109–130
[18] Gao Y, Cui L J, Lei B, et al. Estimating soil organic carbon content with Visible-Near-Infrared (Vis-NIR) Spectroscopy. Applied Spectroscopy, 2014, 68(7): 712–722
[19] Yu L, Hong Y S, Zhou Y, et al. Wavelength variable selection methods for estimation of soil organic matter content using hyperspectral technique (In Chinese). Transactions of the CSAE, 2016, 32(13): 95–102
[20] Lobell D B, Asner G P. Moisture effects on soil reflectance. Soil Science Society of America Journal, 2002, 66(3): 722–727
[21] Liu Y, Ding X, Liu H J, et al. Quantitative analysis of reflectance spectrum of black soil as affected by soil moisture for prediction of soil moisture in black soil (In Chinese). Acta Pedologica Sinica, 2014, 51(5): 1021–1026
[22] Song H Y, Cheng X. Analysis of the effect of moisture on soil spectra detection by using two-dimensional correlation near infrared spectroscopy (In Chinese). Spectroscopy and Spectral Analysis, 2014, 34(5): 1240–1243
[23] Viscarra Rossel R A, Behrens T. Using data mining to model and interpret soil diffuse reflectance spectra. Geoderma, 2010, 158(1/2): 46–54
[24] Shi Z, Wang Q L, Peng J, et al. Development of a national VNIR soil-spectral library for soil classification and prediction of organic matter concentrations (In Chinese). Science China: Earth Sciences, 2014, 44(5): 978–988
[25] Nocita M, Stevens A, Noon C, et al. Prediction of soil organic carbon for different levels of soil moisture using Vis-NIR spectroscopy. Geoderma, 2013, 199(SI): 37–42
[26] Morgan C L S, Waiser T H, Brown D J, et al. Simulated in situ characterization of soil organic and inorganic carbon with visible near-infrared diffuse reflectance spectroscopy. Geoderma, 2009, 151(3/4): 249–256
[27] Stenberg B. Effects of soil sample pretreatments and standardized rewetting as interacted with sand classes on VIS-NIR predictions of clay and soil organic carbon. Geoderma, 2010, 158(1/2): 15–22
[28] Li S, Shi Z, Chen S C, et al. In situ measurements of organic carbon in soil profiles using vis-NIR spectroscopy on the Qinghai-Tibet Plateau. Environmental Science & Technology, 2015, 49(8): 4980–4987
[29] Viscarra Rossel R A, Behrens T, Ben-Dor E, et al. A global spectral library to characterize the world's soil. Earth-Science Reviews, 2016, 155: 198–230
[30] Chen S C, Feng L L, Li S, et al. Vis-NIR spectral inversion for prediction of soil total nitrogen content in laboratory based on locally weighted regression (In Chinese). Acta Pedologica Sinica, 2015, 52(2): 312–320
Removing the Effect of Soil Moisture on Prediction of Soil Organic Matter with Hyperspectral Reflectance Using External Parameter Orthogonalization
HONG Yongsheng1,2 YU Lei1,2† ZHU Yaxing1,2 WU Hongxia1,2 NIE Yan1,2 ZHOU Yong1,2 QI Feng3 XIA Tian1,2
(1 Key Laboratory for Geographical Process Analysis & Simulation, Hubei Province, Central China Normal University, Wuhan 430079, China)
(2 College of Urban & Environmental Science, Central China Normal University, Wuhan 430079, China)
(3 School of Environmental and Sustainability Sciences, Kean University, NJ 07083, USA)
【Objective】Soil organic matter is an important index of soil properties because it is vital to crop growth and soil quality. Hyperspectral analysis is a rapid, convenient, low-cost alternative method and shows increasing potential for estimating soil organic matter. However, when hyperspectral reflectance is measured in the field, several external environmental factors, including soil moisture content, temperature, and the condition of the soil surface, may affect soil spectra. Soil moisture content in particular, a major limitation of field hyperspectral surveys, can mask the absorption features of soil organic matter and hence sharply lower the accuracy of soil organic matter prediction. It is therefore essential to find a method capable of removing the effect of soil moisture content on spectral reflectance, so as to improve the accuracy of quantitative prediction of soil organic matter. In this paper, the external parameter orthogonalization (EPO) algorithm was introduced for that purpose.【Method】A total of 217 soil samples were collected from the 0 to 20 cm soil layer in the Jianghan Plain. In the laboratory, the samples were air-dried and ground to pass a 2 mm sieve, and soil organic matter content was determined by the potassium dichromate external heating method. The 217 samples were divided into three non-overlapping subsets: a model calibration subset (S0) of 122 samples, used to develop the multivariate model for soil organic matter; an EPO development subset (S1) of 60 samples; and a validation subset (S2) of 35 samples for independent EPO validation.
The samples in S1 and S2 were then rewetted as follows: for each soil sample, 150 g of oven-dried soil was weighed out, placed in a black cylindrical box, and rewetted in steps of 4% soil moisture content, giving nine moisture levels in total (0, 4%, 8%, 12%, 16%, 20%, 24%, 28%, and 32%). A spectrometer was used to acquire hyperspectral reflectance of the samples in the three subsets (S0, S1, and S2, including the rewetted samples) over 350 to 2500 nm. The influence of soil moisture content on the soil spectra was then analyzed, and the scores on the first two principal components were compared to assess how well the EPO algorithm removes the effect of soil moisture content on the spectral reflectance of the wet samples. Finally, models were built on the S0 subset using partial least squares regression and support vector machine regression, and the wet samples of the S2 subset were used as an external validation set before and after EPO correction. The coefficient of determination (R²), root mean squared error (RMSE), and ratio of prediction to deviation (RPD) between predicted and measured soil organic matter were used to compare the performance of the three models, namely partial least squares regression (before EPO correction), EPO-partial least squares regression, and EPO-support vector machine regression; high R² and RPD and low RMSE indicate a better model.
【Result】The results show that (1) soil moisture content has an obvious influence on spectral reflectance, and reflectance decreases across the entire wavelength domain with increasing soil moisture content, making it more difficult to identify useful spectral features of soil organic matter; (2) for subset S2 before EPO correction, the spectra of the wet and dry samples do not overlap, and the spectra of the wet samples cluster in regions separate from those of the dry samples. After EPO correction of subset S2, however, the spectra of the wet samples occupy almost the same positions in the eigenspace as those of the dry samples, demonstrating that the two groups of spectra are highly similar; (3) before EPO correction, the partial least squares regression model has the poorest prediction accuracy (validation RPD = 1.16). EPO correction improves the prediction accuracy of the model to an acceptable level (validation RPD = 1.76). The EPO-support vector machine regression model performs better than the other two, with a validation R² of 0.78 and an RPD of 2.15, which indicates that the effects of soil moisture content on the spectra are successfully removed.【Conclusion】This approach can facilitate rapid measurement of soil organic matter in the study area.
Soil spectra;Soil organic matter;Moisture content;External parameter orthogonalization;Support vector machine regression;Jianghan Plain
CLC number: S127; TP79
Document code: A
DOI: 10.11766/trxb201612040396
* Supported by the National Natural Science Foundation of China (No. 41401232), the Fundamental Research Funds for the Central Universities (No. CCNU15A05006), the General Project of Natural Science Foundation of Hubei Province (No. 2016CFB558), and the Education Innovation Projects for Graduates of Central China Normal University (No. 2016CXZZ15)
† Corresponding author, E-mail: yulei@mail.ccnu.edu.cn
HONG Yongsheng (born 1990), male, from Lu'an, Anhui Province, master's student, mainly engaged in soil hyperspectral research. E-mail: ccnuhys@sina.com
Received: 2016-12-04; revised: 2017-04-24; online-first publication date (www.cnki.net): 2017-05-25
(Responsible editor: TAN Manzhi)