Wavelet transformation coupled with CARS algorithm improving prediction accuracy of soil moisture content based on hyperspectral reflectance
Cai Lianghong, Ding Jianli※
(1. College of Resources & Environmental Science, Xinjiang University, Urumqi 830046, China; 2. Key Laboratory of Oasis Ecology, Ministry of Education, Xinjiang University, Urumqi 830046, China)
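To enable rapid monitoring of soil moisture content (SMC) in arid regions, this study took the Weigan-Kuqa River oasis as the target area. The wavelet transform (WT) was used to decompose the reflectance spectra into 1 to 8 levels, and correlation analysis was used to determine the maximum decomposition level. Competitive adaptive reweighted sampling (CARS) was then applied to filter out redundant variables and select the wavelength variables most strongly correlated with SMC. The wavelength variables selected from the characteristic spectrum of each level were pooled to form the optimal variable set, and partial least squares regression (PLSR) was used to build and analyze SMC prediction models. The results showed that: 1) during wavelet decomposition, the correlation between soil reflectance and SMC strengthened steadily and peaked at the sixth decomposition level (L6), so the maximum decomposition level was set to 6; 2) the optimal variable set selected by the coupled WT-CARS algorithm comprised 131 wavelength variables in the 400-500, 1 320-1 461, 1 851-1 961 and 2 125-2 268 nm regions; 3) compared with the full-spectrum model, the models based on the CARS-selected variables of each level's characteristic spectrum were all more accurate, and the model based on the optimal variable set was the most accurate, with a calibration root mean square error of 0.021, a calibration coefficient of determination of 0.721, a prediction root mean square error of 0.028, a prediction coefficient of determination of 0.924 and a residual predictive deviation of 2.607. This indicates that the coupled WT-CARS algorithm loses as little spectral detail as possible and removes noise fairly thoroughly during modeling, while also effectively eliminating uninformative variables, which offers a new approach to SMC prediction in the study area.
soil; moisture content; spectrum analysis; wavelet transform; competitive adaptive reweighted sampling algorithm; variable selection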
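Soil moisture content (SMC) is the carrier of material and energy cycling in the soil system and strongly influences soil properties, vegetation growth and distribution, and regional ecosystems[1-2]. However, soil moisture is highly sensitive to environmental conditions, and conventional SMC monitoring methods are time-consuming and labor-intensive, which makes real-time field observation difficult and fails to meet the soil moisture monitoring needs of precision agriculture[3]. An efficient and accurate method for monitoring SMC is therefore needed. In recent years, hyperspectral remote sensing has attracted attention in SMC monitoring research because it covers large areas, requires no contact and is timely[4]. However, raw soil spectra acquired by hyperspectral techniques contain obvious spectral noise and severe scattering effects, and inevitably include noise unrelated to SMC[5], which makes SMC-related spectral information harder to detect. Removing noise fairly thoroughly while losing as little spectral detail as possible has therefore become a key step in spectral modeling.
Well-established spectral denoising methods include Savitzky-Golay filtering, median filtering and moving averages, but for white noise, and especially for random and low-frequency signals, these methods struggle to remove noise without also affecting the useful signal[6]. As a newer denoising technique, the wavelet transform has been applied successfully to hyperspectral data processing[7], and as wavelet-based algorithms have matured it has increasingly been used to estimate soil properties[8]. Liao Qinhong et al.[9] applied wavelet analysis to the hyperspectral curves of 64 soil samples from the Shunyi district of Beijing and obtained an inversion accuracy for organic matter of up to 75%; Zhang Rui et al.[6] showed that decomposition and reconstruction at the sixth level described soil organic matter characteristics more accurately; Zheng et al.[10] built characteristic spectra for several soil properties on the basis of decomposition and reconstruction at the eighth level. These studies all demonstrate the strength of the wavelet transform in focusing on spectral detail through frequency-domain analysis, but because hyperspectral data are voluminous, a large amount of redundant information and noise remains after spectral reconstruction. Variable selection is therefore key to model building in the quantitative analysis of hyperspectral data. Commonly used variable selection methods include genetic algorithms (GA), the successive projections algorithm (SPA), Monte Carlo uninformative variable elimination (MC-UVE) and competitive adaptive reweighted sampling (CARS)[11]. Yu Lei et al.[12] obtained good results when using CARS to predict SMC; Wen Zhencai et al.[13] and Yu Shuang et al.[14] compared the four methods above and found that models based on CARS performed best. This suggests that the advantage of CARS lies in selecting variables from high-dimensional data, which overcomes the combinatorial explosion problem and improves the predictive ability of the model[15-16].
At present, the wavelet transform is used mostly in studies of soil organic matter[17], and its application to SMC estimation remains to be explored. This study used soil samples from the Weigan-Kuqa River oasis, decomposed the reflectance spectra with the wavelet transform and combined this with the CARS algorithm to construct an optimal variable subset for SMC. The aim was to lose as little spectral detail as possible, remove noise fairly thoroughly and effectively eliminate uninformative variables, and finally to build a PLSR prediction model based on the optimal variable set, providing scientific support and a reference for soil moisture research and for local precision agriculture.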
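1.1 Soil samples
The study area was the Weigan-Kuqa River oasis (41°08′-41°55′N, 81°06′-83°37′E) in the north-central Tarim Basin, southern Xinjiang. Based on the characteristics of the study area, 39 sampling sites were laid out (Fig. 1), and their positions were recorded by GPS for later verification. At each site, soil was collected at a depth of 0-20 cm using the five-point composite method, and two samples were taken (one brought back in an aluminum box, the other in a plastic bag). In the laboratory, the aluminum-box samples were oven-dried (105 ℃ in a constant-temperature oven for 48 h) to obtain the corresponding soil moisture content; the other samples were air-dried indoors, ground and passed through a 2 mm sieve before hyperspectral measurement.
Fig.1 Field sample distribution
Spectra were collected in a dark room with a FieldSpec3 spectrometer made by ASD (Analytical Spectral Devices, USA). Its wavelength range is 350-2 500 nm, with a sampling interval of 1.4 nm over 350-1 000 nm and 2 nm over 1 000-2 500 nm, resampled to 1 nm. Soil that had passed the 2 mm sieve was packed into a black container (11 cm in diameter, 1.4 cm deep) until full, and its surface was scraped level. The light source in the dark room was a 50 W halogen lamp placed 50 cm from the sample at a zenith angle of ?5°, and the spectrometer probe was 10 cm from the sample. Before each measurement the instrument was calibrated with a diffuse reflectance standard reference panel. Ten spectral curves were collected for each soil sample, and their arithmetic mean was taken as the spectrum of that sample.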
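1.2 Wavelet transform
The wavelet transform retains the strengths of Fourier analysis while overcoming its inability to analyze the local spectral characteristics of a signal[18]. By scaling and translating a mother wavelet, it decomposes a signal into time-frequency components in different sub-bands, which allows specific frequency features of the original signal to be observed more clearly[19]; it has been called the microscope of time-frequency analysis.
The discrete wavelet transform is the inner product of a finite-length sequence and a discrete mother wavelet[20], expressed as:
$$W_f(j,k)=\sum_{n} f(n)\,\Psi_{j,k}^{*}(n)$$
where $W_f(j,k)$ is the result of the wavelet analysis, $f(n)$ is the signal sequence, $\Psi_{j,k}(n)$ is the mother wavelet and $\Psi_{j,k}^{*}(n)$ is the conjugate of $\Psi_{j,k}(n)$. Unlike the Fourier transform, it yields the behavior of the different sub-bands of the signal in the spatial domain[21]. Under this theory, each sub-band of the wavelet decomposition can be interpreted as the absorption features of the original spectrum at a particular frequency, while the corresponding high-frequency signal is removed by the wavelet filter.
Following the conclusions of Yu Lei et al.[12], this study chose the db4 mother wavelet, applied 1- to 8-level wavelet transforms to the original spectra and constructed the characteristic spectrum of each level, denoted L1 to L8.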
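The decomposition and level-wise reconstruction described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the MATLAB procedure used in the study: it assumes periodic boundary extension, a signal length divisible by 2 to the power of the level, and that the "characteristic spectrum" of a level is the full-length reconstruction of that level's approximation coefficients with all detail coefficients set to zero. The function names (`dwt_step`, `idwt_step`, `level_spectrum`) are hypothetical, and the db4 filter values are the commonly tabulated ones.

```python
import math

# db4 low-pass (scaling) filter, 8 taps; commonly tabulated values (assumption).
DB4_LO = [
    0.23037781330885523, 0.7148465705525415, 0.6308807679295904,
    -0.02798376941698385, -0.18703481171888114, 0.030841381835986965,
    0.032883011666982945, -0.010597401784997278,
]
# Quadrature-mirror high-pass filter: g[m] = (-1)^m * h[L-1-m].
DB4_HI = [(-1) ** m * DB4_LO[len(DB4_LO) - 1 - m] for m in range(len(DB4_LO))]


def dwt_step(x):
    """One periodic analysis step; returns (approximation, detail)."""
    n = len(x)
    if n % 2:
        raise ValueError("periodic DWT needs an even-length signal")
    approx, detail = [], []
    for k in range(n // 2):
        approx.append(sum(h * x[(2 * k + m) % n] for m, h in enumerate(DB4_LO)))
        detail.append(sum(g * x[(2 * k + m) % n] for m, g in enumerate(DB4_HI)))
    return approx, detail


def idwt_step(approx, detail):
    """Inverse of dwt_step (transpose of the orthogonal analysis operator)."""
    n = 2 * len(approx)
    x = [0.0] * n
    for k in range(len(approx)):
        for m in range(len(DB4_LO)):
            x[(2 * k + m) % n] += DB4_LO[m] * approx[k] + DB4_HI[m] * detail[k]
    return x


def level_spectrum(x, level):
    """Full-length reconstruction of the level-`level` approximation
    (all detail coefficients are discarded, i.e. set to zero)."""
    approx = list(x)
    for _ in range(level):
        approx, _unused_detail = dwt_step(approx)
    for _ in range(level):
        approx = idwt_step(approx, [0.0] * len(approx))
    return approx


# Synthetic "spectrum": a smooth curve plus the highest-frequency noise.
spectrum = [math.sin(2 * math.pi * i / 64) + 0.1 * (-1) ** i for i in range(256)]
levels = {f"L{i}": level_spectrum(spectrum, i) for i in range(1, 7)}
```

Because a measured spectrum of 2 151 bands has an odd length, a real implementation would need padding or a different boundary mode; that choice is outside what the text specifies.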
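1.3 Competitive adaptive reweighted sampling
CARS imitates the "survival of the fittest" principle of Darwinian evolution. Using adaptive reweighted sampling (ARS) and an exponentially decreasing function (EDF), it selects the wavelength variables with large absolute regression coefficients in the PLSR model, and through 10-fold cross-validation it identifies the variable subset with the lowest root mean square error of cross-validation (RMSECV) as the optimal subset. It can select the wavelength variables that are most sensitive to a soil property and avoids the combinatorial explosion problem in variable selection, which makes it well suited to high-dimensional data[22].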
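The two shrinking mechanisms named above, EDF and ARS, can be sketched as follows. This is a simplified illustration of a single CARS sampling run, assuming the commonly cited form of the EDF, r_i = a·exp(-k·i), with a and k chosen so that all variables are kept in the first run and two in the last. The coefficient vector is a hypothetical stand-in for the absolute PLSR regression coefficients, and the function names are illustrative only.

```python
import math
import random


def edf_ratio(run, n_runs, n_vars):
    """Fraction of variables kept at sampling run `run` (1-based).

    r_i = a * exp(-k * i), with r_1 = 1 and r_N = 2 / p.
    """
    k = math.log(n_vars / 2.0) / (n_runs - 1)
    a = (n_vars / 2.0) ** (1.0 / (n_runs - 1))
    return a * math.exp(-k * run)


def cars_shrink(abs_coefs, run, n_runs, rng):
    """One simplified CARS shrinking step on a vector of |coefficients|."""
    p = len(abs_coefs)
    n_keep = max(2, round(p * edf_ratio(run, n_runs, p)))
    # EDF step: forced removal of the variables with the smallest |coefficient|.
    ranked = sorted(range(p), key=lambda j: abs_coefs[j], reverse=True)[:n_keep]
    # ARS step: weighted draws with replacement among the survivors;
    # variables that are never drawn are dropped.
    drawn = rng.choices(ranked, weights=[abs_coefs[j] for j in ranked], k=n_keep)
    return sorted(set(drawn))


rng = random.Random(0)
coefs = [abs(math.sin(j)) + 0.01 for j in range(200)]  # hypothetical |b_j|
kept = cars_shrink(coefs, run=10, n_runs=50, rng=rng)
```

A full CARS run would refit a PLSR model on the surviving variables after each step and record its RMSECV; that loop is omitted here to keep the sketch short.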
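1.4 PLSR model building and validation
PLSR combines the advantages of principal component analysis, canonical correlation analysis and ordinary multiple linear regression. It overcomes multicollinearity among the predictors and the problem of having fewer samples than wavelength variables, giving more stable models and aiding multivariate statistical analysis[23-24].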
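For readers unfamiliar with PLSR, a single-response PLS regression (PLS1) can be sketched with the NIPALS algorithm in plain Python. This is a generic textbook formulation offered as an illustration, not the software used in the study; the function names and the toy data are hypothetical.

```python
def pls1_fit(X, y, n_components):
    """Fit PLS1 with NIPALS. X is a list of samples (rows), y a list of responses."""
    n, n_vars = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(n_vars)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(n_vars)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                     # centered y
    weights, loadings, q_coefs = [], [], []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the y residual.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(n_vars)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(n_vars)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(n_vars)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        for i in range(n):  # deflate X and y by the extracted component
            for j in range(n_vars):
                E[i][j] -= t[i] * load[j]
            f[i] -= q * t[i]
        weights.append(w)
        loadings.append(load)
        q_coefs.append(q)
    return {"x_mean": x_mean, "y_mean": y_mean,
            "w": weights, "p": loadings, "q": q_coefs}


def pls1_predict(model, x):
    """Predict one sample by repeating the deflation sequence used in fitting."""
    e = [v - m for v, m in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, load, q in zip(model["w"], model["p"], model["q"]):
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat


# Toy data: six samples, three predictors, exactly linear response.
X_toy = [[1.0, 2.0, 0.0], [2.0, 0.0, 1.0], [3.0, 1.0, 4.0],
         [4.0, 3.0, 2.0], [5.0, 5.0, 1.0], [6.0, 2.0, 3.0]]
y_toy = [1.0 + 2.0 * r[0] - r[1] + 0.5 * r[2] for r in X_toy]
model = pls1_fit(X_toy, y_toy, n_components=3)
```

Because each component is a single latent direction, the model stays estimable even when the number of wavelength variables exceeds the number of samples, which is the situation the paragraph above describes.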
Model accuracy was evaluated with the following parameters: the coefficient of determination of the calibration set ($R_c^2$), the coefficient of determination of the prediction set ($R_p^2$), the root mean square error of calibration (RMSEC), the root mean square error of prediction (RMSEP) and the residual predictive deviation (RPD). The larger $R^2$ is, the more accurate the model; RMSEC and RMSEP express the precision of the model, and smaller values indicate higher accuracy. In addition, a model with RPD≥2 predicts well, one with 1.4≤RPD<2 predicts moderately well, and one with RPD<1.4 has no predictive ability[25].
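The evaluation metrics can be computed as in the sketch below. The definitions are assumptions, since the text does not give formulas: $R^2$ is taken as 1 - SS_res/SS_tot (some studies report the squared correlation instead), and RPD as the sample standard deviation of the measured values divided by the RMSE. The example values are invented for illustration.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured))


def r_squared(measured, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rpd(measured, predicted):
    """Residual predictive deviation, assumed here to be SD(measured) / RMSE."""
    n = len(measured)
    mean = sum(measured) / n
    sd = math.sqrt(sum((m - mean) ** 2 for m in measured) / (n - 1))
    return sd / rmse(measured, predicted)


def rpd_class(value):
    """Interpretation thresholds as stated in the text."""
    if value >= 2.0:
        return "good"
    if value >= 1.4:
        return "moderate"
    return "no predictive ability"


measured = [0.10, 0.15, 0.20, 0.25]    # hypothetical SMC values
predicted = [0.12, 0.14, 0.21, 0.24]   # hypothetical model output
summary = (rmse(measured, predicted), r_squared(measured, predicted),
           rpd_class(rpd(measured, predicted)))
```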
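2.1 Soil moisture content of the samples
The descriptive statistics of SMC (Table 1) show that the mean SMC of the calibration and validation sets was 14.59% and 14.84%, respectively, while the mean SMC of all soil samples was 14.66%, with a coefficient of variation (CV) of 38.89%, which indicates moderate variability and lies between the values for the calibration and validation sets.
Table 1 Statistical characteristics of soil moisture content (SMC) of soil samples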
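2.2 Wavelet transform and maximum decomposition level
In this study, the original spectra were decomposed to 8 levels in MATLAB R2012a with the db4 mother wavelet, and the wavelet coefficients of each level were then reconstructed separately to obtain the characteristic spectrum of each level, denoted L1 to L8.
The results are shown in Fig. 2. In Fig. 2a (L0 is the original spectrum without wavelet transform), the soil shows pronounced water absorption peaks around 1 400 and 1 900 nm, and weaker ones around 450 and 2 200 nm. L1 is noisy because noise in the original reflectance carries over; this is most evident at 350-400 nm, where it appears as small spikes. As decomposition proceeds, high-frequency signals are progressively removed and the carried-over noise weakens, so that by L5 very little noise remains. Because spectral detail is removed at each step, the curves gradually become smoother and some absorption peaks that characterize soil moisture disappear; for example, L6 still shows pronounced absorption peaks at 1 400 and 1 900 nm, whereas they are barely visible in L7.
Running CARS requires a suitable wavelet decomposition level to be chosen first, and this was done from the correlation analysis between the characteristic spectrum of each level and SMC (Table 2). For the L1 characteristic spectrum, 393 bands were correlated with SMC at the 0.01 significance level (threshold ±0.408). As the decomposition level increased, the number of significant bands rose gradually and peaked at 602 at L6, while the largest positive correlation, 0.619, occurred at L4. With further decomposition, the number of significant bands for L7 and beyond fell rapidly, as did the maximum correlation. Overall, the L6 characteristic spectrum both removes noise and preserves as much of the original spectral information as possible. The maximum decomposition level was therefore set to 6, and further analysis was based on L1 to L6.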
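The band-counting step behind Table 2 can be sketched as follows. This is an illustration under stated assumptions: the correlation is the Pearson coefficient, and the ±0.408 threshold is reproduced from a two-tailed critical t value of about 2.715 for 37 degrees of freedom (39 samples), an approximate tabulated value supplied here rather than taken from the text. The tiny data set and function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 for a constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def critical_r(t_crit, n_samples):
    """Correlation threshold equivalent to a critical t value with n - 2 df."""
    return t_crit / math.sqrt(t_crit ** 2 + n_samples - 2)


def count_significant_bands(spectra, smc, threshold):
    """spectra: list of samples, each a list of per-band reflectance values."""
    count = 0
    for j in range(len(spectra[0])):
        band = [row[j] for row in spectra]
        if abs(pearson_r(band, smc)) >= threshold:
            count += 1
    return count


threshold = critical_r(2.715, 39)  # assumed t value; gives roughly 0.408

# Hypothetical data: 5 samples x 3 bands.
smc = [1.0, 2.0, 3.0, 4.0, 5.0]
spectra = [[0.9, 0.3, 0.2], [0.8, 0.1, 0.2], [0.7, 0.4, 0.2],
           [0.6, 0.1, 0.2], [0.5, 0.5, 0.2]]
n_strong = count_significant_bands(spectra, smc, 0.9)
```

Applying `count_significant_bands` to each of the characteristic spectra L1 to L8 in turn would give one count per level, which is the quantity compared when choosing the maximum decomposition level.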
Fig.2 Reconstruction spectra under 1-8 wavelet level
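2.3 CARS-selected variable subsets at different decomposition levels
In the CARS variable selection, the number of Monte Carlo sampling runs was set to 50. The RMSECV values of the successive sampling runs were compared, and the variables retained at the run with the lowest RMSECV were taken as the selected variable subset. For brevity, only the selection process for the L1 characteristic spectrum is analyzed here. Because of the exponentially decreasing function (EDF), the number of retained variables falls exponentially as the number of runs increases (Fig. 3a). Fig. 3b shows that, overall, RMSECV first decreases and then rises as sampling proceeds. Over runs 1 to 28, RMSECV gradually decreases, indicating that a large amount of information or noise in the L1 characteristic spectrum unrelated to SMC is being removed; after run 28, RMSECV slowly rises again because key variables sensitive to SMC are progressively removed. In Fig. 3c, RMSECV is lowest at run 28, and each line shows how the regression coefficient of a wavelength variable changes as the number of runs increases. Fig. 3 thus shows that RMSECV is lowest at the 28th sampling run; the corresponding spectral variables form the selected variable set, which contains 23 spectral variables. The minimum RMSECV, the corresponding sampling run and the selected variable set for each of the L1 to L6 characteristic spectra are listed in Table 3.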
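The final selection rule, keeping the subset from the sampling run with the lowest RMSECV, can be written in a few lines. The RMSECV values and subsets below are invented placeholders that only mimic the "fall, then rise" pattern described above; `best_run` is a hypothetical helper name.

```python
def best_run(rmsecv_by_run, subsets_by_run):
    """Return (1-based run number, variable subset) with the lowest RMSECV."""
    best = min(range(len(rmsecv_by_run)), key=lambda i: rmsecv_by_run[i])
    return best + 1, subsets_by_run[best]


# Hypothetical five-run example: error falls while noise variables are
# removed, then rises once informative variables start to be dropped.
rmsecv = [0.050, 0.040, 0.030, 0.035, 0.060]
subsets = [[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 3, 4], [1, 4], [4]]
run, selected = best_run(rmsecv, subsets)
```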
Table 2 Correlation analysis between SMC and spectra from wavelet analysis in each level
Fig.3 Variable filtering process by competitive adaptive reweighted sampling (CARS)
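2.4 Optimal variable set for SMC of the soil samples
On the basis of the wavelet transform, CARS was applied to each decomposition level of the soil samples to select variables, giving the distribution of the selected variables at each level (Fig. 4). The selected variable sets of the characteristic spectra are distributed roughly around the water absorption peaks (450, 1 400, 1 900 and 2 200 nm). Because some information reflecting soil properties disappears as the decomposition level increases, each characteristic spectrum represents only part of the soil's properties. The variables selected from all levels were therefore pooled, giving 131 wavelength variables in the 400-500, 1 320-1 461, 1 851-1 961 and 2 125-2 268 nm regions, which were taken as the optimal variable set VWT-CARS. A considerable share of the optimal variable set lies within 1 800-2 400 nm, a range in which soil spectral features mainly reflect fundamental vibrations and combination and overtone absorptions of Al-OH, C-H, O-H and C=O groups[26-27].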
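Pooling the per-level selections into one variable set amounts to a set union, which can then be summarized as wavelength ranges. The sketch below uses invented wavelengths for two levels purely for illustration; the function names and the `max_gap` parameter are hypothetical and not taken from the study.

```python
def merge_selected(per_level):
    """Union of the wavelengths (nm) selected at each level, sorted ascending."""
    merged = set()
    for wavelengths in per_level.values():
        merged.update(wavelengths)
    return sorted(merged)


def contiguous_ranges(wavelengths, max_gap=1):
    """Group wavelengths into (start, end) ranges; a gap larger than
    `max_gap` nm starts a new range."""
    ranges = []
    for w in sorted(wavelengths):
        if ranges and w - ranges[-1][1] <= max_gap:
            ranges[-1][1] = w
        else:
            ranges.append([w, w])
    return [tuple(r) for r in ranges]


# Hypothetical selections for two decomposition levels.
per_level = {"L1": [400, 401, 1400], "L2": [401, 402, 1900, 1901]}
pooled = merge_selected(per_level)
regions = contiguous_ranges(pooled)
```

A wavelength chosen at several levels is counted once in the pooled set, so the size of the union can be smaller than the sum of the per-level subset sizes.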
Table 3 Optimal variables of spectral characteristics of the levels of L1-L6
Fig.4 Distribution of optimal variables based on CARS for characteristic spectrum of L1-L6
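2.5 Building and validating PLSR models based on the selected variables
SMC prediction models were built with the variables selected by the coupled WT-CARS algorithm as predictors and SMC as the response (denoted L(i)-CARS-PLSR, i = 1, 2, 3, 4, 5, 6). To highlight the benefit of variable selection, a full-spectrum (L0) PLSR model was included for comparison, and an SMC prediction model based on VWT-CARS was also built. The accuracy of the models was assessed with the parameters in Table 4.
A combined analysis of the model accuracies in Table 4 shows that the coupled WT-CARS algorithm improved SMC prediction accuracy. The model built on the optimal variable set VWT-CARS was the most accurate (RMSEC=0.021, $R_c^2$=0.721, RMSEP=0.028, $R_p^2$=0.924, RPD=2.607), indicating that the optimal variable set selected in this study predicts the SMC of soil samples in the study area better. Overall, the models built with the coupled WT-CARS algorithm were all more accurate than the full-spectrum model and were also more stable. The coupled algorithm reduces the number of modeling variables while improving accuracy, which shows that the coupling is an effective approach to variable selection. In the best prediction model of this study, 131 bands were selected from 2 151 for modeling, which greatly shortened modeling time and improved accuracy, and provides a reference for selecting key bands when retrieving other soil properties from soil hyperspectral reflectance in this region.
Table 4 Results of estimation for SMC
The VWT-CARS-PLSR model was validated on the prediction set, giving $R_p^2$=0.924, RMSEP=0.028 and RPD=2.607. Fig. 5 is the scatter plot of measured against predicted values for this model; the points are distributed fairly evenly around the 1:1 line and model accuracy is high, which indicates that the coupled WT-CARS algorithm selects bands that are effective for predicting SMC, reduces the number of modeling variables and helps improve model accuracy.
Fig.5 Relationship between measured and predicted soil moisture content by VWT-CARS-PLSR model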
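Rapid, non-destructive prediction of SMC is important for assessing the severity of agricultural drought in arid regions. A soil hyperspectral reflectance curve is the combined expression of different soil properties; it contains a large amount of noise unrelated to SMC, and the large number of bands adds redundancy to the data. This study combined the wavelet transform with the CARS algorithm to select effective variables and build PLSR models. Before the CARS optimal variable set was constructed, correlation analysis between SMC and the characteristic spectra obtained by the wavelet transform was used to set the maximum decomposition level to 6. As the characteristic spectrum of each level compresses the data, it also accentuates spectral absorption features at particular frequencies while suppressing uncorrelated spectral features and noise in other bands. Chen Zhikun et al.[28], applying the wavelet transform to fluorescence spectra of mineral oil, concluded that the fourth decomposition level denoised the spectral data while retaining more of the original signal; Zheng et al.[10] found that the eighth-level wavelet transform better reflected the characteristic spectra of soil properties; Wang Yancang et al.[29] found that the organic matter prediction model built on fourth-level wavelet spectral features performed best. The decomposition levels in these studies differ because of differences in soil type, mother wavelet and the choice of spectral feature reconstruction. Nevertheless, these studies all show that models perform best at intermediate decomposition scales: denoising is poor at scales that are too low, whereas excessive decomposition strips away ever more high-frequency signal, so that peaks and troughs reflecting soil properties disappear and the ability to explain the original spectrum declines. The maximum decomposition level of 6 determined here by correlation analysis is an intermediate scale, which is broadly consistent with those conclusions.
Hyperspectral data provide a large number of contiguous spectral bands, but raw hyperspectral data are usually noisy, strongly affected by scattering and somewhat redundant[30], and existing research has shown that redundant information can weaken the predictive performance and robustness of a model[31]. Selecting spectral variables is therefore necessary in soil hyperspectral analysis: it reduces the complexity of the prediction model and removes band variables with low correlation. Commonly used spectral variable selection methods include SPA (successive projections algorithm), MC-UVE (Monte Carlo-uninformative variable elimination), GA (genetic algorithms) and CARS (competitive adaptive reweighted sampling). Zhan Baishao et al.[11] found that these methods can effectively select sensitive bands from raw hyperspectral data and that CARS gave the best selection; Li Jiangbo et al.[32] compared CARS with MC-UVE and GA and found that CARS performed best, reducing collinearity among variables as well as the number of uninformative variables. This study combined the strengths of the wavelet transform and CARS by applying CARS variable selection at each decomposition level of the soil samples. The final variable set pools the variables selected at the different levels, 131 in total, located in the 400-500, 1 320-1 461, 1 851-1 961 and 2 125-2 268 nm regions, all near the water absorption peaks (450, 1 400, 1 900 and 2 200 nm). These bands are all strongly correlated with SMC and form an optimal variable set applicable to SMC overall, which is broadly consistent with the results of Yu Lei et al.[12]. Considering the CARS result of only one decomposition level would tend to overlook moisture-sensitive bands at other levels, so that the selected variables would not fully reflect soil properties and the resulting model would be limited. The variable set selected in this study can therefore serve as the optimal variable set for predicting SMC.
The coupled WT-CARS algorithm improved SMC prediction accuracy. Compared with the full-spectrum model, the models based on the CARS-selected variables of each level's characteristic spectrum were all more accurate, and the model based on the optimal variable set was the most accurate (RMSEP=0.028, $R_p^2$=0.924, RPD=2.607), which shows that the coupled WT-CARS algorithm is an effective band selection method for the spectral analysis of soil moisture content. Combining the two effectively removes uninformative variables and also minimizes the influence of collinear variables on the model, which provides theoretical support for selecting sensitive bands for other soil properties.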
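This study used the coupled WT-CARS (wavelet transform-competitive adaptive reweighted sampling) algorithm to select the optimal variable set for SMC (soil moisture content) and examined how well the algorithm predicts soil moisture content. The conclusions are as follows:
1) During wavelet decomposition, the correlation between soil reflectance and SMC first increased and then decreased. The number of bands significant at the 0.01 level was greatest at L6 (the characteristic spectrum of the sixth wavelet level). Overall, the L6 characteristic spectrum removed noise while retaining as much spectral detail as possible, and L6 was therefore taken as the maximum decomposition level in this study.
2) The optimal variable set obtained by coupling the wavelet transform with the CARS algorithm comprised 131 wavelength variables in the 400-500, 1 320-1 461, 1 851-1 961 and 2 125-2 268 nm regions.
3) Compared with the full-spectrum model, the models based on the CARS-selected variables of each level's characteristic spectrum were all more accurate, and the model based on the optimal variable set was the most accurate, with RMSEC=0.021, $R_c^2$=0.721, RMSEP=0.028, $R_p^2$=0.924 and RPD=2.607. This indicates that the coupled WT-CARS algorithm loses as little spectral detail as possible and removes noise fairly thoroughly during modeling, while also effectively eliminating uninformative variables, which offers a new approach to SMC prediction in the study area.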
[1] Zou Wenxiu, Han Xiaozeng, Jiang Heng, et al. Characteristics of precipitation in black soil region and response of soil moisture dynamics in Northeast China[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2011, 27(9): 196-202. (in Chinese with English abstract)
[2] Zhang Dinghai, Li Xinrong, Chen Yongle. Simulation study on the effects of sand binding shrub on the deep soil water in a recovered area on the southeast fringe of Tengger Desert, North China[J]. Acta Ecologica Sinica, 2016, 36(11): 3273-3279. (in Chinese with English abstract)
[3] Sun Yuejun, Zheng Xiaopo, Qin Qiming, et al. Modeling soil spectral reflectance with different mass moisture content[J]. Spectroscopy and Spectral Analysis, 2015(8): 2236-2240. (in Chinese with English abstract)
[4] Yin Z, Lei T, Yan Q, et al. A near-infrared reflectance sensor for soil surface moisture measurement[J]. Computers & Electronics in Agriculture, 2013, 99(99): 101-107.
[5] Blanco M, Coello J, Iturriaga H, et al. NIR calibration in non-linear systems: Different PLS approaches and artificial neural networks[J]. Chemometrics & Intelligent Laboratory Systems, 2000, 50(1): 75-82.
[6] Zhang Rui, Li Zhaofu, Pan Jianjun. Coupling discrete wavelet packet transformation and local correlation maximization improving prediction accuracy of soil organic carbon based on hyperspectral reflectance[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(1): 175-181. (in Chinese with English abstract)
[7] Kaewpijit S, Moigne J L, El-Ghazawi T. Automatic reduction of hyperspectral imagery using wavelet spectral analysis[J]. IEEE Transactions on Geoscience & Remote Sensing, 2003, 41(4): 863-871.
[8] Li Ruiping, Shi Haibin, Zhang Xiaohong, et al. Characteristic analysis of temperature, soil water and salt during maximum freezing depth period based on wavelet transform[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2012, 28(6): 82-87. (in Chinese with English abstract)
[9] Liao Qinhong, Gu Xiaohe, Li Cunjun, et al. Estimation of fluvo-aquic soil organic matter content from hyperspectral reflectance based on continuous wavelet transformation[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2012, 28(23): 132-139. (in Chinese with English abstract)
[10] Zheng L H, Li M Z, Pan L, et al. Application of wavelet packet analysis in estimating soil parameters based on NIR spectra[J]. Spectroscopy & Spectral Analysis, 2009, 29(6): 1549-1552.
[11] Zhan Baishao, Ni Junhui, Li Jun. Hyperspectral technology combined with CARS algorithm to quantitatively determine the SSC in Korla Fragrant Pear[J]. Spectroscopy and Spectral Analysis, 2014(10): 2752-2757. (in Chinese with English abstract)
[12] Yu Lei, Zhu Yaxing, Hong Yongsheng, et al. Determination of soil moisture content by hyperspectral technology with CARS algorithm[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(22): 138-145. (in Chinese with English abstract)
[13] Wen Zhencai, Sun Tong, Xu Peng, et al. Adulteration detection of camellia oils by Vis/NIR spectroscopy and variable selection method[J]. Journal of Jiangsu University: Natural Science Edition, 2015, 36(6): 673-678. (in Chinese with English abstract)
[14] Yu Shuang, Liu Guohai, Mei Congli, et al. NIRS detection of pH value in solid-state fermentation process based on variable selection method of CARS[J]. Computers and Applied Chemistry, 2014, 31(9): 1143-1146. (in Chinese with English abstract)
[15] Sun Tong, Xu Wenli, Lin Jinlong, et al. Determination of soluble solids content in navel oranges by Vis/NIR diffuse transmission spectra combined with CARS method[J]. Spectroscopy and Spectral Analysis, 2012, 32(12): 3229-3233. (in Chinese with English abstract)
[16] Zhang Huaxiu, Li Xiaoning, Fan Wei, et al. Determination of protein and fat in liquid milk by NIR combined with CARS variables screening method[J]. Journal of Instrumental Analysis, 2010, 29(5): 430-434. (in Chinese with English abstract)
[17] Lin L, Wang Y, Teng J, et al. Hyperspectral analysis of soil organic matter in coal mining regions using wavelets, correlations, and partial least squares regression[J]. Environmental Monitoring & Assessment, 2016, 188(2): 1-11.
[18] Qian, Shie. Introduction to time-frequency and wavelet transforms[M]. Beijing: China Machine Press, 2005.
[19] Liu Yande, Ouyang Aiguo, Ying Yibin. Application of wavelet analysis in signal process using matlab[J]. Chinese Journal of Sensors and Actuators, 2006, 19(3): 821-823. (in Chinese with English abstract)
[20] Xu Changfa, Cai Chao, Pi Minghong, et al. Correlation wavelet and its applications[J]. Chinese Quarterly Journal of Mathematics, 1999, 14(1): 5-9.
[21] Kaewpijit S, Moigne J L, Elghazawi T. Spectral data reduction via wavelet decomposition[C]//Aerosense. 2002: 56-63.
[22] Li Hongdong, Liang Yizeng, Xu Qingsong, et al. Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration[J]. Analytica Chimica Acta, 2009, 648(1): 77-84.
[23] Yu Lei, Hong Yongsheng, Geng Lei, et al. Hyperspectral estimation of soil organic matter content based on partial least squares regression[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2015, 31(14): 103-109. (in Chinese with English abstract)
[24] Xue Lihong, Zhou Dinghao, Li Ying, et al. Prediction of soil organic matter and total phosphorus with Vis-NIR hyperspectral inversion relative to land use[J]. Acta Pedologica Sinica, 2014, 51(5): 993-1002. (in Chinese with English abstract)
[25] Shi Zhou, Wang Qianlong, Peng Jie, et al. Development of a national VNIR soil-spectral library for soil classification and prediction of organic matter concentrations[J]. Science China Earth Sciences, 2014, 57(7): 1671-1680.
[26] Viscarra Rossel R A, Behrens T. Using data mining to model and interpret soil diffuse reflectance spectra[J]. Geoderma, 2010, 158(1/2): 46-54.
[27] Yu Lei, Hong Yongsheng, Zhou Yong, et al. Wavelength variable selection methods for estimation of soil organic matter content using hyperspectral technique[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2016, 32(13): 95-102. (in Chinese with English abstract)
[28] Chen Zhikun, Zhang Hanjie, Wang Yutian, et al. Fluorescence spectral date of mineral oil processing based on wavelet transform[J]. Laser Journal, 2016(10): 78-81. (in Chinese with English abstract)
[29] Wang Yancang, Yang Guijun, Zhu Jinshan, et al. Estimation of organic matter content of north fluvo-aquic soil based on the coupling model of wavelet transform and partial least squares[J]. Spectroscopy and Spectral Analysis, 2014(7): 1922-1926. (in Chinese with English abstract)
[30] Hymer D C, Moran M S, Keefer T O. Soil water evaluation using a hydrologic model and calibrated sensor network[J]. Soil Science Society of America Journal, 2000, 64(1): 319-326.
[31] Yang Aixia, Ding Jianli. Comparative assessment of two methods for estimation of soil organic carbon content by Vis-NIR spectra[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2015, 31(18): 162-168. (in Chinese with English abstract)
[32] Li Jiangbo, Peng Yankun, Chen Liping, et al. Near-infrared hyperspectral imaging combined with CARS algorithm to quantitatively determine soluble solids content in "Ya" pear[J]. Spectroscopy and Spectral Analysis, 2014(5): 1264-1269. (in Chinese with English abstract)
Wavelet transformation coupled with CARS algorithm improving prediction accuracy of soil moisture content based on hyperspectral reflectance
Cai Lianghong, Ding Jianli※
(1. College of Resources & Environmental Science, Xinjiang University, Urumqi 830046, China; 2. Key Laboratory of Oasis Ecology, Ministry of Education, Xinjiang University, Urumqi 830046, China)
The rapid estimation of soil moisture content (SMC) is of great significance to precision agriculture in arid areas. Hyperspectral remote sensing has been widely used to estimate SMC because it is non-destructive and rapid and offers high spectral resolution. However, many factors, such as the sheer volume of spectral data and surface conditions, can affect the spectra, which makes effective information harder to extract and reduces the prediction accuracy of SMC. Noise reduction must be considered when developing hyperspectral estimation models, but how to reduce noise while retaining as much useful information as possible needs investigation. In this study, the wavelet transform was coupled with competitive adaptive reweighted sampling (CARS), an advanced spectral mining method, to address this problem. A total of 39 soil samples at 0-20 cm depth were collected from the Weigan-Kuqa River delta oasis in Xinjiang. The samples were brought back to the laboratory, air-dried, ground and passed through a 2 mm sieve, and then packed into black boxes 12 cm in diameter and 1.8 cm deep, which were leveled at the rim with a spatula. The reflectance of the soil samples was measured with an ASD (Analytical Spectral Devices) FieldSpec 3 spectrometer in a dark room. Soil reflectance was processed in the following steps. First, the discrete wavelet transform (DWT) was used to decompose the original spectra into 8 levels with the db4 wavelet basis in MATLAB, and correlation coefficients between SMC and the spectra of each level were computed to select the maximum decomposition level. Second, CARS was used to filter out redundant variables; the wavelength variables more strongly correlated with SMC were selected, and the characteristic wavelength variables of each decomposition level were pooled as the optimal variable set. Third, partial least squares regression (PLSR) was used to build hyperspectral estimation models of SMC. The root mean square error of the calibration set (RMSEC), the determination coefficient of the calibration set ($R_c^2$), the root mean square error of the prediction set (RMSEP), the determination coefficient of the prediction set ($R_p^2$) and the residual predictive deviation (RPD) were then used to assess accuracy. The results showed that: 1) As the number of decomposition levels increased, the correlation between soil reflectance and SMC first increased and then decreased, and L6 had the largest number of bands significant at the 0.01 level. In general, the characteristic spectrum of L6 was denoised while its spectral detail was preserved as far as possible, so the maximum decomposition level was 6. 2) The characteristic wavelength variables of each characteristic spectrum were selected by coupling the wavelet transform with the CARS algorithm. If only the CARS result for a single characteristic spectrum were taken into account, the moisture features of the other characteristic spectra could easily be overlooked. Therefore, the characteristic wavelength variables of all levels were pooled as the optimal variable set, which contained 131 wavelength variables near the absorption bands (450, 1 400, 1 900 and 2 200 nm). 3) Compared with the full-band PLSR model, the PLSR models based on the CARS-selected variables of each decomposition level were more accurate, and the PLSR model based on the optimal variable set had the highest accuracy and performed best in predicting SMC in the study area (RMSEC=0.021, $R_c^2$=0.721, RMSEP=0.028, $R_p^2$=0.924, RPD=2.607). The combination of the wavelet transform and the CARS algorithm therefore loses as little spectral detail as possible and removes noise fairly thoroughly during modeling, while also effectively removing uninformative variables, which provides a new approach to selecting SMC spectral variables in this region.
soil; moisture content; spectrum analysis; wavelet transform; CARS; variable selection
10.11975/j.issn.1002-6819.2017.16.019
S127
A
1002-6819(2017)-16-0144-08
Cai Lianghong, Ding Jianli. Wavelet transformation coupled with CARS algorithm improving prediction accuracy of soil moisture content based on hyperspectral reflectance[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2017, 33(16): 144-151. (in Chinese with English abstract)
doi:10.11975/j.issn.1002-6819.2017.16.019 http://www.tcsae.org
2017-05-12
2017-07-10
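National Natural Science Foundation of China (U1303381, 41261090); Autonomous Region Key Laboratory Special Fund (2016D03001); Autonomous Region Science and Technology Support Project for Xinjiang (201591101)
Cai Lianghong, male (Han), from Zunyi, Guizhou, mainly engaged in research on remote sensing applications in arid regions. College of Resources and Environmental Science, Xinjiang University, Urumqi 830046.
Email: 1173716776@qq.com
※Corresponding author: Ding Jianli, male (Han), from Urumqi, Xinjiang, PhD, doctoral supervisor, mainly engaged in research on remote sensing of the eco-environment in arid regions. College of Resources and Environmental Science, Xinjiang University, Urumqi 830046. Email: 2187736938@qq.com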