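Fault diagnosis based on misclassification loss minimized support vector machine
Yi Hui, Song Xiaofeng, Jiang Bin, Mao Zehui
(College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)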
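To address the classification bias of fault diagnosis based on the decision directed acyclic graph support vector machine (DAG-SVM), a misclassification loss minimized support vector machine (MLM-SVM) is used to optimize the multi-class decision structure. Whereas conventional methods aim to maximize diagnostic accuracy, this method accounts for the different losses caused by different misclassifications and takes the minimization of misclassification loss as the optimization objective. For a k-class fault diagnosis problem, the MLM-SVM first assigns a penalty factor to each misclassification case and generates the misclassification loss confusion matrices of the k! decision structures. It then combines these matrices with the overall loss risk function to compute the misclassification loss risk of each structure, and selects the structure with the minimum risk for fault diagnosis. The method is applied to transformer fault diagnosis to find the best structure. Separately, all k! decision structures are used for diagnosis and their misclassification losses are recorded to identify the best structure empirically. The two selections agree, which verifies the effectiveness of the method.
fault diagnosis; misclassification loss minimized support vector machine; structure selection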
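The support vector machine (SVM)[1] is a widely used data-driven method for fault diagnosis. Compared with neural networks, the SVM requires fewer samples, does not suffer from the curse of dimensionality, and has no local minima, so it has been widely applied. However, the conventional SVM is designed for binary classification, whereas most fault diagnosis problems are multi-class. Using SVMs for fault diagnosis therefore first requires a multi-class extension, such as the 1-against-rest method[2], the 1-v-1 method[3], or the decision tree method[4]. Among these, the decision directed acyclic graph support vector machine (DAG-SVM) proposed by Platt et al.[5] has attracted wide attention. Drawing on the decision directed acyclic graph (DDAG) from graph theory, it avoids the redundant decisions, imbalanced samples, and unclassifiable regions of conventional extension strategies, and is one of the most efficient extension strategies available[6]. However, for a k-class problem (k>2, k∈N), DAG-SVM offers k! different DDAG decision structures, and different structures have different misclassification preferences and may produce different classification results[7-9]. Choosing a suitable DDAG structure is thus the first issue to settle when applying DAG-SVM. At present, the structure is usually chosen by comparing accuracy over repeated trials or by minimizing the misclassification probability of the decision structure[10-13]. These approaches consider only diagnostic accuracy and ignore the fact that, in engineering practice, different misclassifications incur different losses. To address this, this paper uses the misclassification loss minimized support vector machine to select the DAG-SVM structure, taking misclassification loss as a key criterion for structure selection, and applies the method to fault diagnosis.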
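For a k-class problem, DAG-SVM has k(k-1)/2 classifier nodes arranged in a k-layer structure, each node being an SVM classifier that separates a specific pair of classes. The top layer contains only the root node, the second layer contains 2 nodes, and so on, so that layer i contains i nodes; the bottom (k-th) layer holds the k class outputs. Any sample is classified after passing through only k-1 nodes (one node per layer, excluding the bottom layer). The node used in layer i+1 is determined by the decision of the node used in layer i. The decision flow of DAG-SVM for four classes of samples a, b, c, d is shown in Fig. 1. In the figure, node a-d is a binary SVM classifier designed for classes a and d, which can effectively distinguish data of these two classes. With the flow in Fig. 1, the system needs only 3 decisions to locate the class of any sample.
Fig. 1 Four-class decision structure
This decision method selects different decision paths for samples of different classes without increasing the decision computation, which improves classification accuracy. Its recursive style of inference also avoids the unclassifiable regions of conventional algorithms. Because each classifier is trained on only two classes, overfitting caused by imbalance between positive and negative samples is unlikely.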
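The layer-by-layer decision flow described above can be sketched as a list-elimination loop: each node compares the first and last classes still in play and discards the loser. This is a minimal sketch, not the authors' implementation; `ddag_predict`, `decide`, and the toy one-dimensional `nearest` rule are hypothetical names standing in for trained pairwise SVMs.

```python
from typing import Callable, Hashable, List, Sequence, Tuple


def ddag_predict(order: Sequence[Hashable],
                 decide: Callable[[Hashable, Hashable, float], Hashable],
                 x: float) -> Hashable:
    """Classify x with a DDAG built on the class ordering `order`.

    `decide(ci, cj, x)` stands in for the binary SVM of classes ci and cj
    and must return either ci or cj.
    """
    remaining = list(order)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        winner = decide(first, last, x)
        if winner == first:
            remaining.pop()       # the node rejected `last`
        else:
            remaining.pop(0)      # the node rejected `first`
    return remaining[0]


# Toy stand-in for the pairwise SVMs (assumption): pick the nearer class center.
centers = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 3.0}
calls: List[Tuple[str, str]] = []


def nearest(ci: str, cj: str, x: float) -> str:
    calls.append((ci, cj))
    return ci if abs(x - centers[ci]) <= abs(x - centers[cj]) else cj


label = ddag_predict(["a", "b", "c", "d"], nearest, 1.1)
print(label, calls)
```

For four classes the loop visits exactly three nodes, starting from the a-d node at the root, in line with the k-1 decisions stated in the text.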
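The DDAG structure is not unique. For a k-class problem, DAG-SVM has k! feasible DDAG structures, and different structures have different decision preferences[7]. In general, the classes associated with the root node are classified less accurately than the classes associated with the leaf nodes. Taking Fig. 1 as an example, let p denote the classification accuracy of each layer and r the classification accuracy of a class; the accuracies of classes a, b, c, and d can then be written in terms of p.
Each classification node of DAG-SVM can generally be regarded as a weak classifier, that is, its accuracy satisfies the weak-classifier condition.
Combining Eqs. (1), (5), and (6) shows that, for a four-class problem, the DAG-SVM decision favors classes b and c.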
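As a rough check on the bias toward classes b and c, the following sketch computes the per-class accuracy of a DDAG by enumerating decision paths. It assumes that every node is correct with the same probability p on the two classes it was trained for, and that it splits at random (probability 1/2) on a sample of any other class, which is the assumption the paper states for irrelevant classifiers. The function name `class_accuracy` is illustrative, and the resulting expressions are derived here rather than taken from the paper's own equations.

```python
from fractions import Fraction


def class_accuracy(order, true_class, p):
    """Probability that a DDAG over `order` labels a `true_class` sample correctly."""
    def walk(remaining):
        if len(remaining) == 1:
            return 1 if remaining[0] == true_class else 0
        first, last = remaining[0], remaining[-1]
        keep_first = remaining[:-1]   # node voted for `first`, `last` is dropped
        keep_last = remaining[1:]     # node voted for `last`, `first` is dropped
        if true_class == first:
            return p * walk(keep_first)    # a wrong vote drops the true class
        if true_class == last:
            return p * walk(keep_last)
        half = Fraction(1, 2)              # irrelevant node: random split
        return half * walk(keep_first) + half * walk(keep_last)

    return walk(tuple(order))


p = Fraction(4, 5)
accuracies = {c: class_accuracy("abcd", c, p) for c in "abcd"}
print(accuracies)
```

Under these assumptions the end classes a and d come out at p^3 and the middle classes b and c at p(1+p)/2, which is larger for any p<1.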
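In practice, misclassifications by a decision system usually incur losses, and different misclassifications incur different losses[7]. In fault diagnosis, misjudging a normal state as a fault causes unnecessary shutdown or maintenance, whereas misjudging a fault as a normal state can damage the equipment or cause even more serious destruction. The latter loss is generally larger than the former.
Let X∈{A,B}, let P(A) be the penalty factor for the misclassification A→B, and let P(B) be the penalty factor for B→A. Suppose classifiers SVM1 and SVM2 have the same classification accuracy p on X, where SVM1 errs as A→B and SVM2 errs as B→A. The misclassification losses of the two classifiers are then m(A)=P(A)(1-p) and m(B)=P(B)(1-p), respectively.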
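A small numerical illustration of m(A)=P(A)(1-p) and m(B)=P(B)(1-p) may help. The penalty values 1 and 10 and the accuracy 0.9 below are made up for the example and do not come from the paper.

```python
def expected_loss(penalty: float, accuracy: float) -> float:
    """Misclassification loss m = P * (1 - p) for one error direction."""
    return penalty * (1.0 - accuracy)


p = 0.9                                           # same accuracy for both classifiers
m_A = expected_loss(penalty=1.0, accuracy=p)      # SVM1 errs as A -> B (hypothetical P(A) = 1)
m_B = expected_loss(penalty=10.0, accuracy=p)     # SVM2 errs as B -> A (hypothetical P(B) = 10)
print(m_A, m_B)
```

Two classifiers with identical accuracy can therefore carry losses that differ by the ratio of their penalty factors, which is why accuracy alone does not rank them.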
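Unlike conventional methods, which aim to maximize the classification accuracy p, the misclassification loss minimized SVM considers both p and the misclassification penalty factors of the decision structure, so that the multi-class structure has the minimum loss risk; this makes it more relevant in practice. The algorithm proceeds as follows:
① Define the misclassification loss matrix. For a k-class diagnosis problem, generate a k×k misclassification loss matrix whose element l_ij is the loss incurred when a sample of class i is misclassified as class j; the values are given by experts based on experience.
② Generate the loss confusion matrices. DAG-SVM has k! different decision structures, each with a different distribution of misclassification penalty factors. Based on the given misclassification loss matrix, a loss confusion matrix P is generated for each decision structure to describe the distribution of its penalty factors. For a four-class problem, for example, the matrix is written for the structure whose fourth layer is ordered a, b, c, d, where a, b, c, d ∈ {1,2,3,4} are distinct class labels.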
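One plausible reading of step ② is that the loss confusion matrix is the expert loss matrix re-indexed by the class order of the structure, so that row and column t refer to the class in position t. The sketch below follows that reading, which is an assumption rather than the paper's stated definition; `loss_confusion_matrix` and the numeric entries are hypothetical.

```python
def loss_confusion_matrix(loss, order):
    """Re-index a k x k loss matrix by a class order given as 1-based labels.

    Entry [s][t] of the result is the loss of misjudging the class in
    position s of `order` as the class in position t.
    """
    return [[loss[i - 1][j - 1] for j in order] for i in order]


# Hypothetical 4 x 4 loss matrix with zero loss on the diagonal.
loss = [
    [0, 1, 2, 3],
    [4, 0, 5, 6],
    [7, 8, 0, 9],
    [10, 11, 12, 0],
]
P = loss_confusion_matrix(loss, (1, 3, 4, 2))
for row in P:
    print(row)
```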
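③ Construct the overall loss risk function. Assume every binary SVM classifier has classification accuracy p, with p∈[0.6,1]. When a sample enters a classifier unrelated to its class, it is assigned at random, that is, to either side with probability 1/2. The loss risk m_i caused by misclassifying samples of each class is then computed. For a four-class problem, the per-class loss risks are listed in Table 1, where P_ij is the element in row i and column j of the loss confusion matrix.
Table 1 Loss risks of four-class misclassification
The structure with the minimum overall loss risk is the decision structure used by the misclassification loss minimized SVM.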
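The three steps above can be put together in a short sketch that enumerates all k! class orders, computes an expected loss for each, and keeps the minimizers. It assumes equal class weights, a common node accuracy p, and random 1/2 splits at irrelevant nodes, and it defines the overall risk as the sum over true classes i and outputs j of Pr(j | i)·l_ij. That risk definition, the function names, and the loss values are assumptions made for illustration, not the paper's Table 1 or its expert matrix.

```python
from fractions import Fraction
from itertools import permutations


def outcome_probs(order, true_class, p):
    """Distribution of the DDAG output for a sample of `true_class`."""
    probs = {}

    def walk(remaining, weight):
        if len(remaining) == 1:
            probs[remaining[0]] = probs.get(remaining[0], 0) + weight
            return
        first, last = remaining[0], remaining[-1]
        if true_class == first:
            p_first = p
        elif true_class == last:
            p_first = 1 - p
        else:
            p_first = Fraction(1, 2)              # irrelevant node: random split
        walk(remaining[:-1], weight * p_first)        # vote for first, drop last
        walk(remaining[1:], weight * (1 - p_first))   # vote for last, drop first

    walk(tuple(order), Fraction(1))
    return probs


def structure_risk(order, loss, p):
    """Sum of Pr(j | i) * loss[i][j] over all true classes i and wrong outputs j."""
    return sum(prob * loss[i][j]
               for i in order
               for j, prob in outcome_probs(order, i, p).items()
               if j != i)


def select_structures(classes, loss, p):
    """Return the minimum-risk class orders and the risk of every order."""
    risks = {o: structure_risk(o, loss, p) for o in permutations(classes)}
    best = min(risks.values())
    return [o for o, r in risks.items() if r == best], risks


# Hypothetical losses: class 1 is "normal"; a missed fault costs 10,
# a false alarm 1, and confusing two fault types 2.
classes = (1, 2, 3, 4)
loss = {i: {j: 0 for j in classes} for i in classes}
for fault in (2, 3, 4):
    loss[1][fault] = 1
    loss[fault][1] = 10
    for other in (2, 3, 4):
        if other != fault:
            loss[fault][other] = 2

best, risks = select_structures(classes, loss, Fraction(4, 5))
print(best)
```

A class order and its reverse describe the same graph, so they always share the same risk. With a uniform 0/1 loss all 24 orders tie in this model, so it is the asymmetry of the loss matrix that separates the structures.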
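The transformer fault data used in the experiment are taken from Ref. [10]. The operating condition of the transformer is measured by dissolved gas analysis of the oil. Under electrical and thermal stress, the insulating oil and organic insulating materials in oil-filled electrical equipment produce various low-molecular-weight hydrocarbon gases, and the content of the gases dissolved in the oil is closely related to the operating condition of the transformer. Five gases, H2, CH4, C2H6, C2H4, and C2H2, are measured, and fault data are selected for four operating conditions: normal operation, overheating, low-energy discharge, and high-energy discharge (see Table 2).
Table 2 Transformer fault samples
Based on the actual economic losses caused by misclassification, the misclassification loss matrix is specified from expert knowledge, and the overall loss risk function of the four-class problem is then derived from it.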
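The structures selected by the misclassification loss minimized SVM are S[1342], S[2431], S[1432], and S[2341]. Using 3-fold cross-validation, the misclassification loss of each structure is shown in Fig. 2. The DAG-SVM decision structures selected by the proposed algorithm have a clear advantage in misclassification loss, which verifies the effectiveness of the algorithm.
Fig. 2 Misclassification loss of each structure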
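The empirical side of this comparison, accumulating misclassification loss over cross-validation folds, can be sketched as follows. This is a minimal sketch under stated assumptions: folds are contiguous and unshuffled, `train_and_predict` is a hypothetical callback that would wrap a DAG-SVM with one fixed structure, and the loss values are invented.

```python
def empirical_loss(y_true, y_pred, loss):
    """Total loss of the wrong predictions, looked up as loss[true][predicted]."""
    return sum(loss[t][q] for t, q in zip(y_true, y_pred) if t != q)


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for f in range(k):
        size = base + (1 if f < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_loss(samples, labels, train_and_predict, loss, k=3):
    """Sum the misclassification loss over the k held-out folds."""
    total = 0
    for test_idx in k_fold_indices(len(samples), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        preds = train_and_predict([samples[i] for i in train_idx],
                                  [labels[i] for i in train_idx],
                                  [samples[i] for i in test_idx])
        total += empirical_loss([labels[i] for i in test_idx], preds, loss)
    return total


# Hypothetical two-class loss table and a dummy model that always answers 1.
toy_loss = {1: {1: 0, 2: 1}, 2: {1: 10, 2: 0}}


def always_normal(train_x, train_y, test_x):
    return [1 for _ in test_x]


total = cross_validated_loss([0.1, 0.2, 0.9, 0.8, 0.3, 0.7],
                             [1, 1, 2, 2, 1, 2],
                             always_normal, toy_loss, k=3)
print(total)
```

Running the same routine once per decision structure and comparing the totals gives the kind of per-structure loss comparison that Fig. 2 reports.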
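A key issue in applying support vector machines to fault diagnosis is how to choose a reasonable decision structure. This paper applies the misclassification loss minimized support vector machine (MLM-SVM) to fault diagnosis. Unlike conventional methods, which aim to maximize classification accuracy, this method selects the decision structure by minimizing the economic loss caused by misclassification. Experiments on transformer faults show that the algorithm is practical for engineering use.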
References
[1] Vapnik V. Statistical learning theory[M]. New York, USA: John Wiley & Sons, 1998.
[2] Knerr S, Personnaz L, Dreyfus G. Single-layer learning revisited: a stepwise procedure for building and training a neural network[M]. New York, USA: Springer-Verlag, 1990.
[3] Takahashi F, Abe S. Decision-tree-based multiclass support vector machines[C]//Proceedings of the 9th International Conference on Neural Information Processing. Singapore, 2002(3): 1418-1422.
[4] Platt J C, Cristianini N, Shawe-Taylor J. Large margin DAGs for multiclass classification[M]. Cambridge, MA, USA: MIT Press, 2000: 547-553.
[5] Hsu C W, Lin C J. A comparison of methods for multiclass support vector machines[J]. IEEE Transactions on Neural Networks, 2002, 13(2): 415-425.
[6] Ussivakul N, Kijsirikul B. Multiclass support vector machines using adaptive directed acyclic graph[C]//Proceedings of the IEEE/INNS International Joint Conference on Neural Networks (IJCNN-2002). Honolulu, Hawaii, USA, 2002: 980-985.
[7] Ma Ruowei, Tang Chunyang. Building up default predicting model based on logistic model and misclassification loss[J]. Systems Engineering: Theory & Practice, 2007, 27(8): 33-38.
[8] Parag Pendharkar, Sudhir Nanda. A misclassification cost-minimizing evolutionary-neural classification approach[J]. Naval Research Logistics, 2006, 53(5): 432-447.
[9] Ganyun L, Cheng H Z, Zhai H B, et al. Fault diagnosis of power transformer based on multi-layer SVM classifier[J]. Electric Power Systems Research, 2005, 74(1): 1-7.
[10] Yi Hui, Song Xiaofeng, Jiang Bin, et al. Support vector machine based on nodes refined decision directed acyclic graph and its application to fault diagnosis[J]. Acta Automatica Sinica, 2010, 36(3): 427-432. (in Chinese)
[11] Wang Dingcheng, Jiang Bin. Online sparse least square support vector machines regression[J]. Control and Decision, 2007, 22(2): 132-137. (in Chinese)
[12] Wang Dingcheng, Jiang Bin. Review of SVM-based control and online training algorithms[J]. Journal of System Simulation, 2007, 19(6): 1177-1181. (in Chinese)
[13] Song Xiaofeng, Chen Dezhao, Hu Shangxu. Adjustable structure support vector regression estimation[J]. Control and Decision, 2003, 18(6): 698-702. (in Chinese)
Fault diagnosis based on misclassification loss minimized SVM
Yi Hui, Song Xiaofeng, Jiang Bin, Mao Zehui
(College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)
To solve the classification bias problem in fault diagnosis based on the decision directed acyclic graph support vector machine (DAG-SVM), a misclassification loss minimized SVM (MLM-SVM) is proposed to optimize the multi-class decision structure. Whereas conventional methods aim to maximize diagnostic accuracy, this approach takes into account the different losses caused by different misclassifications and sets the minimization of misclassification loss as the optimization goal. For a k-class fault diagnosis problem, the MLM-SVM first gives the penalty factors for all misclassification cases and generates the misclassification loss confusion matrices for all k! decision structures. The confusion matrices are then combined with the risk function for total loss to obtain the misclassification loss risk of each decision structure, and the structure with the smallest risk is chosen for fault diagnosis. The approach is applied to transformer fault diagnosis and the best structure is obtained. All k! decision structures are then used for diagnosis, and the corresponding misclassification losses are calculated to find the best structure empirically. The two results are consistent, indicating the effectiveness of the proposed approach.
fault diagnosis; misclassification loss minimized support vector machine; structure selection
CLC number: TP183
Document code: A
Article ID: 1001-0505(2010)Suppl.(I)-0116-05
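Received 2010-05-18. Biographies: Yi Hui (1984-), male, Ph.D. candidate; Jiang Bin (corresponding author), male, Ph.D., professor, doctoral supervisor, binjiang@nuaa.edu.cn.
Supported by the Key Program of the Natural Science Foundation of Jiangsu Province (BK2010072) and the Fundamental Research Funds of Nanjing University of Aeronautics and Astronautics (NS2010071).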