穆煒煒 王國才
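Abstract: This paper fuses multiple features to compute text similarity and uses the C4.5 decision tree algorithm for text classification, constructing a decision tree classifier to grade subjective questions automatically. Experimental results show that the algorithm is highly accurate and close to manual grading, and that it has some application prospects.
Keywords: multiple features; similarity; decision tree; text classification; grading
CLC number: TP391.2    Document code: A    Article ID: 1009-3044(2012)15-3579-04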
Algorithm Design of Subjective Question Auto Assessment
MU Wei-wei1,2, WANG Guo-cai1
(1.College of Information Science and Engineering, Central South University, Changsha 410083, China; 2.Hunan Chemical Vocational Technology College, Zhuzhou 412004, China)
Abstract: This paper uses multi-feature fusion to compute text similarity and applies the C4.5 decision tree algorithm to text classification, building a decision tree classifier that automatically grades subjective questions. Experimental results show that the algorithm's accuracy is close to that of manual scoring, so it has some application prospects.
Key words: multi-features; similarity; decision tree classification; text classification; assessment
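1) The data samples are complex and answers are phrased in many different ways, so keyword extraction is subject to error;
2) Multi-feature similarity matching has certain advantages, but because the features are so varied, the matching degree still falls short of 100%;
3) The text classification algorithm needs further optimization.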
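This paper uses multi-feature similarity computation and the C4.5 decision tree algorithm to grade subjective questions automatically. Experimental results show that the algorithm performs well, grades with fairly high accuracy, and has some practical reference value.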
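The two ingredients named above can be illustrated with a minimal sketch. It is not the authors' implementation: the two features (keyword overlap and term-frequency cosine), the equal weights, and every function name are assumptions chosen for illustration. The only part taken from the C4.5 algorithm itself is the gain ratio criterion (information gain divided by split information), shown here for a discrete attribute. C4.5 also handles continuous attributes by searching for a threshold, which this sketch omits.

```python
import math
from collections import Counter


def jaccard(a, b):
    """Keyword-overlap similarity between two token lists (assumed feature)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine similarity of term-frequency vectors (assumed feature)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def fused_similarity(answer, reference, weights=(0.5, 0.5)):
    """Weighted sum of the feature similarities; the weights are hypothetical."""
    feats = (jaccard(answer, reference), cosine(answer, reference))
    return sum(w * f for w, f in zip(weights, feats))


def entropy(items):
    """Shannon entropy (base 2) of a list of discrete values."""
    n = len(items)
    return -sum(c / n * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """C4.5 split criterion for a discrete attribute:
    information gain divided by split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(values)
    if not split_info:  # attribute takes a single value, so it cannot split
        return 0.0
    return (entropy(labels) - conditional) / split_info
```

In use, each student answer would be tokenized, scored against the reference answer with `fused_similarity`, and the score bucketed (for example into "high" and "low" at some hypothetical cut-off) before `gain_ratio` ranks the candidate attributes for a tree node.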