Hu Ziyin, Gui Ning
Abstract: Feature selection is a data preprocessing technique for avoiding the curse of dimensionality. To simultaneously identify the variables most relevant to a multivariate time series prediction problem and their corresponding time delays, a multi-attention-based supervised feature selection method is proposed. The method uses a deep learning model with an attention module and a learning module: the original two-dimensional time series data are orthogonally split into two sets of one-dimensional data, which are fed into two attention-generation modules of different dimensions to obtain attention weights along the feature dimension and the time dimension. The two weight vectors are combined by an outer-product composition into a global attention score for feature selection, which is applied to the original data and updated continuously with the training of the learning module until convergence. Experimental results show that the proposed method matches the performance of full-data training with fewer than 10 selected features and achieves the best accuracy compared with several existing baseline methods.
Keywords: feature selection; time series; attention mechanism; multidimensional data; deep learning
DOI: 10.11907/rjdk.201206
CLC Number: TP301    Document Code: A    Article ID: 1672-7800(2020)011-0021-04
A Multi-attention-based Feature Selection Method for Multivariate Time Series
HU Zi-yin1, GUI Ning2
(1. School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China;
2. School of Computer Science, Central South University, Changsha 410006, China)
Abstract: Feature selection is a data preprocessing technique that reduces model complexity and avoids the curse of dimensionality. To simultaneously find the variables most relevant to the prediction target and their corresponding time delays in multivariate time series prediction, this paper proposes a multi-attention-based supervised feature selection method. The method uses a deep learning model with an attention module and a learning module. The original two-dimensional time series data is orthogonally divided into two sets of one-dimensional data, which are fed into attention modules of two different dimensions to generate the attention weights of the feature dimension and the time dimension respectively. The two weight vectors are then combined by an outer-product operation into a global attention score for feature selection, which is applied to the original data and updated continuously during training until the model converges. Experimental results show that the proposed method achieves the performance of full-data training with fewer than 10 selected features, and attains the best accuracy compared with several existing baseline methods.
Key Words: feature selection; time series; attention mechanism; multidimensional data; deep learning
0 Introduction
With the development of the Internet of Things, more and more fields, including industry [1], biology [2], and social media [3], have accumulated large amounts of high-dimensional, chronologically ordered data, i.e., multivariate time series (MTS). With machine learning and deep learning, a wealth of valuable information can be mined from these time series to support expert decision-making. However, the large number of irrelevant and redundant features in time series not only severely hinders the learner but also increases computational cost [4]. Feature selection, which chooses from the dataset the features relevant to the target variable, effectively mitigates the curse of dimensionality and is regarded as a crucial data preprocessing step in machine learning [5]. To build more accurate and more interpretable time series models, it is important to identify the variables most relevant to the supervised target together with their most suitable time lags, which also greatly aids understanding of the physical and chemical models of the underlying system.
With the development of deep learning, the attention mechanism was proposed and has been widely applied in image processing and natural language processing. Attention models draw on the signal-processing mechanism of human vision: the brain quickly scans the whole image to locate the target region that deserves focus, then devotes more resources to that region to obtain finer details of the target of interest while suppressing useless information.
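As a minimal sketch of the global attention score described in the abstract: two one-dimensional weight vectors, one over features and one over time steps, are combined by an outer product into a two-dimensional score that reweights the original series. In the actual method both weight vectors come from attention modules trained jointly with the learning module; here simple mean-pooled softmax weights stand in for them, and all shapes and variable names are illustrative.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Toy multivariate time series: T time steps x F features.
rng = np.random.default_rng(0)
T, F = 8, 5
X = rng.normal(size=(T, F))

# "Orthogonal split": summarize the 2-D series along each axis
# to obtain two 1-D views (stand-ins for the two module inputs).
feature_view = X.mean(axis=0)   # length F, one value per feature
time_view = X.mean(axis=1)      # length T, one value per time step

# Stand-ins for the two learned attention-generation modules.
feature_weights = softmax(feature_view)  # attention over features
time_weights = softmax(time_view)        # attention over time steps

# Outer product of the two weight vectors gives a global T x F
# attention score, which is then applied to the original data.
global_score = np.outer(time_weights, feature_weights)
X_attended = X * global_score
```

Features (or feature-lag pairs) with persistently low global scores are candidates for removal; in the full method the scores keep updating with the learning module until convergence rather than being computed once as above.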