Survey of link prediction method in heterogeneous information network
CAO Jiaping, LI Jichao*, JIANG Jiang
(College of Systems Engineering, National University of Defense Technology, Changsha 410073, China)
Abstract: Link prediction is the prediction of unknown or future links based on the known information in a network, and is one of the research hotspots in network science and data mining. Heterogeneous information networks can portray the semantic information in data more accurately and improve the efficiency of downstream data mining tasks. Link prediction methods on heterogeneous information networks therefore need to take into account both the topological and the semantic characteristics of the network, which brings new challenges to the link prediction task. On the basis of previous research, this paper systematically reviews the link prediction methods on heterogeneous information networks proposed in the past decade. First, the concepts of heterogeneous information network and link prediction are introduced. Second, the link prediction methods on heterogeneous information networks are classified in detail, the methods for different types of heterogeneous information network are summarized, and the typical methods of each class are introduced in detail. Then, the applications of link prediction methods on heterogeneous information networks are reviewed. Finally, the problems that need to be addressed in further research in this field and potential future research directions are summarized.
Keywords: heterogeneous information network; link prediction; meta-path; supervised learning
CLC number: TP 393.0
Document code: A
DOI:10.12305/j.issn.1001-506X.2024.08.22
0 Introduction
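With the continuous development of information technology and the substantial growth in computing power, the Internet has penetrated every area of social life. Internet big data is diverse in type and complex in its relationships. Social media and e-commerce platforms, for example, have hundreds of millions of users whose interactions are intricate, with many types of entities and relations; rich data carries rich information. Sun et al. [1] first proposed the concept of the heterogeneous information network (HIN) to describe networks in which the number of node types or edge types is greater than 1. Research on HINs has since made considerable progress in clustering, classification, link prediction, ranking, recommendation, and information fusion [2], and has been widely applied to medical disease diagnosis [3], scientific collaboration [4], social media information mining [5], movie recommendation [6], gene-disease matching [7], and the identification of weak points in combat networks [8].
However, when an HIN is modeled, the sheer volume of data makes data quality hard to guarantee, and much information about nodes and relations is omitted, missing, or even wrong. In addition, data are hard to obtain: some interactions that exist in the real complex system cannot be captured in the network, which yields a network under incomplete information. Reconstructing erroneous relations and predicting unknown and future relations in a network is therefore of great practical value [9-11]. Link prediction, a popular research direction for HINs, has important theoretical and practical value. In theory, link prediction starts from existing observations, mines the missing information in a network, and predicts the network's structure, function, and evolutionary trends [12]. In practice, Internet big data is so large that the complete network structure is hard to obtain; link prediction can predict missing links and reconstruct the network from the available data, and can predict links in dynamic networks from historical timestamped data. Examples include product recommendation on e-commerce platforms, recommendation of potential friends on social platforms, and prediction of interactions between pieces of equipment in equipment networks.
As a popular topic in data mining, link prediction has been studied by many researchers in recent years. From 2019 onward, link prediction on HINs began to be studied extensively, with related techniques including machine learning and graph convolutional networks. Several authors have summarized link prediction on information networks from different perspectives. Lyu et al. [13] systematically summarized link prediction methods on homogeneous information networks from the perspective of network structure; Shi et al. [2], in their analysis of HINs, summarized node similarity computation, link prediction, and recommendation methods on HINs; Daud [14] reviewed link prediction methods on social networks along two dimensions, network type and application scenario; Kumar et al. [15] summarized link prediction techniques and applications and compared 29 topology-based link prediction methods and 8 methods based on network representation learning on 8 datasets. However, none of the existing surveys systematically summarizes the link prediction methods on HINs proposed in the last few years. Building on existing research and addressing the gaps in the above surveys, this paper summarizes link prediction methods on HINs in light of the latest results in this direction. Its main contributions are the following three points: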
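(1) Building on existing surveys, this paper systematically reviews and summarizes link prediction methods on heterogeneous information networks (HINs), and describes the link prediction problem on HINs in detail from two perspectives: supervised learning and unsupervised learning.
(2) Besides systematically reviewing link prediction methods on unweighted, undirected, static single-layer HINs, this paper separately summarizes link prediction methods on multilayer HINs and temporal HINs.
(3) This paper describes the applications of HINs in different fields, points out open problems in link prediction on HINs, and proposes possible future research directions for the field.
The paper is organized as follows. Section 1 models HINs, introduces the concepts related to HINs and link prediction, and describes how link prediction on HINs is evaluated. Section 2 introduces unsupervised link prediction methods from three perspectives: structural similarity, meta-path similarity, and feature vector similarity. Section 3 introduces supervised link prediction methods from three perspectives: feature engineering, meta-paths, and deep learning. Section 4 compares different types of HINs and summarizes the link prediction methods on each type. Section 5 presents application examples of link prediction on HINs. Section 6 concludes and points out possible future research directions.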
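1 Problem description of link prediction on heterogeneous information networks
1.1 Concepts related to heterogeneous information networks
Internet big data contains a vast number of entities and relations; the entity types are rich and varied and the relations are intricate. Modeling such data with the widely studied homogeneous information network abstracts the entities so heavily that their real-world characteristics and internal interactions cannot be faithfully reflected, so more and more researchers choose heterogeneous information networks (HINs) to model real systems [2,16].
An HIN is denoted $G=(V,E)$, with a node type mapping $\varphi: V \to A$ and an edge type mapping $\psi: E \to R$, where $|A|+|R|>2$, $|A|$ is the number of node types, and $|R|$ is the number of edge types. Each node $v \in V$ belongs to a particular node type $\varphi(v) \in A$, and each edge $e \in E$ belongs to a particular edge type $\psi(e) \in R$. Fig. 1 shows a typical scientific collaboration HIN extracted from DBLP (database systems and logic programming), with 4 node types and 4 edge types.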
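The definition above can be made concrete with a small data structure. The following is a minimal sketch, not taken from the surveyed work: the class name `HIN`, its methods, and the toy node and edge names are all hypothetical, and edges are assumed to be undirected.

```python
from collections import defaultdict


class HIN:
    """Toy heterogeneous information network G = (V, E) with type mappings."""

    def __init__(self):
        self.node_type = {}           # node type mapping: V -> A
        self.edge_type = {}           # edge type mapping: E -> R
        self.adj = defaultdict(set)   # undirected adjacency sets

    def add_node(self, v, a):
        self.node_type[v] = a

    def add_edge(self, u, v, r):
        # An undirected edge is stored as an unordered pair of endpoints.
        self.edge_type[frozenset((u, v))] = r
        self.adj[u].add(v)
        self.adj[v].add(u)

    def is_heterogeneous(self):
        """Check the condition |A| + |R| > 2 from the definition."""
        node_types = set(self.node_type.values())
        edge_types = set(self.edge_type.values())
        return len(node_types) + len(edge_types) > 2


# Hypothetical bibliographic toy network (all names are illustrative).
hin = HIN()
for node, kind in [("a1", "author"), ("a2", "author"),
                   ("p1", "paper"), ("v1", "venue")]:
    hin.add_node(node, kind)
hin.add_edge("a1", "p1", "writes")
hin.add_edge("a2", "p1", "writes")
hin.add_edge("p1", "v1", "published_in")

# A homogeneous network has one node type and one edge type: |A| + |R| = 2.
homo = HIN()
homo.add_node("x", "user")
homo.add_node("y", "user")
homo.add_edge("x", "y", "friend")
```

Keeping the two type mappings as separate dictionaries mirrors the definition directly, so the heterogeneity check is a one-line count of distinct node and edge types.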
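Definition 1  The network schema is the meta-template of a heterogeneous information network (HIN) $G=(V,E)$, generally denoted $T_G=(A,R)$, where $A$ denotes the set of entity types and $R$ the set of relation types.
The network schema can be regarded as an E-R model without node attribute information; it shows the number of node types and edge types in the network directly [17]. The network schema of the scientific collaboration HIN in Fig. 1 is shown in Fig. 2. Authors and papers are linked by write/written-by relations, papers and journals by publish/published-in relations, journals and fields by belong-to/contain relations, and papers are linked to each other by cite/cited-by relations.
Definition 2  A meta-path, generally denoted $P$, is a path defined on the network schema $T_G=(A,R)$, written $A_1 \xrightarrow{R_1} A_2 \xrightarrow{R_2} \cdots \xrightarrow{R_l} A_{l+1}$. It can also be expressed as a composite relation $R=R_1 \circ R_2 \circ \cdots \circ R_l$ between the node types $A_1, A_2, \ldots, A_{l+1}$, where $\circ$ denotes the composition operator on relations.
Three author-to-author meta-paths extracted from Fig. 1 are shown in Fig. 3(a), Fig. 3(b), and Fig. 3(c), where authors are denoted by A (author), papers by P (paper), and fields by V (venue). The meta-path A-write-P-write-A (APA) means that two authors co-authored a paper. The meta-path A-write-P-belong-V-belong-P-write-A (APVPA) means that two authors published papers in the same field. The meta-path A-write-P-cite/cited-by-P-write-A (APPA) means that a paper by one author cites, or is cited by, a paper by the other. The length of a meta-path is the number of relations between its two end nodes. The longer the meta-path, the weaker the relatedness between its endpoints, so the length is usually kept within a certain range. A meta-path may be asymmetric and may carry weight information; in the Douban rating network, for example, the rating score serves as the weight of the relation between a user and a movie.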
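To illustrate how a meta-path turns into a measurable quantity, the sketch below counts the path instances between two nodes that follow a given sequence of node types. It is an illustration under stated assumptions rather than a method from the surveyed papers: the function name `count_meta_path`, the toy graph, and the single-letter type codes are hypothetical, edges are undirected, and a path instance may revisit a node (as in matrix-product path counting).

```python
def count_meta_path(adj, node_type, source, target, meta_path):
    """Count path instances from source to target whose node types
    follow meta_path, e.g. ["A", "P", "A"] for author-paper-author."""
    if node_type[source] != meta_path[0]:
        return 0
    frontier = {source: 1}  # node -> number of partial paths ending there
    for wanted in meta_path[1:]:
        nxt = {}
        for node, count in frontier.items():
            for nb in adj[node]:
                if node_type[nb] == wanted:
                    nxt[nb] = nxt.get(nb, 0) + count
        frontier = nxt
    return frontier.get(target, 0)


# Hypothetical toy bibliographic network: a* authors, p* papers, v* venues.
edges = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p3"),
         ("p1", "v1"), ("p2", "v1"), ("p3", "v2")]
type_code = {"a": "A", "p": "P", "v": "V"}
adj, node_type = {}, {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)
    node_type[u] = type_code[u[0]]
    node_type[v] = type_code[v[0]]

apa_a1_a2 = count_meta_path(adj, node_type, "a1", "a2", ["A", "P", "A"])
apvpa_a1_a2 = count_meta_path(adj, node_type, "a1", "a2",
                              ["A", "P", "V", "P", "A"])
```

In this toy network `a1` and `a2` share one paper, so there is one APA instance between them, while two APVPA instances connect them through venue `v1`.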
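Definition 3  A meta-graph, also called a meta-structure and generally denoted $S$, is a directed acyclic graph defined on the network schema $T_G=(A,R)$ that has a single source node $n_s$ (with in-degree 0) and a single target node $n_t$ (with out-degree 0). Usually $S=(N,M,n_s,n_t)$, where $N$ is a set of nodes and $M$ is a set of edges; for any node $x \in N$, $x \in A$, and for any edge $(x,y) \in M$, $(x,y) \in R$.
A meta-graph is a subgraph of the heterogeneous information network (HIN). Fig. 4 shows a meta-graph extracted from Fig. 1, which expresses that papers written by two authors are published in the same journal and are also cited by the same author. Meta-graphs can aggregate different meta-paths and have clear advantages in entity resolution, ranking, and clustering tasks [18].
An HIN is a network whose number of node types or relation types is greater than 1, but in which only a single type of link exists between any two nodes; it differs fundamentally in structure from multi-relational networks and multiplex heterogeneous networks. In a multi-relational network there is a single node type, but more than one type of link between any two nodes; in a multiplex heterogeneous network there is more than one node type and also more than one type of link between any two nodes [19]. Complex network research focuses mainly on the structure, characteristics, and function of networks, whereas HIN research focuses mainly on discovering useful information in the network [20]. The classification is shown in Table 1.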
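1.2 Concepts related to link prediction
Link prediction estimates, from the known nodes and structural information of a network, the likelihood that an edge exists between two nodes. This covers identifying spurious edges and completing unknown edges as well as predicting future edges. Current link prediction methods fall mainly into supervised learning methods and unsupervised learning methods.
Lyu [21] gave the definition of link prediction on undirected networks.
Definition 4  Link prediction on an undirected network: given an undirected network $G(V,E)$ with $N$ nodes and $M$ edges, the network has $N(N-1)/2$ node pairs in total, which form the universal set $U$. Each unconnected node pair $(x,y) \in U \setminus E$ is assigned a score $S_{xy}$, all unconnected node pairs are then sorted by this score in descending order, and the pairs ranked at the top are the most likely to be linked.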
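The scoring-and-ranking procedure in Definition 4 can be sketched in a few lines. The score used here, the number of common neighbors, is only one possible choice of $S_{xy}$ and is an assumption of this sketch; the function name `rank_non_edges` and the toy graph are hypothetical.

```python
from itertools import combinations


def rank_non_edges(nodes, edges):
    """Score every unconnected pair (x, y) in U \\ E by its number of
    common neighbors and return the pairs sorted by descending score."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    scored = []
    for x, y in combinations(sorted(nodes), 2):  # all N(N-1)/2 pairs
        if y not in adj[x]:                      # keep only non-edges
            scored.append(((x, y), len(adj[x] & adj[y])))
    # Python's sort is stable, so ties keep their lexicographic order.
    scored.sort(key=lambda item: -item[1])
    return scored


nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
ranked = rank_non_edges(nodes, edges)
```

For this graph the pairs `("a", "d")` and `("b", "c")` each share two neighbors and are ranked first, while `("a", "e")` shares none and is ranked last.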
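Yang et al. [22] gave the definition of few-shot link prediction on dynamic networks.
Definition 5  Link prediction on a dynamic network: given a dynamic network $G=(V,E)$ and a timestamp $t$, let $V_{\mathrm{new}} \subset V$ denote the set of nodes that appear after timestamp $t$. Given all nodes that exist before timestamp $t$ and their edges, the task is to predict the future links of each node $v \in V_{\mathrm{new}}$ from its first $K$ links.
With the continuous development of artificial intelligence, more and more researchers combine link prediction with supervised learning, treating link prediction as a binary classification problem in which the sample label is "link" or "no link" and the label is predicted from representations obtained by network representation learning.
Link prediction is not the same as recommendation. The relationship between the two is shown as a Venn diagram in Fig. 5, where the overlap corresponds to predicting future links on a special heterogeneous information network (HIN), namely a bipartite network. Some researchers study recommendation methods on HINs with affiliation relations [23]. Knowledge graph completion also involves link prediction techniques, including methods based on translation models [24-26] and on semantic matching models [27-29], but link prediction methods designed for knowledge graphs are not applicable to HINs. From the application perspective, a knowledge graph is a multi-relational network. An HIN in the usual sense is a simple HIN, studied under the assumption that only one relation exists between any two nodes, whereas multiple relations can exist between any two nodes of a knowledge graph. Knowledge graphs usually take triples of head entity, relation, and tail entity as the object of study and are used for semantic search [30], knowledge-based question answering and reasoning [31], and natural language processing; their main techniques include entity extraction and entity disambiguation. Link prediction is a means of constructing knowledge graphs, whereas HINs mainly serve data mining tasks such as community detection [32], clustering [33], and link prediction [34].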
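1.3 Evaluation of link prediction on heterogeneous information networks
Metrics for the accuracy of link prediction algorithms include precision, accuracy, recall, the area under the curve (AUC), and the receiver operating characteristic (ROC) curve; these metrics emphasize different aspects of predictive performance. Computing them starts from the confusion matrix, shown in Table 2. The horizontal axis of the confusion matrix indicates whether a sample is predicted to be positive, P (positive), or negative, N (negative), and the vertical axis indicates whether the prediction is correct, T (true), or wrong, F (false). FN is the number of samples that are actually positive but predicted negative; FP is the number of samples that are actually negative but predicted positive; TN is the number of samples that are actually negative and predicted negative; TP is the number of samples that are actually positive and predicted positive.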
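As a worked example of the confusion matrix, the sketch below counts TP, FP, TN, and FN for binary link labels and derives precision, recall, and accuracy from them using their standard definitions. The function names and the sample labels are hypothetical; label 1 is assumed to mean "link" and 0 "no link".

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels, 1 = link, 0 = no link."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def precision(tp, fp):
    # Share of predicted links that are real links.
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    # Share of real links that were predicted.
    return tp / (tp + fn) if tp + fn else 0.0


def accuracy(tp, fp, tn, fn):
    # Share of all samples that were classified correctly.
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
```

Here two of the three real links are found and one non-link is wrongly predicted, so precision and recall are both 2/3 while accuracy is 0.75. This gap is why accuracy alone says little when non-links dominate.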
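Because link prediction is a class-imbalanced prediction problem, it needs evaluation metrics suited to class imbalance. The precision-recall (PR) curve shows the trade-off between precision and recall at different thresholds and is a useful measure of prediction success when the classes are highly imbalanced. A large area under the PR curve indicates both high recall and high precision, where high precision corresponds to a low false positive rate and high recall corresponds to a low false negative rate. High precision means that most of the results returned by the classifier are correct, and high recall means that the classifier returns most of the truly positive results.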
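The ROC curve is shown in Fig. 6. Its horizontal axis is the false positive rate (FPR), defined as
$$\mathrm{FPR}=\frac{FP}{FP+TN} \tag{4}$$
Its vertical axis is the true positive rate (TPR), defined as
$$\mathrm{TPR}=\frac{TP}{TP+FN} \tag{5}$$
Each point on the curve corresponds to the (FPR, TPR) pair obtained at one decision threshold. The closer the ROC curve is to the point (0,1) and the further it deviates from the 45° diagonal, the better the prediction. The AUC is the area under the ROC curve; the larger the AUC, the better the prediction. The AUC can be interpreted as the probability that, for a randomly chosen positive sample and a randomly chosen negative sample, the predicted score of the positive sample is higher than that of the negative sample.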
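The probabilistic reading of the AUC and the definitions of FPR and TPR can be checked numerically. The sketch below is illustrative only: the function names and scores are hypothetical, ties between a positive and a negative score are assumed to count as one half, and a sample is assumed to be predicted positive when its score is at least the threshold.

```python
def auc_pairwise(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def fpr_tpr(pos_scores, neg_scores, threshold):
    """One ROC point: FPR = FP / (FP + TN), TPR = TP / (TP + FN)."""
    tp = sum(1 for s in pos_scores if s >= threshold)
    fp = sum(1 for s in neg_scores if s >= threshold)
    fn = len(pos_scores) - tp
    tn = len(neg_scores) - fp
    return fp / (fp + tn), tp / (tp + fn)


pos_scores = [0.9, 0.8, 0.4]        # scores of node pairs that are linked
neg_scores = [0.7, 0.3, 0.2, 0.1]   # scores of node pairs that are not
auc = auc_pairwise(pos_scores, neg_scores)
roc_point = fpr_tpr(pos_scores, neg_scores, threshold=0.5)
```

Of the 12 positive-negative pairs, only the pair (0.4, 0.7) is ordered wrongly, giving an AUC of 11/12; at threshold 0.5 the ROC point is (0.25, 2/3).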
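Depending on whether the samples are labeled, methods for link prediction fall into two classes: unsupervised learning methods and supervised learning methods. Unsupervised methods design a similarity index between nodes from the properties of the network itself, compute and rank the similarity of node pairs, and predict the links between node pairs accordingly; the key is designing a sound similarity index for node pairs. Supervised methods treat link prediction as a binary classification problem; the key is obtaining node embeddings. The two classes are summarized in turn below, followed by separate summaries of link prediction methods for multilayer networks and temporal networks.
2 Unsupervised learning-based link prediction methods on heterogeneous information networks
Unsupervised learning uses the information contained in unlabeled samples to infer sample labels; common unsupervised learning tasks are clustering and dimensionality reduction [35]. Similarity-based link prediction is also a form of unsupervised learning: a similarity index is defined, node pairs are ranked by their similarity, and higher-ranked node pairs are more likely to form links.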
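2.1 Link prediction methods based on structural similarity
Link prediction methods based on structural similarity are mainly methods for homogeneous information networks, where they already achieve good predictive performance. Applied directly to heterogeneous information networks (HINs), their accuracy drops markedly, because these methods generally consider only the structural information of the network and ignore its semantic information. Similarity-based link prediction is a traditional approach; before 2014, link prediction research concentrated on unsupervised learning methods. According to the summary in [2], traditional link prediction methods include PathPredict [36] and CREATE [37], and link prediction methods applied to recommendation include HeteRecom [38], SemRec [39], methods based on meta-path latent features [40-41], and methods based on hybrid weighting of multiple relations [42]. Daud [14] and Kumar et al. [15] divide similarity-based link prediction methods into global, local, and quasi-local indices; Lyu et al. [13] divide similarity-based indices into those based on local information [43-53], those based on paths [54-56], and those based on random walks [57-61]. Indices based on local information compute similarity from a node's local information only and are suited to large-scale networks; examples are the common neighbors index, the preferential attachment index, and the local naive Bayes index. Path-based indices compute similarity from multi-order neighbors or from the whole network, which can improve prediction accuracy; examples are the local path index, the Katz index, and the LHN-II index. Random-walk-based indices are defined through random walk processes; examples are the global random walk and local random walk similarity indices.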
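2.2 Link prediction methods based on meta-path similarity on heterogeneous information networks
To remedy the neglect of semantic information by link prediction methods designed for homogeneous information networks, many researchers have drawn on the concept of the meta-path and proposed link prediction methods based on meta-path similarity, improving link prediction accuracy on heterogeneous information networks (HINs). Extracting meta-paths with the rich domain knowledge of the field under study lets the similarity between nodes reflect not only topological similarity but also similarity in the actual semantic context. Shi et al. [2] have already summarized, in their survey, the main work before 2016 that uses meta-paths for link prediction on HINs, treating similarity-based methods and methods applied to recommendation separately [61-68]. In addition, some researchers [69-74] proposed meta-path-based similarity measures that judge whether nodes are linked from the value of a similarity function between them. Others, taking the circumstances of different disciplines into account, applied meta-path-based similarity measures to link prediction tasks in investment, military, and academic collaboration settings [75-81].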
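One well-known measure of this kind is PathSim [17], which normalizes the number of meta-path instances between two nodes by the number of instances from each node back to itself. The sketch below specializes it to the symmetric author-paper-author meta-path on an unweighted network, where these counts reduce to set sizes; the function name, the `papers_of` mapping, and the data are hypothetical, and the sketch is a simplified illustration rather than the reference implementation.

```python
def pathsim_apa(papers_of, x, y):
    """PathSim-style similarity under the author-paper-author meta-path:
    s(x, y) = 2 * |paths x~y| / (|paths x~x| + |paths y~y|).
    For unweighted authorship, |paths x~y| is the number of shared papers
    and |paths x~x| is the number of papers written by x."""
    shared = len(papers_of[x] & papers_of[y])
    denom = len(papers_of[x]) + len(papers_of[y])
    return 2 * shared / denom if denom else 0.0


# Hypothetical authorship data: author -> set of papers.
papers_of = {
    "a1": {"p1", "p2"},
    "a2": {"p1", "p2", "p3"},
    "a3": {"p3"},
}
sim_a1_a2 = pathsim_apa(papers_of, "a1", "a2")
sim_a2_a3 = pathsim_apa(papers_of, "a2", "a3")
```

The normalization keeps a prolific author from appearing similar to everyone: `a2` shares a paper with `a3`, but their similarity (0.5) is lower than that of `a1` and `a2` (0.8), who share most of their output.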
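2.3 Link prediction methods based on feature vector similarity on heterogeneous information networks
With the exponential growth of data, network sparsity has become a difficulty for mining tasks on heterogeneous information networks (HINs), and, given the limit on meta-path length, meta-path-based unsupervised methods struggle with large HIN data. With the wide adoption of artificial intelligence methods, link prediction based on feature vector similarity can avoid the difficulty of capturing topological and semantic features that sparsity causes. These methods obtain node feature vectors through network representation learning and compute node similarity with a similarity function. For example, Wang et al. [82] decompose the target heterogeneous network into three parts, an influence network, a similarity network, and information tunnels (IT), and compute more accurate scores by exchanging similarity indices and influence scores; Fan et al. [83] propose a joint link prediction index that considers both the influence of node types and structural similarity.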
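3 Supervised learning-based link prediction methods on heterogeneous information networks
Supervised learning uses the information contained in labeled samples to infer the labels of unlabeled samples; common supervised learning tasks are classification and regression [35]. In link prediction the labels are binary, "link" or "no link", and machine learning or deep learning methods learn from the labeled samples to predict the labels of unlabeled ones.
3.1 Feature engineering-based link prediction methods on heterogeneous information networks
Link prediction based on traditional supervised learning obtains, through feature engineering, feature vectors for the downstream link prediction task and predicts sample labels from them; it is a typical class of supervised link prediction methods [84-87]. Both this section and the feature-vector-based methods of Section 2.3 obtain node feature vectors by combining domain knowledge with a purpose-built representation learning framework that produces node embeddings. The difference is that Section 2.3 computes the similarity between the feature vectors of the two nodes in a pair and predicts links by ranking these similarities, which is a typical unsupervised approach, whereas this section turns the embeddings of the two nodes in a pair into a classification sample and predicts links by predicting the sample's label, which is a typical supervised approach. For example, Guan et al. [88] use hypergraphs to learn higher-order relations in the network; Ji et al. [89] use importance sampling to accelerate representation learning on large-scale heterogeneous information networks. Some researchers propose network representation learning frameworks that extract network features for the downstream link prediction task [90-99], and others propose such frameworks for feature engineering in order to obtain node feature vectors [100-117]. For example, [118] learns node embeddings on bipartite graphs, and [119] applies a relation-aware heterogeneous graph transformation model to capture the heterogeneous information in the network.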
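The step of turning node-pair embeddings into classification samples can be sketched as follows. This is an assumption-laden illustration, not the procedure of any cited paper: the element-wise (Hadamard) product is just one common way to combine two embeddings, negatives are drawn uniformly from non-edges, and the function names, embeddings, and edges are hypothetical.

```python
import random


def hadamard(u, v):
    """Element-wise product of two embedding vectors (one possible
    pair-feature operator; others such as concatenation also work)."""
    return [a * b for a, b in zip(u, v)]


def build_samples(embeddings, edges, num_negative, seed=0):
    """Build (feature, label) samples: label 1 for observed links and
    label 0 for randomly sampled unconnected node pairs."""
    rng = random.Random(seed)
    nodes = sorted(embeddings)
    edge_set = {frozenset(e) for e in edges}
    samples = [(hadamard(embeddings[u], embeddings[v]), 1) for u, v in edges]
    non_edges = [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]
                 if frozenset((u, v)) not in edge_set]
    k = min(num_negative, len(non_edges))
    for u, v in rng.sample(non_edges, k):
        samples.append((hadamard(embeddings[u], embeddings[v]), 0))
    return samples


# Hypothetical 2-dimensional node embeddings and observed links.
embeddings = {"a": [1.0, 2.0], "b": [3.0, 4.0],
              "c": [0.5, 0.0], "d": [2.0, 2.0]}
edges = [("a", "b"), ("c", "d")]
samples = build_samples(embeddings, edges, num_negative=2)
```

The resulting list can be passed to any binary classifier; sampling as many negatives as positives is one simple way to limit the class imbalance discussed in Section 1.3.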
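3.2 Meta-path-based supervised link prediction methods on heterogeneous information networks
In recent years, more and more researchers have combined network representation learning with meta-paths to learn feature vectors of nodes, edges, or meta-paths for tasks such as link prediction. Both this section and the meta-path-similarity methods of Section 2.2 use meta-paths to learn node representations. The difference is that Section 2.2 performs link prediction by computing the similarity between the feature vectors of the two nodes in a pair, which is a typical unsupervised approach, whereas this section turns the embeddings of the two nodes in a pair into a classification sample, which is a typical supervised approach. For example, Fu et al. [120] propose HIN2Vec, which learns feature vectors of nodes and meta-paths; Chen et al. [121] propose MULRP, which performs link prediction through multi-label learning; Fu et al. [122] propose MEGAE, which applies a meta-path-based graph attention mechanism to learn feature vectors; Gao et al. [123] propose mcRNN, which learns and fuses the features of nodes and meta-paths more efficiently.
Some researchers take differences in the importance of relations into account and perform link prediction with weighted meta-paths. For example, Mohdeb et al. [124] propose WMPLP, which combines weighted meta-paths with random walks to perform link prediction, and He et al. [125] propose PSR-vec, which recommends patent transactions based on weighted meta-paths. Other researchers, noting that obtaining accurate meta-paths requires rich domain knowledge, perform link prediction with automatically extracted meta-paths. For example, Cao et al. [126] propose LiPaP, which automatically extracts meta-paths from structurally rich heterogeneous information networks to address the difficulty of obtaining meta-paths; Liang et al. [127] design an automatic meta-path generation mechanism to obtain node vector representations.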
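3.3 Deep learning-based link prediction methods on heterogeneous information networks
Obtaining feature vectors for the downstream link prediction task through feature engineering requires expertise in the domain modeled by the heterogeneous information network and, for large-scale networks, substantial computing power for feature extraction. Network representation learning methods can instead learn network features automatically and apply them to the downstream link prediction task. Examples include shallow models such as DeepWalk, which obtains node embeddings through truncated random walks; the semantics-aware model metapath2vec [128], which obtains embeddings that account for the heterogeneity of nodes and edges; and the content-aware model ASNE [129], which learns network structure information and node attribute information jointly. Although these models obtain latent node embeddings directly, they do not preserve the rich neighborhood information of nodes. Graph neural networks (GNN) can aggregate this neighborhood information and obtain node embeddings that reflect both network structure and semantics [130-133].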
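The random-walk step behind such shallow models can be illustrated with a truncated walk that is constrained by a meta-path, in the spirit of metapath2vec [128]. This sketch covers walk generation only, not the embedding training that follows it, and it is a simplified illustration under assumptions: the function name, the toy graph, the uniform choice among admissible neighbors, and the restriction to meta-paths whose first and last types coincide are all choices made here.

```python
import random


def metapath_walk(adj, node_type, start, meta_path, walk_length, rng):
    """Generate one truncated random walk whose node types repeat the
    pattern of a symmetric meta-path such as ["A", "P", "A"]."""
    if meta_path[0] != meta_path[-1]:
        raise ValueError("meta-path must start and end with the same type")
    if node_type[start] != meta_path[0]:
        raise ValueError("start node must have the first meta-path type")
    pattern = meta_path[:-1]  # the last type coincides with the first
    walk = [start]
    while len(walk) < walk_length:
        wanted = pattern[len(walk) % len(pattern)]
        candidates = [nb for nb in adj[walk[-1]] if node_type[nb] == wanted]
        if not candidates:    # dead end: stop the walk early
            break
        walk.append(rng.choice(candidates))
    return walk


# Hypothetical toy bibliographic network: a* authors, p* papers, v* venues.
edges = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p3"),
         ("p1", "v1"), ("p2", "v1"), ("p3", "v2")]
type_code = {"a": "A", "p": "P", "v": "V"}
adj, node_type = {}, {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)
    node_type[u] = type_code[u[0]]
    node_type[v] = type_code[v[0]]

example_walk = metapath_walk(adj, node_type, "a1", ["A", "P", "A"],
                             walk_length=7, rng=random.Random(0))
```

Walks produced this way alternate between authors and papers and never step onto a venue, which is how the meta-path injects semantics into the node sequences that a skip-gram style model would later consume.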
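A recurrent neural network is one in which the current output for an input sequence depends on previous outputs, that is, the network has memory; it is suited to learning node embeddings in networks that change dynamically over time, such as temporal networks. The most widely used recurrent architecture is the long short-term memory (LSTM) network, which removes or adds information through gate structures. Representation learning frameworks that use recurrent neural networks include Sentic LSTM [134] and HA-LSTM [135].
The graph attention network (GAT) adds a self-attention layer to address a shortcoming of graph convolutional networks, which treat all neighbors as contributing the same amount of information: it assigns different weights to different neighbors according to their features and aggregates them to form the embedding of the target node [136-139]. For example, Huang et al. [140] propose the AMVAE representation learning method, which uses an attention-based LSTM to learn the connections between different data modalities; Zhu et al. [141] propose the SESGAT representation learning framework, a social network link prediction method based on semantic subgraphs and the graph attention mechanism.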
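The idea of weighting neighbors unequally can be shown with a stripped-down attention step. This is a sketch under simplifying assumptions, not the graph attention layer itself: a real layer uses learned weight matrices and a learned scoring function, whereas here the score is assumed to be the plain dot product between the target and neighbor feature vectors, and the function name and data are hypothetical.

```python
import math


def attention_aggregate(target, neighbors):
    """Weight each neighbor by a softmax over dot-product scores and
    return (weights, weighted sum of the neighbor feature vectors)."""
    scores = [sum(a * b for a, b in zip(target, h)) for h in neighbors]
    top = max(scores)                          # subtract max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]        # softmax: weights sum to 1
    aggregated = [sum(w * h[i] for w, h in zip(weights, neighbors))
                  for i in range(len(target))]
    return weights, aggregated


target = [1.0, 0.0]                    # features of the target node
neighbors = [[1.0, 0.0], [0.0, 1.0]]   # features of its two neighbors
weights, aggregated = attention_aggregate(target, neighbors)
```

The neighbor whose features align with the target receives the larger weight, in contrast to a plain mean, which would weight both neighbors by 0.5.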
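The graph convolutional network (GCN) introduces convolution modules and gating mechanisms on top of the graph neural network. It can encode both the graph structure and the node features, combining network structure with node attribute features to obtain node embeddings [142-145]. For example, Fu et al. [146] propose the MVGCN representation learning method, which builds a multi-view heterogeneous network, applies self-supervised learning to obtain node attributes that serve as the initial embeddings, and then designs a neighborhood aggregation layer to update the node embeddings iteratively.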
4 Link prediction methods on special heterogeneous information networks
Most link prediction research on heterogeneous information networks (HINs) is carried out on unweighted undirected networks, but HINs also include directed, weighted, multilayer, and temporal HINs. This section first introduces the different types of HINs and then summarizes the link prediction methods on them.
4.1 Different types of heterogeneous information networks
HINs can be divided into directed, weighted, multilayer, and temporal HINs; when direction and weight are not mentioned, the edges are assumed to be undirected and unweighted. Most papers on link prediction in HINs study one of these single network types, and a few study composite types that combine two or more of them. For example, [147-148] study directed, weighted temporal social networks, and [149-151] study weighted temporal networks. So far, no work in this area combines multilayer HINs with one or two of temporal, weighted, or directed HINs, as shown in Table 3. In the table, "√" means that link prediction on that type of network has been studied and "×" means that it has not yet been studied. Because the table is symmetric, only its upper-right part is filled in; "-" means no annotation.
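4.2 Link prediction methods on special types of heterogeneous information networks
An undirected unweighted heterogeneous information network (HIN), whose edges have no direction, is the simplest kind of HIN. A directed HIN is one whose edges have direction. To model real networks more effectively, many researchers study link prediction by constructing directed networks [152] and weighted networks [153-154].
Next, link prediction research on two typical special types of HIN, multilayer networks and temporal networks, is summarized in detail. In multilayer link prediction, the multilayer network usually means partially aligned networks, and the task is to match the same entity across different databases, also known as anchor link prediction; it can aggregate multi-source information and effectively alleviate the cold-start problem in recommendation. Link prediction methods on multilayer HINs can be divided into supervised and unsupervised learning methods. Unsupervised methods generally use meta-paths to capture the topological and semantic features of the network, where meta-paths are divided into intra-network meta-paths and inter-network meta-paths [155-156]. In supervised methods, network representation learning can operate both within a network and across networks [157-159]. For example, Li et al. [160] propose the TALP representation learning framework, a type-aware anchor link prediction method based on graph attention networks.
A temporal network is one in which nodes join or new links form over time. The feature vector of a node is determined not only by the network structure and semantics at the current timestamp but also by those at past timestamps. For example, Yin et al. [161] propose DHNE, which combines the historical and current networks to learn node embeddings and uses random walks to capture node semantics; Zhao et al. [162] propose the DHBN model, which applies graph network embedding to fuse multilayer network information with the evolution patterns of dynamic networks.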
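5 Applications of link prediction on heterogeneous information networks
As Internet big data applications reach into every field, heterogeneous information networks (HINs) can describe and model real-world systems. Taking social networks, protein-disease networks, scientific collaboration networks, and combat information networks as examples, this section introduces the HINs in these fields and lists some data sources.
Social networks are typical HINs. In Sina Weibo, for example, users and their posts are the nodes; users may be linked by follow, like, or messaging relations, and users and posts by view or like relations. Recommending posts that a user may be interested in is a link prediction problem on this network. As online social platforms develop, social networks have diversified: besides traditional chat and review sites such as WeChat, Weibo, and Douban, they include job-hunting, matchmaking, and dating sites. Every day large numbers of users join social networks and form links of different types with existing nodes, while new links keep forming between existing nodes. Predicting these links improves the efficiency of social interaction and brings great commercial value to online social platforms. Social network data commonly used in HIN research include data crawled from Twitter, WeChat, Sina Weibo, LinkedIn, and Douban. In addition, location-based social link prediction often uses datasets with location information, such as the Foursquare dataset [163] and the Yelp dataset.
Protein-disease networks mainly include miRNA-disease HINs, disease-drug HINs, and drug-label HINs. miRNA-disease networks are used to study the associations between miRNAs and diseases and to identify the disease types that a particular miRNA may cause; disease-drug and drug-label networks are used mainly in drug repurposing research. Link prediction on all three types of network amounts to mining links that already exist but are unknown. miRNA-disease relations come mainly from the miR2Disease database. For drug-disease relations, drug information comes from DrugBank, covering 593 drugs, and human disease information comes from the OMIM database, covering 313 human diseases; information on drug side effects, target profiles, and ligand binding sites comes from UniProt and PDB. Drug-label relations come mainly from KEGG BRITE, the enzyme database BRENDA, and SuperTarget, a database of drug-target relations.
Scientific collaboration networks are among the most common HINs. Their node types consist of two or more of authors, papers, research fields, and journals or conferences; links between authors can be citation or collaboration relations, and links between authors and papers can be view or write relations. Link prediction tasks on scientific collaboration networks mainly include collaboration prediction, citation recommendation, and literature discovery [6,164-165]. The main scholarly datasets used include DBLP, an integrated database of English-language computer science literature; the literature search databases Web of Science, CNKI, and PubMed; the preprint server arXiv; AMiner, a platform for mining and serving scientific and technological intelligence big data; and Microsoft Academic Graph.
A combat network generally refers to an equipment network, also called a heterogeneous combat network. Its nodes are weapons and equipment of different kinds, such as reconnaissance, strike, and decision equipment, and its links can be communication relations or command-and-control relations between pieces of equipment [166-169]. As the warfare paradigm shifts from network-centric warfare to decision-centric warfare, the concept of mosaic warfare has been widely studied and applied. Mosaic warfare creates interactions between previously isolated weapons and equipment and makes the network structure more complex. Because of high-technology means such as concealment and deception, obtaining all the links of the network becomes harder, so the observed network is often partial, and the links that were not obtained must be predicted from the known ones.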
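6 Conclusion
Building on previous research, this paper summarizes the link prediction problem on heterogeneous information networks (HINs), reviews link prediction methods from the two perspectives of supervised and unsupervised learning, and summarizes the methods on different types of HINs. With the rapid development of Internet technology, Internet big data has become a data source for research in many fields; diversity is one of its key characteristics and the basis for the rich information it provides. HINs are an effective way to describe Internet big data, and link prediction on them has been widely studied, yet many problems remain open and there is considerable room for improvement. First, typical HINs such as social networks and protein-disease networks have vast numbers of nodes; making link prediction methods work on networks of this scale, so that they solve real problems, remains difficult. Second, large-scale networks bring sparsity, imbalanced datasets in which negative samples outnumber positive ones, and the cold-start problem in recommender systems, all of which remain to be solved. Third, most current link prediction methods on HINs assume static, undirected, unweighted, single-layer networks, but the vast majority of real networks are weighted and directed and are also dynamic and multilayer; abstracting them into static, undirected, unweighted, single-layer networks makes it hard to solve practical problems effectively. Finally, research on dynamic HINs generally considers the arrival of nodes or the formation of new edges, and few works study how changes in edge weights over time affect link prediction. Given these open problems, link prediction on HINs will remain an active research topic, and it needs researchers from more fields to improve the usability and robustness of the methods so that they can solve real problems.
References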
[1]SUN Y Z, HAN J W, ZHAO P X, et al. RankClus: integrating clustering with ranking for heterogeneous information network analysis[C]∥Proc.of the 12th International Conference on Extending Database Technology: Advances in Database Technology, 2009: 565-576.
[2]SHI C, LI Y T, ZHANG J W, et al. A survey of heterogeneous information network analysis[J]. IEEE Trans.on Knowledge and Data Engineering, 2017, 29(1): 17-37.
[3]SINGH-BLOM U M, NATARAJAN N, TEWARI A, et al. Prediction and validation of gene-disease associations using methods inspired by social network analyses[J]. Plos One, 2013, 8(5): e58977.
[4]SEBASTIAN Y, SIEW E G, ORIMAYE S O. Predicting future links between disjoint research areas using heterogeneous bibliographic information network[C]∥Proc.of the 19th Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2015: 610-621.
[5]SAJADMANESH S, RABIEE H R, KHODADADI A. Predicting anchor links between heterogeneous social networks[C]∥Proc.of the 8th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2016: 158-163.
[6]LI J C, YIN Y, FORTUNATO S, et al. Scientific elite revisited: patterns of productivity, collaboration, authorship and impact[J]. Journal of the Royal Society Interface, 2020, 17(165): 20200135.
[7]LU M L, YE D N. HIN_DRL: a random walk based dynamic network representation learning method for heterogeneous information networks[J]. Expert Systems with Applications, 2020, 158: 113427.
[8]LI J C, GE B F, YANG K W, et al. Meta-path based heterogeneous combat network link prediction[J]. Physica A: Statistical Mechanics and Its Applications, 2017, 482: 507-523.
[9]AGGARWAL C C, XIE Y, YU P S. A framework for dynamic link prediction in heterogeneous networks[J]. Statistical Analysis and Data Mining, 2014, 7(1): 14-33.
[10]PEIXOTO T P. Reconstructing networks with unknown and heterogeneous errors[J]. Physical Review X, 2018, 8(4): 041011.
[11]FRANZONI V, MILANI A. Structural and semantic proximity in information networks[C]∥Proc.of the 17th International Conference on Computational Science and its Applications, 2017: 651-666.
[12]ZHOU T, ZHANG Z K, CHEN G R, et al. The opportunities and challenges of complex networks research[J]. Journal of University of Electronic Science and Technology of China, 2014, 43(1): 1-5. (in Chinese)
[13]LYU L Y, ZHOU T. Link prediction[M]. Beijing: Higher Education Press, 2013. (in Chinese)
[14]DAUD N N. Applications of link prediction in social networks: a review[J]. Journal of Network and Computer Applications, 2020, 166: 102716.
[15]KUMAR A, SINGH S S, SINGH K, et al. Link prediction techniques, applications, and performance: a survey[J]. Physica A: Statistical Mechanics and its Applications, 2020, 553: 124289.
[16]SUN Y Z, HAN J W. Mining heterogeneous information networks: a structural analysis approach[J]. ACM SIGKDD Explorations Newsletter, 2013, 14(2): 20-28.
[17]SUN Y Z, HAN J W, YAN X F, et al. PathSim: meta path-based top-k similarity search in heterogeneous information networks[C]∥Proc.of the VLDB Endowment, 2011: 992-1003.
[18]ZHANG J L, JIANG Z L, CHEN Z, et al. WMGCN: weighted meta-graph based graph convolutional networks for representation learning in heterogeneous networks[J]. IEEE Access, 2020, 8: 40744-40754.
[19]MA Y J, JIANG H M. NinimHMDA: neural integration of neighborhood information on a multiplex heterogeneous network for multiple types of human microbe-disease association[J]. Bioinformatics, 2021, 36(24): 5665-5671.
[20]LI J C. Key technologies of heterogeneous information networks data mining[D]. Changsha: National University of Defense Technology, 2019. (in Chinese)
[21]LYU L Y. Link prediction on complex networks[J]. Journal of University of Electronic Science and Technology of China, 2010, 39(5): 651-661. (in Chinese)
[22]YANG C, WANG C C, LU Y F, et al. Few-shot link prediction in dynamic networks[C]∥Proc.of the 15th ACM International Conference on Web Search and Data Mining, 2022: 1245-1255.
[23]JIANG S H, DING Z M, FU Y. Heterogeneous recommendation via deep low-rank sparse collective factorization[J]. IEEE Trans.on Pattern Analysis and Machine Intelligence, 2020, 42(5): 1097-1111.
[24]WANG Z, ZHANG J W, FENG J L, et al. Knowledge graph embedding by translating on hyperplanes[C]∥Proc.of the 28th AAAI Conference on Artificial Intelligence, 2014: 1112-1119.
[25]LIN Y K, LIU Z Y, SUN M S, et al. Learning entity and relation embeddings for knowledge graph completion[C]∥ Proc.of the 29th AAAI Conference on Artificial Intelligence, 2015: 2181-2187.
[26]JI G L, HE S Z, XU L H, et al. Knowledge graph embedding via dynamic mapping matrix[C]∥Proc.of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, 2015: 687-696.
[27]TROUILLON T, WELBL J, RIEDEL S, et al. Complex embeddings for simple link prediction[C]∥Proc.of the International Conference on Machine Learning, 2016: 2071-2080.
[28]KAZEMI S M, POOLE D. Simple embedding for link prediction in knowledge graphs[C]∥ Proc.of the Advances in Neural Information Processing Systems, 2018: 31.
[29]NICKEL M, ROSASCO L, POGGIO T. Holographic embeddings of knowledge graphs[C]∥Proc.of the 30th AAAI Conference on Artificial Intelligence, 2016: 1955-1961.
[30]ZHANG Y G, ZHANG C S, ZHANG D. Distance metric learning by knowledge embedding[J]. Pattern Recognition, 2004, 37(1): 161-163.
[31]HAO Y C, ZHANG Y Z, LIU K, et al. An end-to-end model for question answering over knowledge base with cross-attention combining global knowledge[C]∥Proc.of the 55th Annual Meeting of the Association for Computational Linguistics, 2017: 221-231.
[32]GUPTA S, KUMAR P. Community detection in heterogenous networks using incremental seed expansion[C]∥Proc.of the 3rd International Conference on Data Science and Engineering, 2016: 184-188.
[33]REN X, LIU J L, YU X, et al. ClusCite: effective citation recommendation by information network-based clustering[C]∥Proc.of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014: 821-830.
[34]ABABIO I B, CHEN J X, CHEN Y, et al. Link prediction based on heuristics and graph attention[C]∥Proc.of the 8th IEEE International Conference on Big Data, 2020: 5428-5434.
[35]LIU J W, LIU Y, LUO X L. Semi-supervised learning methods[J]. Chinese Journal of Computers, 2015, 38(8): 1592-1617. (in Chinese)
[36]SUN Y Z, BARBER R, GUPTA M, et al. Co-author relationship prediction in heterogeneous bibliographic networks[C]∥Proc.of the International Conference on Advances in Social Networks Analysis and Mining, 2011: 121-128.
[37]ZHANG J W, YU P S, LYU Y H, et al. Organizational chart inference[C]∥Proc.of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2015: 1435-1444.
[38]SHI C, ZHOU C, KONG X N, et al. HeteRecom: a semantic-based recommendation system in heterogeneous networks[C]∥Proc.of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012: 1552-1555.
[39]SHI C, ZHANG Z Q, LUO P, et al. Semantic path based personalized recommendation on weighted heterogeneous information networks[C]∥Proc.of the 24th ACM International on Conference on Information and Knowledge Management, 2015: 453-462.
[40]YU X, REN X, SUN Y Z, et al. Recommendation in heterogeneous information networks with implicit user feedback[C]∥Proc.of the 7th ACM Conference on Recommender Systems, 2013: 347-350.
[41]YU X, REN X, SUN Y Z, et al. Personalized entity recommendation: a heterogeneous information network approach[C]∥Proc.of the 7th ACM International Conference on Web Search and Data Mining, 2014: 283-292.
[42]BURKE R, VAHEDIAN F, MOBASHER B. Hybrid recommendation in heterogeneous networks[C]∥Proc.of the 22nd International Conference on User Modeling, Adaptation, and Personalization, 2014: 49-60.
[43]NAUN G G. Introduction to modern information retrieval[J]. Library Resources & Technical Services, 2011, 55(4): 239-240.
[44]JACCARD P. Étude comparative de la distribution florale dans une portion des Alpes et des Jura[J]. Bulletin de la Societe Vaudoise des Sciences Naturelles, 1901, 37(142): 547-579.
[45]SORENSEN T. A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on danish commons[J]. Biologiske Skrifter, 1957(5): 1-34.
[46]RAVASZ E, SOMERA A L, MONGRU D A, et al. Hierarchical organization of modularity in metabolic networks[J]. Science, 2002, 297(5586): 1551-1555.
[47]ADAMIC L A, ADAR E. Friends and neighbors on the web[J]. Social Networks, 2003, 25(3): 211-230.
[48]ZHOU T, LYU L Y, ZHANG Y C. Predicting missing links via local information[J]. European Physical Journal B-Condensed Matter, 2009, 71(4): 623-630.
[49]OU Q, JIN Y D, ZHOU T, et al. Power-law strength-degree correlation from resource-allocation dynamics on weighted networks[J]. Physical Review E, 2007, 75(2): 021102.
[50]BARABASI A L, ALBERT R. Emergence of scaling in random networks[J]. Science, 1999, 286(5439): 509-512.
[51]XIE Y B, ZHOU T, WANG B H. Scale-free networks without growth[J]. Physica A: Statistical Mechanics and Its Applications, 2008, 387(7): 1683-1688.
[52]LIU Z, ZHANG Q M, LYU L Y, et al. Link prediction in complex networks: a local naive Bayes model[J]. Europhysics Letters, 2011, 96(4): 48007.
[53]LYU L Y, JIN C H, ZHOU T. Similarity index based on local paths for link prediction of complex networks[J]. Physical Review E, 2009, 80(4): 046122.
[54]KATZ L. A new status index derived from sociometric analysis[J]. Psychometrika, 1953, 18(1): 39-43.
[55]LEICHT E A, HOLME P, NEWMAN M E J. Vertex similarity in networks[J]. Physical Review E, 2006, 73(2): 026120.
[56]KLEIN D J, RANDIC M. Resistance distance[J]. Journal of Mathematical Chemistry, 1993, 12(1): 81-95.
[57]FOUSS F, PIROTTE A, RENDERS J M, et al. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation[J]. IEEE Trans.on Knowledge and Data Engineering, 2007, 19(3): 355-369.
[58]BRIN S, PAGE L. The anatomy of a large-scale hypertextual web search engine[J]. Computer Networks and ISDN Systems, 1998, 30(1): 107-117.
[59]JEH G, WIDOM J. Simrank: a measure of structural-context similarity[C]∥Proc.of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002: 538-543.
[60]LIU W P, LYU L Y. Link prediction based on local random walk[J]. Europhysics Letters, 2010, 89(5): 58007.
[61]MAJI P, SHAH E, PAUL S. RelSim: an integrated method to identify disease genes using gene expression profiles and PPIN based similarity measure[J]. Information Sciences, 2017, 384: 110-125.
[62]ROWEIS S T, SAUL L K. Nonlinear dimensionality reduction by locally linear embedding[J]. Science, 2000, 290(5500): 2323-2326.
[63]LAO N, COHEN W W. Relational retrieval using a combination of path-constrained random walks[J]. Machine Learning, 2010, 81(1): 53-67.
[64]SHI C, KONG X N, HUANG Y, et al. HeteSim: a general framework for relevance measure in heterogeneous networks[J]. IEEE Trans.on Knowledge and Data Engineering, 2014, 26(10): 2479-2492.
[65]LI C C, SUN J, XIONG Y, et al. An efficient drug-target interaction mining algorithm in heterogeneous biological networks[C]∥Proc.of the 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2014: 65-76.
[66]MENG X F, SHI C, LI Y T, et al. Relevance measure in large-scale heterogeneous networks[C]∥Proc.of the 16th Asia-Pacific Web Conference, 2014: 636-643.
[67]BU S L, HONG X G, PENG Z H, et al. Integrating meta-path selection with user-preference for top-k relevant search in heterogeneous information networks[C]∥Proc.of the 18th International Conference on Computer Supported Cooperative Work in Design, 2014: 301-306.
[68]ZHU M, ZHU T C, PENG Z H, et al. Relevance search on signed heterogeneous information network based on metapath factorization[C]∥Proc.of the 16th International Conference on Web-Age Information Management, 2015: 181-192.
[69]MENG X Y, ZOU Q, RODRIGUEZ-PATON A, et al. Iteratively collective prediction of disease-gene associations through the incomplete network[C]∥Proc.of the Biological Ontologies and Knowledge Bases Workshop at IEEE International Conference on Bioinformatics and Biomedicine, 2017: 1324-1330.
[70]SHAKIBIAN H, CHARKARI N M. Statistical similarity measures for link prediction in heterogeneous complex networks[J]. Physica A: Statistical Mechanics and Its Applications, 2018, 501: 248-263.
[71]ANIL A, CHUGH U, SINGH S R. On applying metapath for network embedding in mining heterogeneous DBLP network[C]∥Proc.of the 8th International Conference on Pattern Recognition and Machine Intelligence, 2019: 249-257.
[72]LYU H, LI J, ZHANG S, et al. Metapath based miRNA-disease association prediction[C]∥Proc.of the Database Systems for Advanced Applications, 2019: 34-48.
[73]WANG Q, DU W, MA J, et al. Recommendation mechanism for patent trading empowered by heterogeneous information networks[J]. International Journal of Electronic Commerce, 2019, 23(2): 147-178.
[74]JEONG H J, KIM M H. Utilizing adjacency of colleagues and type correlations for enhanced link prediction[J]. Data & Knowledge Engineering, 2020, 125: 101785.
[75]ZENG X X, LI Y, LEUNG S C H, et al. Investment behavior prediction in heterogeneous information network[J]. Neurocomputing, 2016, 217: 125-132.
[76]PHUC D, PHU P, TRUNG P, et al. TMPP: a novel topic-driven metapath-based approach for coauthorship prediction in large-scale content-based heterogeneous bibliographic network in distributed computing framework by spark[C]∥Proc.of the 1st International Conference on Intelligent Computing and Optimization, 2018: 87-97.
[77]NIKMEHR G, SALEHI M, JALILI M. TSS: temporal similarity search measure for heterogeneous information networks[J]. Physica A: Statistical Mechanics and Its Applications, 2019, 524: 696-707.
[78]LU M L, WEI X D, YE D N, et al. A unified link prediction framework for predicting arbitrary relations in heterogeneous academic networks[J]. IEEE Access, 2019, 7: 124967-124987.
[79]SHAKIBIAN H, CHARKARI N M. Mutual information model for link prediction in heterogeneous complex networks[J]. Scientific Reports, 2017, 7(1): 44981.
[80]ZHANG J Z. Uncovering mechanisms of coauthorship evolution by multirelations-based link prediction[J]. Information Processing & Management, 2017, 53(1): 42-51.
[81]WANG X, ZHANG Y D, SHI C. Hyperbolic heterogeneous information network embedding[C]∥Proc.of the AAAI Conference on Artificial Intelligence, 2019: 5337-5344.
[82]WANG G, HU Q B, YU P S. Influence and similarity on heterogeneous networks[C]∥Proc.of the 21st ACM International Conference on Information and Knowledge Management, 2012: 1462-1466.
[83]FAN C J, LIU Z, XIU B, et al. Missing and spurious interactions in heterogeneous military networks[C]∥Proc.of the 5th National Conference on Social Media Processing, 2016: 14-28.
[84]CAI X Y, HAN J W, YANG L B. Generative adversarial network based heterogeneous bibliographic network representation for personalized citation recommendation[C]∥Proc.of the 32nd AAAI Conference on Artificial Intelligence, 2018: 5747-5754.
[85]LIU M Y, LIU J, CHEN Y H, et al. AHNG: representation learning on attributed heterogeneous network[J]. Information Fusion, 2019, 50: 221-230.
[86]CHEN H, ZHOU C Q, ZHANG J, et al. Heterogeneous graph embedding based on edge-aware neighborhood convolution[C]∥Proc.of the International Joint Conference on Neural Networks, 2021.
[87]CHEN Z Y, FAN Z P, SUN M H. Tensorial graph learning for link prediction in generalized heterogeneous networks[J]. European Journal of Operational Research, 2021, 290(1): 219-234.
[88]GUAN Y S, SUN X G, SUN Y J. Sparse relation prediction based on hypergraph neural networks in online social networks[J]. World Wide Web, 2023, 26(1): 7-31.
[89]JI Y G, YIN M Y, YANG H X, et al. Accelerating large-scale heterogeneous interaction graph embedding learning via importance sampling[J]. ACM Transactions on Knowledge Discovery from Data, 2020, 15(1): 1-23.
[90]CHEN H X, YIN H Z, WANG W Q, et al. PME: projected metric embedding on heterogeneous networks for link prediction[C]∥Proc.of the 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2018: 1177-1186.
[91]WANG H W, ZHANG F Z, HOU M, et al. SHINE: signed heterogeneous information network embedding for sentiment link prediction[C]∥Proc.of the 11th ACM International Conference on Web Search and Data Mining, 2018: 592-600.
[92]FU G J, YUAN B, DUAN Q Q, et al. Representation learning for heterogeneous information networks via embedding events[C]∥Proc.of the 26th International Conference on Neural Information Processing of the Asia-Pacific-Neural-Network-Society, 2019: 327-339.
[93]LIU S C, HUANG Z Y, QIU Y, et al. Structural network embedding using multi-modal deep auto-encoders for predicting drug-drug interactions[C]∥Proc.of the IEEE International Conference on Bioinformatics and Biomedicine, 2019: 445-450.
[94]MENG L, BAI J Y, ZHANG J W. LATTE: application oriented social network embedding[C]∥Proc.of the IEEE International Conference on Big Data, 2019: 1169-1174.
[95]SHI B X, YANG J, WENINGER T, et al. Representation learning in heterogeneous professional social networks with ambiguous social connections[C]∥Proc.of the IEEE International Conference on Big Data, 2019: 1928-1937.
[96]WANG Y Q, FENG C Y, CHEN L, et al. User identity linkage across social networks via linked heterogeneous network embedding[J]. World Wide Web-Internet and Web Information Systems, 2019, 22(6): 2611-2632.
[97]ZENG X X, WANG W, DENG G S, et al. Prediction of potential disease-associated microRNAs by using neural networks[J]. Molecular Therapy Nucleic Acids, 2019, 16: 566-575.
[98]ZHUO W, ZHAN Q Y, LIU Y, et al. Context attention heterogeneous network embedding[J]. Computational Intelligence and Neuroscience, 2019, 2019(1): 8106073.
[99]MALEKI E F, GHADIRI N, SHAHREZA M L, et al. DHLP 1&2: giraph based distributed label propagation algorithms on heterogeneous drug-related networks[J]. Expert Systems with Applications, 2020, 159: 113640.
[100]RIZI F S, GRANITZER M. Signed heterogeneous network embedding in social media[C]∥Proc.of the 35th Annual ACM Symposium on Applied Computing, 2020: 1877-1880.
[101]YU B, HU J Z, XIE Y, et al. Rich heterogeneous information preserving network representation learning[J]. Pattern Recognition, 2020, 108: 107564.
[102]ZHAO K, BAI T, WU B, et al. Deep adversarial completion for sparse heterogeneous information network embedding[C]∥Proc.of the 29th World Wide Web Conference, 2020: 508-518.
[103]ZHOU F, ZHANG K P, XIE S Y, et al. Learning to correlate accounts across online social networks: an embedding-based approach[J]. Informs Journal on Computing, 2020, 32(3): 714-729.
[104]CHEN W H, LI J C, JIANG J. Heterogeneous combat network link prediction based on representation learning[J]. IEEE Systems Journal, 2021, 15(3): 4069-4077.
[105]CHENG S C, ZHANG L, JIN B, et al. GraphMS: drug target prediction using graph representation learning with substructures[J]. Applied Sciences, 2021, 11(7): 3239.
[106]LI C T, WANG W C. Learning template-free network embeddings for heterogeneous link prediction[J]. Soft Computing, 2021, 25(21): 13425-13435.
[107]XU L C, WANG J, HE L F, et al. MixSp: a framework for embedding heterogeneous information networks with arbitrary number of node and edge types[J]. IEEE Trans.on Knowledge and Data Engineering, 2021, 33(6): 2627-2639.
[108]SHI B X, YANG J, WENINGER T, et al. Representation learning in heterogeneous professional social networks with ambiguous social connections[C]∥Proc.of the IEEE International Conference on Big Data, 2019: 1928-1937.
[109]WANG Y Q, FENG C Y, CHEN L, et al. User identity linkage across social networks via linked heterogeneous network embedding[J]. World Wide Web-Internet and Web Information Systems, 2019, 22(6): 2611-2632.
[110]XIE X P, WANG Y, SHENG N, et al. Predicting miRNA-disease associations based on multi-view information fusion[J]. Frontiers in Genetics, 2022, 13: 979815.
[111]PARK C, KIM D, ZHU Q, et al. Task-guided pair embedding in heterogeneous network[C]∥Proc.of the 28th ACM International Conference on Information and Knowledge Management, 2019: 489-498.
[112]MALEKI E F, GHADIRI N, SHAHREZA M L, et al. DHLP 1&2: scalable label propagation algorithms for heterogeneous networks[J]. Expert Systems with Applications, 2020, 159: 113640.
[113]WANG T T, YUAN W W, GUAN D H. Attributed heterogeneous network embedding for link prediction[C]∥Proc.of the Pacific Rim Knowledge Acquisition Workshop, 2021: 106-119.
[114]WANG H W, ZHANG F Z, HOU M, et al. SHINE: signed heterogeneous information network embedding for sentiment link prediction[C]∥Proc.of the 11th ACM International Conference on Web Search and Data Mining, 2018: 592-600.
[115]LIU C F, LIU Y, YU M, et al. RL4HIN: representation learning for heterogeneous information networks[C]∥Proc.of the IEEE Global Communications Conference, 2019.
[116]ZHANG Y, QIU Y, CUI Y X, et al. Predicting drug-drug interactions using multi-modal deep auto-encoders based network embedding and positive-unlabeled learning[J]. Methods, 2020, 179: 37-46.
[117]SALEHI R F. Graph representation learning for social networks[D]. Bavaria: University of Passau, 2021.
[118]GAO M, HE X N, CHEN L H, et al. Learning vertex representations for bipartite networks[J]. IEEE Trans.on Knowledge and Data Engineering, 2022, 34(1): 379-393.
[119]MEI X, CAI X Y, YANG L B, et al. Relation-aware heterogeneous graph transformer based drug repurposing[J]. Expert Systems with Applications, 2022, 190: 116165.
[120]FU T Y, LEE W C, LEI Z, et al. HIN2Vec: explore meta-paths in heterogeneous information networks for representation learning[C]∥Proc.of the ACM Conference on Information and Knowledge Management, 2017: 1797-1806.
[121]CHEN K J, LU H, LI Y, et al. On relationship formation in heterogeneous information networks: an inferring method based on multilabel learning[J]. Statistical Analysis and Data Mining, 2019, 12(3): 157-167.
[122]FU Y W, XIONG Y, YU P S, et al. Metapath enhanced graph attention encoder for HINs representation learning[C]∥Proc.of the IEEE International Conference on Big Data, 2019: 1103-1110.
[123]GAO X, CHEN J, HUAI N. Meta-circuit machine: inferencing human collaborative relationships in heterogeneous information networks[J]. Information Processing & Management, 2019, 56(3): 844-857.
[124]MOHDEB D, BOUBETRA A, CHARIKHI M. WMPLP: a model for link prediction in heterogeneous social networks[C]∥Proc.of the 4th International Symposium ISKO-Maghreb: Concepts and Tools for Knowledge Management, 2014.
[125]HE X J, DONG Y B, ZHEN Z, et al. Weighted meta paths and networking embedding for patent technology trade recommendations among subjects[J]. Knowledge-Based Systems, 2019, 184: 104899.
[126]CAO X H, ZHENG Y Y, SHI C, et al. Link prediction in schema-rich heterogeneous information network[C]∥Proc.of the 20th Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2016: 449-460.
[127]LIANG T, LIU J. Metapath generation online for heterogeneous network embedding[C]∥Proc.of the International Joint Conference on Neural Networks, 2020.
[128]DONG Y X, CHAWLA N V, SWAMI A. Metapath2vec: scalable representation learning for heterogeneous networks[C]∥Proc.of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017: 135-144.
[129]LIAO L Z, HE X N, ZHANG H W, et al. Attributed social network embedding[J]. IEEE Trans.on Knowledge & Data Engineering, 2018, 30(12): 2257-2270.
[130]ZHANG C X, SONG D J, HUANG C, et al. Heterogeneous graph neural network[C]∥Proc.of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019: 793-803.
[131]LI Z W, LI J S, NIE R, et al. A graph auto-encoder model for miRNA-disease associations prediction[J]. Briefings in Bioinformatics, 2021, 22(4): 240.
[132]YANG H J, LI L S, ZHANG L X, et al. PHGNN: position-aware graph neural network for heterogeneous graph embedding[C]∥Proc.of the International Joint Conference on Neural Networks, 2021.
[133]YOU J X, GOMES-SELMAN J M, YING R, et al. Identity-aware graph neural networks[C]∥Proc.of the 35th AAAI Conference on Artificial Intelligence, 2021: 10737-10745.
[134]ZHAO A P, YU Y. Context aware sentiment link prediction in heterogeneous social network[J]. Cognitive Computation, 2022, 14(1): 300-309.
[135]KONG C, LI H, ZHANG L P, et al. Link prediction on dynamic heterogeneous information networks[C]∥Proc.of the 8th International Conference on Computational Data and Social Networks, 2019: 339-350.
[136]LONG Y F, XIANG R, LU Q, et al. Learning heterogeneous network embedding from text and links[J]. IEEE Access, 2018, 6: 55850-55860.
[137]HU J, QIAN S S, FANG Q, et al. A2CMHNE: attention-aware collaborative multimodal heterogeneous network embedding[J]. ACM Transactions on Multimedia Computing, Communications, and Applications, 2019, 15(2): 1-17.
[138]XUE H S, YANG L W, JIANG W, et al. Modeling dynamic heterogeneous network for link prediction using hierarchical attention with temporal RNN[C]∥Proc.of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2020: 282-298.
[139]LI Z F, LIU H, ZHANG Z L, et al. Learning knowledge graph embedding with heterogeneous relation attention networks[J]. IEEE Trans.on Neural Networks and Learning Systems, 2021, 33(8): 3961-3973.
[140]HUANG F R, ZHANG X M, LI C Z, et al. Multimodal network embedding via attention based multi-view variational autoencoder[C]∥Proc.of the 8th ACM International Conference on Multimedia Retrieval, 2018: 108-116.
[141]ZHU K, CAO M. A semantic subgraphs based link prediction method for heterogeneous social networks with graph attention networks[C]∥Proc.of the International Joint Conference on Neural Networks, 2020.
[142]ZHU K, CAO M, LU H Y. MALP: a more effective metapaths based link prediction method in partially aligned heterogeneous social networks[C]∥Proc.of the 31st IEEE International Conference on Tools with Artificial Intelligence, 2019: 644-651.
[143]MAKI T, TAKAHASHI K, WAKAHARA T, et al. A new multiple label propagation algorithm for linked open data[C]∥Proc.of the 10th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, 2016: 202-208.
[144]SHEKHAR S, PAI D, RAVINDRAN S. Entity resolution in dynamic heterogeneous networks[C]∥Proc.of the 29th World Wide Web Conference, 2020: 662-668.
[145]SUN H W, LYU X Q, WANG B, et al. An enhanced LRMC method for drug repositioning via GCN-based HIN embedding[C]∥Proc.of the IEEE International Conference on Bioinformatics and Biomedicine, 2020: 1137-1141.
[146]FU H T, HUANG F, LIU X, et al. MVGCN: data integration through multiview graph convolutional network for predicting links in biomedical bipartite networks[J]. Bioinformatics, 2022, 38(2): 426-434.
[147]BUTUN E, KAYA M, ALHAJJ R. Extension of neighbor-based link prediction methods for directed, weighted and temporal social networks[J]. Information Sciences, 2018, 463/464: 152-165.
[148]BUTUN E, KAYA M, ALHAJJ R. A new topological metric for link prediction in directed, weighted and temporal networks[C]∥Proc.of the 8th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2016: 954-959.
[149]OZCAN A, OGUDUCU S G. Multivariate time series link prediction for evolving heterogeneous network[J]. International Journal of Information Technology & Decision Making, 2019, 18(1): 241-286.
[150]KEFALAS P, SYMEONIDIS P. Recommending friends and locations over a heterogeneous spatio-temporal graph[C]∥Proc.of the 5th International Conference on Model Engineering and Data Engineering, 2015: 271-284.
[151]KEFALAS P, SYMEONIDIS P, MANOLOPOULOS Y. Recommendations based on a heterogeneous spatio-temporal social network[J]. World Wide Web-Internet and Web Information Systems, 2018, 21(2): 345-371.
[152]PARK I, YOON B. Technological opportunity discovery for technological convergence based on the prediction of technology knowledge flow in a citation network[J]. Journal of Informetrics, 2018, 12(4): 1199-1222.
[153]ERONEN L, TOIVONEN H. Biomine: predicting links between biological entities using network models of heterogeneous databases[J]. BMC Bioinformatics, 2012, 13: 119.
[154]PAN L M, GAO L, GAO J. Link prediction in weighted networks via structural perturbations[C]∥Proc.of the 14th IEEE International Computer Conference on Wavelet Active Media Technology and Information Processing, 2017: 5-8.
[155]ZHANG J W, YU P S, ZHOU Z H. Metapath based multi-network collective link prediction[C]∥Proc.of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014: 1286-1295.
[156]DONG Y X, ZHANG J, TANG J, et al. CoupledLP: link prediction in coupled networks[C]∥Proc.of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2015: 199-208.
[157]XU L C, WEI X K, CAO J N, et al. Embedding of embedding: joint embedding for coupled heterogeneous networks[C]∥Proc.of the 10th ACM International Conference on Web Search and Data Mining, 2017: 741-749.
[158]ZHANG Y, XIONG Y, RUAN L, et al. NetMerger: predicting cross-network links in merged heterogeneous networks[C]∥Proc.of the 19th IEEE/WIC/ACM International Conference on Web Intelligence, 2019: 21-28.
[159]RUAN B B, ZHU C. An efficient link prediction model in dynamic heterogeneous information networks based on multiple self-attention[C]∥Proc.of the 14th International Conference on Knowledge Science, Engineering, and Management, 2021: 62-74.
[160]LI X X, SHANG Y M, CAO Y N, et al. Type-aware anchor link prediction across heterogeneous networks based on graph attention network[C]∥Proc.of the 34th AAAI Conference on Artificial Intelligence, 2020: 147-155.
[161]YIN Y, JI L X, ZHANG J P, et al. DHNE: network representation learning method for dynamic heterogeneous networks[J]. IEEE Access, 2019, 7: 134782-134792.
[162]ZHAO X F, JIN Z G, LIU Y H, et al. Heterogeneous information network embedding for user behavior analysis on social media[J]. Neural Computing and Applications, 2022, 34: 5683-5699.
[163]SARWAT M, LEVANDOSKI J J, ELDAWY A, et al. LARS*: an efficient and scalable location-aware recommender system[J]. IEEE Trans.on Knowledge & Data Engineering, 2014, 26(6): 1384-1399.
[164]LI J C, YIN Y, FORTUNATO S, et al. Nobel laureates are almost the same as us[J]. Nature Reviews Physics, 2019, 1(5): 301-303.
[165]LI J C, YIN Y, FORTUNATO S, et al. A dataset of publication records for Nobel laureates[J]. Scientific Data, 2019, 6(1): 33.
[166]LI J C, ZHAO D L, JIANG J, et al. Capability oriented equipment contribution analysis in temporal combat networks[J]. IEEE Trans.on Systems, Man, and Cybernetics: Systems, 2021, 51(2): 696-704.
[167]LI J C, ZHAO D L, GE B F, et al. Disintegration of operational capability of heterogeneous combat networks under incomplete information[J]. IEEE Trans.on Systems, Man, and Cybernetics: Systems, 2020, 50(12): 5172-5179.
[168]LI J C, ZHAO D L, GE B F, et al. A link prediction method for heterogeneous networks based on BP neural network[J]. Physica A: Statistical Mechanics and Its Applications, 2018, 495: 1-17.
[169]GE B F, LI J C, ZHAO D L, et al. Metapath based link prediction approach for weapon system-of-system combat networks[J]. Systems Engineering and Electronics, 2019, 41(5): 1028-1033. (in Chinese)
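About the authors
CAO Jiaping (born 1999), female, master's student. Her main research interests are complex systems and complex networks, and link prediction in heterogeneous information networks.
LI Jichao (born 1990), male, associate professor, Ph.D. His main research interests are complex systems and complex networks, and data-driven intelligent decision-making.
JIANG Jiang (born 1981), male, associate professor, Ph.D. His main research interests are uncertainty reasoning and risk decision-making techniques.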