Research and implementation of a vehicle trajectory similarity algorithm used for security access monitoring
FAN Zhiying
(First Research Institute of the Ministry of Public Security of PRC, Beijing 100048, China)
Abstract: Based on the constraints that vehicle trajectory similarity imposes in the time and space dimensions, the LCSS (longest common subsequence) algorithm is introduced. Following the principle of the longest common subsequence, the sequence of checkpoint IDs is abstracted from each trajectory, and a method for calculating the similarity of two vehicle trajectories is proposed. Spark parallel computing, Hive data warehouse storage and related technologies are combined to build a data analysis platform that implements the algorithm. Experiments show that the algorithm captures the similarity of real vehicle trajectories in time and space, and that the performance of the data analysis is sufficient for front-end business queries. The algorithm and the trajectory similarity analysis service can serve as back-end support for functions such as associated vehicle analysis and gang crime vehicle analysis in security checkpoint (access monitoring) application systems.
Keywords: trajectory similarity; LCSS algorithm; Spark; Hive
CLC number: TN911-34; TP311.5 Document code: A Article ID: 1004-373X(2016)23-0133-03
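0 Introduction
With the rapid development of the urban economy, the number of motor vehicles in every region has grown quickly, and vehicle-related criminal and public security cases have risen year by year. Beyond the traditional tracking and control of vehicles involved in violations and cases, services such as analyzing the driving trajectories and travel patterns of key vehicles can also provide strong evidence for criminal investigation.
With the construction and use of application systems such as security checkpoints and electronic police, large volumes of vehicle passage records and violation records have accumulated. These records cover vehicle information such as license plate number, passing time, vehicle color, vehicle type, driving direction and driving state, and provide solid data support for services such as travel pattern analysis.
This paper uses a large set of existing vehicle passage records from one region, together with big data technologies, to analyze and implement vehicle trajectories and trajectory similarity. The scheme can serve as an implementation approach for vehicle data analysis in security checkpoint application systems and provide business support for them.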
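1 Vehicle trajectory similarity calculation
Vehicle trajectory similarity analysis compares the driving trajectory of a specified vehicle with those of other vehicles and extracts the passage records of the vehicles whose trajectories are similar to it, thereby providing back-end support for functions such as associated vehicle analysis and gang crime vehicle analysis in security checkpoint application systems.
The analysis is constrained in both the time and space dimensions. First, another vehicle must pass the same checkpoint as the specified vehicle within a given time range, for example within 2 min. Second, the order in which the other vehicle passes multiple checkpoints must be consistent with that of the specified vehicle; the higher the consistency, the higher the similarity.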
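The two constraints above can be illustrated with a minimal single-machine sketch in Python. It is an illustration under stated assumptions, not the implementation used in this paper: the names `Passage`, `lcss_length`, `similarity` and `max_gap_s` are hypothetical, and normalizing the LCSS length by the length of the specified vehicle's trajectory is an assumed choice rather than the paper's formula. Two passage records match only when they share a checkpoint ID and their passing times differ by at most the threshold (120 s here, following the 2 min example), and the dynamic program preserves the order of checkpoints.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Passage:
    """One passage record: the checkpoint passed and the passing time."""
    checkpoint_id: str
    timestamp: float  # seconds on a clock shared by all vehicles


def lcss_length(a: List[Passage], b: List[Passage], max_gap_s: float = 120.0) -> int:
    """Length of the longest common subsequence of two time-ordered trajectories.

    Two records match when they have the same checkpoint ID (space constraint)
    and their timestamps differ by at most max_gap_s (time constraint).
    """
    m, n = len(a), len(b)
    # dp[i][j] = LCSS length of the first i records of a and the first j of b
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same_place = a[i - 1].checkpoint_id == b[j - 1].checkpoint_id
            close_in_time = abs(a[i - 1].timestamp - b[j - 1].timestamp) <= max_gap_s
            if same_place and close_in_time:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]


def similarity(target: List[Passage], other: List[Passage], max_gap_s: float = 120.0) -> float:
    """Assumed normalization: LCSS length divided by the target trajectory length."""
    if not target:
        return 0.0
    return lcss_length(target, other, max_gap_s) / len(target)


# Illustrative data only: the specified vehicle passes K1, K2, K3, K4 in order.
target = [Passage("K1", 0), Passage("K2", 300), Passage("K3", 600), Passage("K4", 900)]
# The other vehicle passes K2 out of order and outside the 2 min window.
other = [Passage("K1", 60), Passage("K3", 650), Passage("K2", 700), Passage("K4", 1000)]

print(similarity(target, other))  # 0.75: K1, K3 and K4 match in order
```

In a Spark setting, a function of this kind could be applied to each candidate vehicle's trajectory in parallel; that mapping is not shown here.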
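3 Conclusion
Based on the constraints on checkpoint vehicle trajectory similarity in the time and space dimensions, this paper proposes a method for calculating trajectory similarity and verifies the algorithm using big data technologies. Experiments show that the calculation formula and implementation meet the needs of back-end business analysis and can serve as business support for related functions of security checkpoint application systems.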