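Estimation of High Dimensional Covariance Matrix Based on Unknown Mean
CHEN Yan-zhen, LI Shu-you
(College of Science, Liaoning University of Technology, Jinzhou, Liaoning 121001, China)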
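A new algorithm is given for estimating a high-dimensional covariance matrix when the mean is unknown. When the dimension of the matrix exceeds the sample size, random matrix theory is used to obtain estimators of the eigenvalues of the target matrix from the marginal density function of the eigenvalues of the sample covariance matrix and the log-likelihood function of the population eigenvalues. Following the idea of shrinkage estimation, the eigenvalues of the target matrix and of the sample covariance matrix are then shrunk, and the resulting eigenvalue estimates yield a new estimator of the high-dimensional covariance matrix. Numerical simulation shows that, for a multivariate normal population, the new estimator is more accurate than the sample covariance matrix.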
high-dimensional covariance matrix; shrinkage estimation; marginal density; likelihood function; singular Wishart distribution
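Covariance matrix estimation is an important parameter estimation problem in modern statistics. In practice one encounters many kinds of massive data sets, such as stock trading data, image processing data, and genetic testing data; in statistical analysis these are usually called high-dimensional data.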
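Then the density function of […] is as follows:
Corollary 2.1.16 of Muirhead [9] shows that […] has the distribution specified by the density
The integral J cannot be evaluated in closed form, so an approximation is derived here. For large […], the integral J is approximated by the following expression: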
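In this subsection, the estimator of the eigenvalues of the population covariance matrix is obtained mainly by the method of Banerjee and Monni [10]. First, the approximate log-likelihood function of the population eigenvalues is obtained from the approximate marginal density of the sample eigenvalues derived in the previous section.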
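The sample eigenvalues that enter the marginal density can be computed as in the following minimal sketch. It assumes an (n, p) data matrix with p > n whose rows are centered by the sample mean because the mean is unknown; the function name `nonzero_sample_eigenvalues` and the simulated data are illustrative and do not come from the paper.

```python
import numpy as np

def nonzero_sample_eigenvalues(X):
    """Return the n-1 nonzero eigenvalues of the sample covariance matrix.

    Assumes X has shape (n, p) with p > n. The mean is unknown, so the
    rows are centered by the sample mean before the covariance is formed.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)                  # subtract the sample mean
    # The p x p matrix Xc.T @ Xc / (n - 1) and the n x n matrix
    # Xc @ Xc.T / (n - 1) share the same nonzero eigenvalues, so the
    # smaller n x n problem is solved instead.
    gram = Xc @ Xc.T / (n - 1)
    eig = np.linalg.eigvalsh(gram)[::-1]     # sort in descending order
    return eig[: n - 1]                      # centering leaves rank n - 1

# Illustrative data only: n = 20 observations in p = 50 dimensions.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))
ell = nonzero_sample_eigenvalues(X)
S = np.cov(X, rowvar=False)                  # p x p sample covariance
print(ell.shape, np.linalg.matrix_rank(S))
```

Centering by the sample mean costs one degree of freedom, which is why only n - 1 eigenvalues are nonzero here rather than n.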
Marginal density function:
Log-likelihood function:
Then a new estimator of the high-dimensional covariance matrix is
Table 1 Numerical simulation results
n        20      50      80      100     200
[…]      3.5058  2.0048  1.4744  1.3318  1.1236
[…]      3.4707  1.9866  1.4743  1.3317  1.1236
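A comparison of this kind can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' procedure: the convex-combination form of the shrunk eigenvalues, the weight `alpha`, the stand-in target eigenvalues (the average sample eigenvalue), the dimension p = 250, the identity population covariance, and the Frobenius-type loss are all hypothetical choices. Only the sample sizes are taken from Table 1.

```python
import numpy as np

def rebuild_from_eigenvalues(S, target_eigs, alpha):
    """Hypothetical shrinkage form: V diag(alpha*t + (1-alpha)*l) V^T.

    target_eigs must be in ascending order to match np.linalg.eigh.
    """
    sample_eigs, vecs = np.linalg.eigh(S)          # ascending eigenvalues
    new_eigs = alpha * target_eigs + (1.0 - alpha) * sample_eigs
    return (vecs * new_eigs) @ vecs.T              # scale columns, then V D V^T

def frobenius_loss(est, sigma):
    """Squared Frobenius distance per dimension (an assumed loss)."""
    return np.linalg.norm(est - sigma, "fro") ** 2 / sigma.shape[0]

def simulate(p=250, sizes=(20, 50, 80, 100, 200), reps=5, alpha=0.5, seed=0):
    rng = np.random.default_rng(seed)
    sigma = np.eye(p)                              # assumed true covariance
    results = {}
    for n in sizes:
        loss_sample, loss_new = [], []
        for _ in range(reps):
            X = rng.standard_normal((n, p))        # multivariate normal draws
            S = np.cov(X, rowvar=False)            # centers by the sample mean
            target = np.full(p, np.trace(S) / p)   # stand-in target eigenvalues
            est = rebuild_from_eigenvalues(S, target, alpha)
            loss_sample.append(frobenius_loss(S, sigma))
            loss_new.append(frobenius_loss(est, sigma))
        results[n] = (float(np.mean(loss_sample)), float(np.mean(loss_new)))
    return results

results = simulate()
for n, (ls, ln) in results.items():
    print(f"n={n:4d}  sample={ls:.4f}  shrunk={ln:.4f}")
```

Keeping the sample eigenvectors and changing only the eigenvalues makes the estimator orthogonally equivariant, which is the class of estimators considered by Banerjee and Monni [10].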
[1] Mao Shisong. Advanced Mathematical Statistics[M]. Beijing: Higher Education Press, 2006.
[2] Ledoit O, Wolf M. Nonlinear Shrinkage Estimation of Large-Dimensional Covariance Matrices[J]. The Annals of Statistics, 2012, 40(2): 1024-1060.
[3] Ledoit O, Péché S. Eigenvectors of some large sample covariance matrix ensembles[J]. Probability Theory and Related Fields, 2011, 151(1): 233-264.
[4] Ledoit O, Wolf M. Spectrum estimation: A unified framework for covariance matrix estimation and PCA in large dimensions[J]. Journal of Multivariate Analysis, 2015, 139(2): 360-384.
[5] Liu Heng, Guo Jingjun. Estimation of high-dimensional covariance matrices based on the cross-validation shrinkage method[J]. Statistics & Decision, 2020, 36(9): 39-42.
[6] Ledoit O, Wolf M. A well-conditioned estimator for large-dimensional covariance matrices[J]. Journal of Multivariate Analysis, 2004, 88(2): 365-411.
[7] Schäfer J, Strimmer K. A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics[J]. Statistical Applications in Genetics and Molecular Biology, 2005, 4(1): 1-32.
[8] Uhlig H. On singular Wishart and singular multivariate beta distributions[J]. The Annals of Statistics, 1994, 22(1): 395-405.
[9] Muirhead R J. Aspects of Multivariate Statistical Theory[M]. New Jersey: John Wiley and Sons Inc, 1982.
[10] Banerjee S, Monni S. An orthogonally equivariant estimator of the covariance matrix in high dimensions and for small sample sizes[J]. Journal of Statistical Planning and Inference, 2021, 213(26): 16-32.
Estimation of High Dimensional Covariance Matrix Based on Unknown Mean
CHEN Yan-zhen, LI Shu-you
(College of Science, Liaoning University of Technology, Jinzhou 121001, China)
A new algorithm is presented for estimating a high-dimensional covariance matrix when the mean is unknown. When the dimension of the matrix, p, is larger than the sample size n, random matrix theory is used to obtain estimators of the eigenvalues of the target matrix from the marginal density function of the eigenvalues of the sample covariance matrix and the log-likelihood function of the population eigenvalues. Based on the idea of shrinkage estimation, the eigenvalues of the target matrix and of the sample covariance matrix are shrunk, and a new estimator of the high-dimensional covariance matrix is obtained from the estimated eigenvalues. Numerical simulation shows that, for a multivariate normal population, the new estimator is more accurate than the sample covariance matrix.
high-dimensional covariance matrices; shrinkage estimation; marginal density; likelihood function; singular Wishart distribution
10.15916/j.issn1674-3261.2023.02.012
O212
A
1674-3261(2023)02-0136-05
2022-10-21
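CHEN Yan-zhen (1997-), female, from Zhumadian, Henan; master's student.
LI Shu-you (1964-), male, from Jinzhou, Liaoning; professor, Ph.D.
Editor in charge: LIU Ya-bing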