

2017年司法語(yǔ)音及聲學(xué)研究

刑事技術 (Forensic Science and Technology), 2018, No. 3
Keywords: voiceprint; automatic recognition; acoustics

康錦濤, 王莉, 王曉笛, 盛卉, 李敬陽, 黃文林

(Institute of Forensic Science, Ministry of Public Security; Collaborative Innovation Center of Judicial Civilization under the 2011 Plan; Beijing 100038, China)

司法語(yǔ)音及聲學(xué)在我國(guó)即為廣義上的聲紋鑒定,包括司法語(yǔ)音學(xué)檢驗(yàn)中的語(yǔ)音同一認(rèn)定、語(yǔ)音人身分析、語(yǔ)音內(nèi)容辨識(shí)和司法聲學(xué)檢驗(yàn)中的錄音的真實(shí)性檢驗(yàn)、降噪及語(yǔ)音增強(qiáng)、噪聲分析、音源同一鑒定以及錄音器材鑒定等內(nèi)容[1]。國(guó)外司法語(yǔ)音及聲學(xué)的研究?jī)?nèi)容與我國(guó)大致相同[2]。2017年,語(yǔ)音同一認(rèn)定仍是司法語(yǔ)音及聲學(xué)的主要內(nèi)容,其在聽(tīng)覺(jué)分析、語(yǔ)音學(xué)-聲學(xué)分析、自動(dòng)識(shí)別、質(zhì)量控制等方面均產(chǎn)生了新的成果;語(yǔ)音人身分析除傳統(tǒng)的性別、年齡等特征外,語(yǔ)音情感分析也成為重要內(nèi)容,并在自動(dòng)識(shí)別方面發(fā)展迅速;各國(guó)學(xué)者也在錄音的真實(shí)性檢驗(yàn)以及降噪及語(yǔ)音增強(qiáng)等方向做了開(kāi)拓。本文對(duì)2017年司法語(yǔ)音及聲學(xué)領(lǐng)域的語(yǔ)音同一認(rèn)定、語(yǔ)音人身分析、錄音的真實(shí)性檢驗(yàn)、降噪及語(yǔ)音增強(qiáng)等熱點(diǎn)專業(yè)的代表性成果進(jìn)行介紹。

1 語(yǔ)音同一認(rèn)定

語(yǔ)音同一認(rèn)定在我國(guó)即為狹義上的聲紋鑒定[3],它也是司法語(yǔ)音及聲學(xué)實(shí)踐中的主要分支[4]。目前,國(guó)際上的語(yǔ)音同一認(rèn)定實(shí)踐中,絕大多數(shù)機(jī)構(gòu)與從業(yè)者采用的是聽(tīng)覺(jué)分析與聲學(xué)分析相結(jié)合的專家鑒定方法[5],但也有一些機(jī)構(gòu)開(kāi)始嘗試將自動(dòng)識(shí)別的方法引入語(yǔ)音同一認(rèn)定領(lǐng)域,采用半自動(dòng)(專家干預(yù))或自動(dòng)識(shí)別等方法開(kāi)展實(shí)踐[6-7]。2017年,關(guān)于語(yǔ)音同一認(rèn)定的專業(yè)論述多數(shù)集中在聽(tīng)覺(jué)分析方法、語(yǔ)音學(xué)及聲學(xué)特征分析、語(yǔ)音特征的鑒定價(jià)值、鑒定意見(jiàn)表述、自動(dòng)識(shí)別技術(shù)以及語(yǔ)音同一認(rèn)定過(guò)程中的質(zhì)量控制與標(biāo)準(zhǔn)化等方面。

1.1 聽(tīng)覺(jué)分析

聽(tīng)覺(jué)分析是目前語(yǔ)音同一認(rèn)定技術(shù)方法的重要組成部分[1,8-10],在國(guó)內(nèi)外許多規(guī)范標(biāo)準(zhǔn)中早有明確規(guī)定[11-16]。2017年,Sundqvist等[17]設(shè)計(jì)了一套聽(tīng)覺(jué)分析程序,并將之應(yīng)用于瑞典國(guó)家法庭科學(xué)中心(NFC)的檢驗(yàn)實(shí)踐中。為了推進(jìn)聽(tīng)覺(jué)分析方法的體系化與規(guī)范化,Lindh等[18]對(duì)聽(tīng)覺(jué)分析方法的可靠性做了考察,分別使用聽(tīng)覺(jué)分析與自動(dòng)識(shí)別對(duì)芬蘭語(yǔ)說(shuō)話人進(jìn)行對(duì)比分析,并用于芬蘭國(guó)家調(diào)查局(NBI)的語(yǔ)音同一認(rèn)定實(shí)踐的流程改進(jìn)。Leinonen等[19]提出建立不同語(yǔ)種的聽(tīng)覺(jué)特征集,并在瑞典語(yǔ)和芬蘭語(yǔ)兩個(gè)語(yǔ)種上開(kāi)始了初步嘗試。Land等[20]對(duì)笑聲的聽(tīng)覺(jué)分析價(jià)值進(jìn)行了探討。在偽裝語(yǔ)音的研究方面,Skarnitzl與 R??i?ková等[21-22]研究了捷克語(yǔ)說(shuō)話人的常見(jiàn)偽裝方式,并對(duì)不同偽造方式下的聽(tīng)覺(jué)特征與聲學(xué)特征做了初步分析,Delvaux等[23]考察了偽裝與模仿兩種方式下聽(tīng)覺(jué)特征與聲學(xué)特征的差異。

嗓音特質(zhì)分析(Vocal Prof i le Analysis, VPA)在語(yǔ)音同一認(rèn)定中的應(yīng)用是近年來(lái)聽(tīng)覺(jué)分析研究的熱點(diǎn)[24-29],2017年,許多專家學(xué)者繼續(xù)就這一方向進(jìn)行探索。為了便于分析,Segundo等[30]設(shè)計(jì)了簡(jiǎn)化的VPA分析表,并應(yīng)用于同卵雙胞胎的聽(tīng)覺(jué)分析上;Segundo等[31]驗(yàn)證了VPA分析表在西班牙語(yǔ)、德語(yǔ)、英語(yǔ)語(yǔ)境下的有效性。Klug[32]就VPA分析表的改進(jìn)做了探討,提出應(yīng)當(dāng)在加強(qiáng)培訓(xùn)的基礎(chǔ)上改進(jìn)要素的類目。Hughes等[33-34]將VPA分析表得分與自動(dòng)識(shí)別方法結(jié)合起來(lái)考察,結(jié)果表明,將使用梅爾頻率倒譜系數(shù)(MFCC)參數(shù)與長(zhǎng)時(shí)共振峰分布(LTFD)特征的自動(dòng)識(shí)別系統(tǒng)融合,系統(tǒng)性能提升有限,將VPA得分結(jié)果加入后,系統(tǒng)識(shí)別正確率顯著增加。

1.2 語(yǔ)音學(xué)-聲學(xué)分析

聽(tīng)覺(jué)分析與語(yǔ)音學(xué)-聲學(xué)分析是共生互補(bǔ)的關(guān)系[35-36],語(yǔ)音學(xué)-聲學(xué)分析方法不僅為聽(tīng)覺(jué)分析提供量化支持,而且也可以提供新的特征[3]。在語(yǔ)音學(xué)-聲學(xué)分析方面,Heuven、Gold等[37-38]繼續(xù)就填詞暫停(f i lled pauses)、猶豫詞(hesitation markers)的聲學(xué)特征進(jìn)行分析,以進(jìn)一步挖掘其在語(yǔ)音同一認(rèn)定中的價(jià)值。He等[39]研究了不同說(shuō)話人的重音變化受噪音或不同頻段影響的程度,結(jié)果表明不同說(shuō)話人的重音特征在全頻段上都有較好的體現(xiàn)。雙語(yǔ)者在說(shuō)兩種語(yǔ)言時(shí)的聲學(xué)特征各有何特點(diǎn)是一直以來(lái)的研究課題之一,Dorreen等[40]就這個(gè)課題下的長(zhǎng)時(shí)基頻分布做了研究。Arantes等[41]考察了語(yǔ)種、話語(yǔ)方式等因素對(duì)長(zhǎng)時(shí)基頻達(dá)到穩(wěn)定狀態(tài)時(shí)的時(shí)長(zhǎng)影響,結(jié)果表明話語(yǔ)方式的影響最大。Dimos、Lopez等[42-43]研究了大喊狀態(tài)下語(yǔ)音的節(jié)奏、韻律以及頻譜特征。He等[44]研究了音強(qiáng)曲線的聲紋鑒定價(jià)值。不同語(yǔ)種的元音空間(vowel space)并不相同,Varo?anec-?kari?[45]研究了克羅地亞語(yǔ)、塞爾維亞語(yǔ)和斯洛文尼亞語(yǔ)男性說(shuō)話人元音空間的異同,為開(kāi)展不同語(yǔ)種間的說(shuō)話人鑒定提供了一定基礎(chǔ)。McDougall等[46]比較了基于音節(jié)與基于時(shí)間的兩種流利度描寫方法。Wang等[47]研究了漢語(yǔ)復(fù)合元音的動(dòng)態(tài)特征,結(jié)果表明復(fù)合元音也具備較高的聲紋鑒定價(jià)值。Heeren[48]對(duì)電話錄音中[s]在不同語(yǔ)境下的不同聲學(xué)特性進(jìn)行了探討。在嗓音檔案(voice prof i le)的構(gòu)建方面,F(xiàn)ranchini[49]以[l]音的聲學(xué)特征為例對(duì)此做了研究,F(xiàn)ingerling[50]對(duì)二語(yǔ)說(shuō)話人的元音集合重建做了探索。

1.3 語(yǔ)音特征的價(jià)值

在語(yǔ)音同一認(rèn)定中,語(yǔ)音特征價(jià)值的高低是需要重點(diǎn)考慮的內(nèi)容。根據(jù)語(yǔ)音特征的動(dòng)態(tài)性原理,其具有變異性(即同一說(shuō)話人的自身的差異)和差異性(即不同說(shuō)話人之間的差異),變異小而差異大的特征鑒定價(jià)值較高。2017年,對(duì)于特征價(jià)值的關(guān)注點(diǎn)主要在人群的語(yǔ)音特征分布上。Rhodes等[51]認(rèn)為現(xiàn)階段的人群特征分布研究應(yīng)與實(shí)際案件結(jié)合。Hughes、Wormald[52]提出建立維基方言庫(kù)的構(gòu)想,將方言中的高價(jià)值特征放入數(shù)據(jù)庫(kù)。Hughes等[53]提出了研究人群語(yǔ)音特征分布需要考慮的四個(gè)問(wèn)題,一是控制因子,二是特異度,三是誤差,四是確定程度,并以英語(yǔ)中雙元音[ai]中的共振峰走勢(shì)為例,說(shuō)明了不同情況下的語(yǔ)音特征分布對(duì)語(yǔ)音同一認(rèn)定結(jié)果的可能影響。在檢材與樣本內(nèi)部語(yǔ)音特征的表現(xiàn)是否穩(wěn)定方面,在以往部分研究的基礎(chǔ)上,Ajili[54-56]提出一種使用信息論中的同質(zhì)化度量(homogeneity measure)標(biāo)準(zhǔn)對(duì)聲學(xué)參數(shù)的穩(wěn)定性進(jìn)行度量的方法[57]。

1.4 聲紋鑒定意見(jiàn)表述

聲紋鑒定的意見(jiàn)表述一直以來(lái)都是討論熱點(diǎn)。國(guó)際上,Rose和Morrison一直提倡量化的似然比體系,英國(guó)的Nolan等絕大部分從業(yè)人員使用英國(guó)立場(chǎng)聲明形式,歐洲大陸的大部分從業(yè)者則使用可能性等級(jí)形式。我國(guó)則多使用5級(jí)分類的可能性等級(jí)形式[11]。

2017年,英國(guó)的French[58]調(diào)整了其意見(jiàn)表述形式,逐漸從英國(guó)立場(chǎng)說(shuō)明框架下的一致性與獨(dú)特性[59]轉(zhuǎn)向可能性等級(jí)形式,在這一框架下,意見(jiàn)共分為13級(jí),與英國(guó)法庭科學(xué)提供者協(xié)會(huì)(Association of Forensic Science Providers)推薦的標(biāo)準(zhǔn)[60]一致。荷蘭NFI的Vermeulen[61]介紹了其得出“強(qiáng)烈支持”結(jié)論的依據(jù),在實(shí)際案例中,NFI只有在檢材與樣本特征幾乎相同或者說(shuō)話人有言語(yǔ)障礙等高度獨(dú)特性特征時(shí)才給出這種鑒定意見(jiàn)。

1.5 語(yǔ)音數(shù)據(jù)庫(kù)及自動(dòng)識(shí)別技術(shù)

目前,國(guó)際上司法語(yǔ)音及聲學(xué)專門的語(yǔ)音數(shù)據(jù)庫(kù)有英國(guó)的Nolan建立的DyVis[62]、澳大利亞的Morrison建立的FVCD[63]、西班牙的Ramos建立的AHUMADA[64]、荷蘭的Vloed建立的NFI-FRITS[65]、法國(guó)的Ajili建立的FABIOLE[66]等。國(guó)內(nèi)方面,我國(guó)的“全國(guó)公安機(jī)關(guān)聲紋數(shù)據(jù)庫(kù)”依然是國(guó)際上收錄說(shuō)話人最多的聲紋鑒定語(yǔ)音數(shù)據(jù)庫(kù)。2017年新建的VoxCeleb[67]則是比較新的代表。目前說(shuō)話人自動(dòng)識(shí)別技術(shù)的主流框架主要有兩類,一種是高斯混合模型加通用背景模型(GMM-UBM),另一種是基于i向量(i-vector)空間的概率線性判別分析(PLDA)方法,同時(shí)開(kāi)始使用深度神經(jīng)網(wǎng)絡(luò)(deep neural network,DNN)提取語(yǔ)音特征。后一種框架較新,因此成為2017年的研究熱點(diǎn)。DNN提取語(yǔ)音特征的方法取得的效果較好,對(duì)訓(xùn)練數(shù)據(jù)量的要求也較大,我國(guó)的“全國(guó)公安機(jī)關(guān)聲紋數(shù)據(jù)庫(kù)”已經(jīng)采用DNN方法提取特征。Park等[68]將嗓音音質(zhì)聲學(xué)特征引入采用這種架構(gòu)的自動(dòng)識(shí)別系統(tǒng)中,與MFCC特征結(jié)合,顯著提升了短語(yǔ)音的識(shí)別率。Solewicz等[69]為解決現(xiàn)有的對(duì)數(shù)似然比(LLR)對(duì)處理說(shuō)話人內(nèi)部變異的不足提出了一種新的說(shuō)話人自動(dòng)識(shí)別系統(tǒng)性能指標(biāo)——空假設(shè)對(duì)數(shù)似然比(Null-Hypothesis LLR)。Tsch?pe等[70]考察了基于i向量系統(tǒng)的錯(cuò)誤結(jié)果,發(fā)現(xiàn)如果加入地域信息,系統(tǒng)錯(cuò)誤率會(huì)大大下降。Alexander等[71]設(shè)計(jì)了基于i向量的多說(shuō)話人自動(dòng)識(shí)別系統(tǒng)。Milo?evi?[72]將基頻、共振峰頻率、共振峰帶寬等音段特征(SF)與現(xiàn)有GMM-MFCC架構(gòu)的自動(dòng)識(shí)別系統(tǒng)相結(jié)合,提升了原有系統(tǒng)的識(shí)別正確率。

關(guān)于說(shuō)話人自動(dòng)識(shí)別在語(yǔ)音同一認(rèn)定中的作用,目前仍有爭(zhēng)議。比如,雖然德國(guó)、西班牙、瑞典等國(guó)的訴訟中已有接受專家干預(yù)自動(dòng)識(shí)別方法鑒定結(jié)論的判例,但鑒于目前自動(dòng)識(shí)別系統(tǒng)的性能,這種“接受”不僅在程度上有限,而且推廣起來(lái)仍困難重重。以英國(guó)為例,英國(guó)JP French實(shí)驗(yàn)室的French與Harrison作為辯方專家證人在“女王訴斯雷德等人”(R v Slade&Ors)的上訴案件中提供了專家鑒定與自動(dòng)識(shí)別系統(tǒng)兩套語(yǔ)音同一認(rèn)定證據(jù),但是上訴法院駁回了自動(dòng)識(shí)別系統(tǒng)的鑒定結(jié)論。 French[58]表示,雖然這宗判例并沒(méi)有直接扼殺英國(guó)未來(lái)使用自動(dòng)識(shí)別系統(tǒng)鑒定結(jié)論的希望,但是,鑒于英美法系的判例傳統(tǒng),除非未來(lái)說(shuō)話人自動(dòng)識(shí)別技術(shù)取得重大技術(shù)突破,否則不僅是英國(guó),甚至包括加拿大、新西蘭、澳大利亞等英聯(lián)邦國(guó)家(共52個(gè)國(guó)家)都將駁回說(shuō)話人自動(dòng)識(shí)別系統(tǒng)的鑒定結(jié)論。

2 質(zhì)量控制及標(biāo)準(zhǔn)化

質(zhì)量控制方面,F(xiàn)rench等[73]提出了聲紋鑒定實(shí)驗(yàn)室檢驗(yàn)鑒定的透明化倡議,其將之稱為“打開(kāi)百葉窗”(opening the blinds)行動(dòng),并詳細(xì)介紹了JP French實(shí)驗(yàn)室的檢驗(yàn)流程。德國(guó)BKA的Wagner[74]則介紹了其語(yǔ)音同一認(rèn)定的標(biāo)準(zhǔn)操作規(guī)程,并結(jié)合實(shí)際案例進(jìn)行了演示。這種透明化與標(biāo)準(zhǔn)化的趨勢(shì)是司法語(yǔ)音及聲學(xué)中質(zhì)量控制的主要方向。

標(biāo)準(zhǔn)化方面,我國(guó)的公安部頒布了司法語(yǔ)音及聲學(xué)的四個(gè)公安安全行業(yè)標(biāo)準(zhǔn),包括語(yǔ)音同一認(rèn)定[11]、錄音的真實(shí)性檢驗(yàn)[12]、降噪及語(yǔ)音增強(qiáng)[13]和語(yǔ)音人身分析[14]四個(gè)專業(yè)方向。

3 語(yǔ)音人身分析

語(yǔ)音人身分析是指在只聞其聲、不見(jiàn)其人的情況下,對(duì)說(shuō)話人的社會(huì)群體屬性和個(gè)體屬性進(jìn)行刻畫(huà);或在見(jiàn)其人但不知其身份的情況下,通過(guò)上述綜合分析對(duì)其社會(huì)群體屬性進(jìn)行判斷[4]。聲紋鑒定實(shí)踐中,還涉及對(duì)說(shuō)話人的暫時(shí)狀態(tài)與瞬時(shí)狀態(tài)的分析刻畫(huà),如通過(guò)語(yǔ)音對(duì)說(shuō)話人是否抽煙、吸毒進(jìn)行分析,通過(guò)語(yǔ)音推測(cè)說(shuō)話人心理狀態(tài)語(yǔ)音情感分析[75],我們也將之歸入語(yǔ)音人身分析中去。

人工耳蝸的頻率響應(yīng)有自己的特點(diǎn),Kova?i?[76]研究了人工耳蝸對(duì)聲音信號(hào)的處理特性,并探索其在說(shuō)話人性別、體型、身分識(shí)別等方面的應(yīng)用潛力。Georg[77]研究了德語(yǔ)的不同方言對(duì)年齡分析的影響,探索了不同方言對(duì)年齡推測(cè)的影響因素。Tomi?[78]研究了通過(guò)口音負(fù)遷移推斷說(shuō)話人地域的方法。Jong-Lendle等[79]研究了從外國(guó)人的德語(yǔ)口音中推斷其母語(yǔ)的方法。Schwab等[80]研究了抽煙對(duì)嗓音的影響,Rodmonga等[81]研究了吸毒后的言語(yǔ)聽(tīng)覺(jué)特征,其結(jié)果均可用于對(duì)說(shuō)話人身體狀態(tài)的分析。自動(dòng)人身分析方面,Kelly等[82]設(shè)計(jì)了基于i向量的說(shuō)話人自動(dòng)刻畫(huà)系統(tǒng),能夠自動(dòng)分析說(shuō)話人的性別、年齡及語(yǔ)種。Watt等[83]對(duì)自動(dòng)口音識(shí)別與人工口音識(shí)別進(jìn)行了比對(duì)研究。

語(yǔ)音情感分析方面,Kathiresan等[84]研究了MFCC中的語(yǔ)音情感信息。Hippey等[85]探索了在語(yǔ)音中識(shí)別懊悔情緒的方法。Bizozzero等[86]研究了女性說(shuō)話人聲音中的恐懼信息,主要涉及基頻、語(yǔ)速以及音高對(duì)恐懼信息的影響。Satt等設(shè)計(jì)了一種使用卷積網(wǎng)絡(luò)與遞歸網(wǎng)絡(luò)兩種神經(jīng)網(wǎng)絡(luò)工[87]具直接從聲譜圖中識(shí)別情感的方法。Zhang等[88]針對(duì)對(duì)話語(yǔ)音設(shè)計(jì)了一個(gè)情感交流與轉(zhuǎn)換(EIT)模型挖掘?qū)υ捴械慕涣髋c轉(zhuǎn)換語(yǔ)中的情感信息,設(shè)計(jì)的算法比傳統(tǒng)方法在正確率與精度方面各提升了18.8%與22.6%。Parthasarathy、Le等[89-90]對(duì)深度學(xué)習(xí)中的多任務(wù)學(xué)習(xí)方法在語(yǔ)音情感識(shí)別中的應(yīng)用做了探索。除了一般性的情感識(shí)別外,語(yǔ)音測(cè)謊也是語(yǔ)音情感識(shí)別的研究熱點(diǎn)。Schroder[91]使用合成分析方法(analysis-bysynthesis)將不同的發(fā)聲方式、語(yǔ)速、顫音(tremolo)及基頻與中性言語(yǔ)(neutral utterances)組合,分別判斷各段語(yǔ)音的可信度。結(jié)果表明,當(dāng)顫音與氣息增加時(shí),語(yǔ)音內(nèi)容的可信度大大提升,當(dāng)暫停與基頻增加上,語(yǔ)音內(nèi)容的可信度則下降。Mendels[92]使用CXD語(yǔ)料庫(kù)比較了頻譜集合、聲學(xué)-韻律集合和用詞特征集合對(duì)于謊言的表征程度,并使用混合深度模型對(duì)這些集合進(jìn)行測(cè)試。

4 錄音的真實(shí)性檢驗(yàn)

錄音真實(shí)性檢驗(yàn)是指通過(guò)對(duì)錄音資料進(jìn)行語(yǔ)音學(xué)和聲學(xué)、電磁學(xué)、信號(hào)處理技術(shù)等方面的分析檢驗(yàn),做出其是否經(jīng)過(guò)剪輯的結(jié)論[4]。

2017年,Ali[93]等開(kāi)發(fā)了一套自動(dòng)系統(tǒng),系統(tǒng)基于心理聲學(xué)原理,準(zhǔn)確率達(dá)99.2%。Catalin等[94]為了解決檢驗(yàn)中無(wú)法獲取原始錄音器材的問(wèn)題,將18年間的125中錄音設(shè)備與40中商業(yè)錄音軟件的文件結(jié)構(gòu)與格式做了全面介紹。Jeff等[95]研究了iOS系統(tǒng)中的音頻文件,并基于決策樹(shù)建立了針對(duì)此類文件的檢驗(yàn)流程。Rashmika等[96]探討了錄音中的混響等噪音信息在真實(shí)性檢驗(yàn)中的價(jià)值。

電網(wǎng)頻率(ENF)檢測(cè)方法是錄音的真實(shí)性檢驗(yàn)中的熱點(diǎn)。關(guān)于這一方法的原理與具體內(nèi)容,可參見(jiàn)以往文獻(xiàn)[97-99]。Huang等[100]就ENF檢驗(yàn)中的一些常見(jiàn)問(wèn)題進(jìn)行了討論。James等[101]開(kāi)發(fā)了基于云端的便攜式ENF系統(tǒng),從而避免了檢驗(yàn)的地域限制。Huang等[102]提出用絕對(duì)誤差圖(absolute error map)聯(lián)系檢材音頻與ENF數(shù)據(jù)庫(kù)中的ENF信息,并據(jù)此構(gòu)建的兩套算法。Reis等[103]開(kāi)發(fā)了基于ESPRITHilbert檢測(cè)ENF的分析方法,結(jié)果大大優(yōu)于其他方法。

國(guó)內(nèi)方面,操文成[104]針對(duì)語(yǔ)音偽造的檢測(cè)提出了兩種新算法,漏檢率均低于10%。孫蒙蒙[105]提出了適用于音頻檢測(cè)的共生向量特征,基于該特征的方法準(zhǔn)確率可達(dá)95%。申小虎[106]等在系統(tǒng)分析數(shù)字音頻文件篡改方法基本原理的基礎(chǔ)上,使用多種頻譜分析方法尋找音頻文件的篡改特征,建立了有效的頻譜檢驗(yàn)的方法。

5 降噪及語(yǔ)音增強(qiáng)

降噪及語(yǔ)音增強(qiáng)是綜合運(yùn)用計(jì)算機(jī)技術(shù)、聲學(xué)技術(shù)對(duì)錄音資料進(jìn)行降低噪音信號(hào)、增強(qiáng)語(yǔ)音信號(hào)的處理技術(shù),目前主要的算法有自適應(yīng)噪聲抵消算法、統(tǒng)計(jì)模型算法、譜減法、聽(tīng)覺(jué)掩蔽算法,短時(shí)譜估計(jì)算法、子空間算法、小波變換算法等[4]。2017年,使用DNN方法降噪及語(yǔ)音增強(qiáng)成為熱點(diǎn)。

On dereverberation and echo cancellation, Guzewich et al. [107] studied a new DNN-based dereverberation method; earlier work [108-111] had already made progress with DNN dereverberation, and the new method reduced the equal error rate of a speaker-comparison system on processed audio from 9.2% to 6.8%. Bulling et al. [112] proposed a new echo-cancellation method that raises the signal's maximum stable gain (MSG) by 30 dB. On speech enhancement, Wu et al. [113] proposed a post-filtering method with difference compensation based on locally linear embedding (LLE). Ogawa et al. [114] extracted bottleneck features from a DNN acoustic model (DNN-AM) and used noise example search to remove highly non-stationary noise from single-channel audio. Gelderblom et al. [115] proposed a subjective evaluation method for DNN-based enhancement algorithms. Among non-DNN methods, Qian et al. [116] applied a Bayesian WaveNet directly to the raw waveform, also with good enhancement results. On denoising, Pascual et al. [117] used a generative adversarial network and demonstrated its effectiveness with both subjective and objective evaluations. Maiti et al. [118] used two networks for concatenative resynthesis, greatly increasing processing speed. Notably, in forensic practice background noise may contain useful information and must be preserved or even enhanced during noise reduction; casework therefore combines multiple methods to attenuate the target noise while retaining useful information, and the flexibility of some of the deep-learning methods above gives them a clear advantage here.

[1] 李敬陽(yáng).音像物證技術(shù) 第二章 : 聲音物證技術(shù)[M]//李學(xué)軍.新編物證技術(shù)學(xué). 北京:北京交通大學(xué)出版社,2015:339-360.

[2] HOLLIEN H.The acoustics of crime: the new science of forensic phonetics[M].New York: Plenum Press, 1990.

[3] 曹洪林,李敬陽(yáng),王英利,等.論聲紋鑒定意見(jiàn)的表述形式[J].證據(jù)科學(xué),2013,21(5):605-624.

[4] 王英利,李敬陽(yáng),曹洪林.聲紋鑒定技術(shù)綜述[J].警察技術(shù),2012(4):54-56.

[5] ERIKSSON A. Aural/acoustic vs. automatic methods in forensic phonetic casework[M]// NEUSTEIN A, PATIL H A. Forensic speaker recognition: law enforcement and counter-terrorism. New York: Springer, 2011: 41-69.

[6] GOLD E, FRENCH P. International practices in forensic speaker comparison[J]. International Journal of Speech Language and the Law, 2011, 18(2): 293-307.

[7] MORRISON G S, SAHITO F H, JARDINE G, et al. Interpol survey of the use of speaker identification by law enforcement agencies[J]. Forensic Science International, 2016, 263(3): 92-100.

[8] NOLAN F. The phonetic bases of speaker recognition[M]. Cambridge, UK: Cambridge University Press, 1983.

[9] HOLLIEN H, DIDLA G, HARNSBERGER J D, et al. The case for aural perceptual speaker identification[J]. Forensic Science International, 2016, 269(3): 8-20.

[10] ROSE P. Forensic speaker identification[M]. London: Taylor and Francis, 2002.

[11] 中華人民共和國(guó)公安部,法庭科學(xué)語(yǔ)音同一認(rèn)定技術(shù)規(guī)范:GA/T 1433-2017[S].北京 :中國(guó)標(biāo)準(zhǔn)出版社,2017.

[12] 中華人民共和國(guó)公安部,法庭錄音的真實(shí)性檢驗(yàn)技術(shù)規(guī)范:GA/T 1432-2017 [S]. 北京:中國(guó)標(biāo)準(zhǔn)出版社,2017.

[13] 中華人民共和國(guó)公安部,法庭科學(xué)降噪及語(yǔ)音增強(qiáng)技術(shù)規(guī)范:GA/T 1431-2017[S] .北京:中國(guó)標(biāo)準(zhǔn)出版社,2017.

[14] 中華人民共和國(guó)公安部,法庭科學(xué)語(yǔ)音人身分析技術(shù)規(guī)范:GA/T 1430-2017[S] .北京 :中國(guó)質(zhì)檢出版社,2017.

[15] 中華人民共和國(guó)司法部司法鑒定管理局,錄音資料鑒定規(guī)范 :SF/Z JD0301001-2010[S].北京:中國(guó)標(biāo)準(zhǔn)出版社,2010.

[16] CAIN S. American Board of Recorded Evidence-Voice Comparison Standards[EB/OL]. (1998)[ 2017-10-15]. http://www.forensictapeanalysisinc.com/Articles/voice_comp.htm

[17] SUNDQVIST M, LEINONEN T, LINDH J, et al. Blind test procedure to avoid bias in perceptual analysis for forensic speaker comparison casework[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 45-47.

[18] LINDH J, NAUTSCH A, LEINONEN T, et al. Comparison between perceptual and automatic systems on Finnish phone speech data (FinEval1) - a pilot test using score simulations[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 86-87.

[19] LEINONEN T, LINDH J, AKESSON J. Creating linguistic feature set templates for perceptual forensic speaker comparison in Finnish and Swedish[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 126-128.

[20] LAND E, GOLD E. Speaker identification using laughter in a close social network[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 99-101.

[21] SKARNITZL R, RŮŽIČKOVÁ A. The malleability of speech production: an examination of sophisticated voice disguise[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 59-60.

[22] RŮŽIČKOVÁ A, SKARNITZL R. Voice disguise strategies in Czech male speakers[J]. AUC Philologica, Phonetica Pragensia, 2017.

[23] DELVAUX V, CAUCHETEUX L, HUET K, et al. Voice disguise vs. impersonation: acoustic and perceptual measurements of vocal flexibility in non-experts[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 3777-3781.

[24] JESSEN M. Speaker-specific information in voice quality parameters[J]. Forensic Linguistics, 1997, 4(1): 84-103.

[25] KÖSTER O, KÖSTER J P. The auditory-perceptual evaluation of voice quality in forensic speaker recognition[J]. The Phonetician, 2004, 89: 9-37.

[26] NOLAN F. Voice quality and forensic speaker identification[J]. GOVOR, 2007, 24(2): 111-128.

[27] KÖSTER O, JESSEN M, KHAIRI F, et al. Auditory-perceptual identification of voice quality by expert and non-expert listeners[C]. ICPhS XVI, 2007: 1845-1848.

[28] SEGUNDO E, ALVES H, TRINIDAD M F. CIVIL corpus: voice quality for speaker forensic comparison[J]. Procedia - Social and Behavioral Sciences, 2013, 95(4): 587-593.

[29] FRENCH P. Developing the vocal profile analysis scheme for forensic voice comparison[C]. York, UK: IAFPA, 2016.

[30] SEGUNDO E. A simplified vocal profile analysis protocol for the assessment of voice quality and speaker similarity[J]. Journal of Voice, 2017, 31(5): 11-27.

[31] SEGUNDO E, BRAUN A, HUGHES V, et al. Speaker-similarity perception of Spanish twins and non-twins by native speakers of Spanish, German and English[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia:IAFPA,2017:159-162.

[32] KLUG K. Refining the Vocal Profile Analysis (VPA) scheme for forensic purposes[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 190-191.

[33] HUGHES V, HARRISON P, FOULKES P, et al. Mapping across feature spaces in forensic voice comparison: the contribution of auditory-based voice quality to (semi-)automatic system testing[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 3892-3896.

[34] HUGHES V, HARRISON P, FOULKES P, et al. The complementarity of automatic, semi-automatic, and phonetic measures of vocal tract output in forensic voice comparison[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 83-85.

[35] NOLAN F. Speaker identification evidence: its forms, limitations, and roles[C]// Proceedings of the conference 'Law and Language: Prospect and Retrospect'. University of Lapland, 2001.

[36] NOLAN F. Voice[M]// BOGAN P S, ROBERTS A. Identification: investigation, trial and scientific evidence. Jordan Publishing, 2011: 381-390.

[37] HEUVEN V, CORTES P. Speaker specificity of filled pauses compared with vowels and consonants in Dutch[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 48-49.

[38] GOLD E, ROSS S, EARNSHAW K. Delimiting the West Yorkshire population: examining the regional-specificity of hesitation markers[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 50-52.

[39] HE L, DELLWO V. Between-speaker intensity variability is maintained in different frequency bands of amplitude demodulated signal[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 55-58.

[40] DORREEN K, PAPP V. Bilingual speakers' long-term fundamental frequency distributions as cross-linguistic speaker discriminants[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 61-64.

[41] ARANTES P, ERIKSSON A, GUTZEIT. Effect of language, speaking style and speaker on long-term f0 estimation[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 3897-3901.

[42] DIMOS K, DELLWO V, HE L. Rhythm and speaker-specific variability in shouted speech[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 102-104.

[43] LOPEZ A, SAEIDI R, JUVELA L, et al. Normal-to-shouted speech spectral mapping for speaker recognition under vocal effort mismatch[C]// ICASSP. Proceedings of ICASSP2017.ICASSP,2017:4940-4944.

[44] HE L, DELLWO V. Speaker-specific temporal organizations of intensity contours[C] // IAFPA .Proceedings of IAFPA2017.Split, Croatia:IAFPA,2017:163-166.

[45] VAROŠANEC-ŠKARIĆ G, BAŠIĆ I, KIŠIČEK G. Comparison of vowel space of male speakers of Croatian, Serbian and Slovenian language[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 142-146.

[46] MCDOUGALL K, DUCKWORTH M. Fluency profiling for forensic speaker comparison: a comparison of syllable- and time-based approaches[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 129-131.

[47] WANG L, KANG J, LI J, et al. Speaker-specific dynamic features of diphthongs in Standard Chinese[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 91-95.

[48] HEEREN W. Speaker-dependency of /s/ in spontaneous telephone conversation[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 68-71.

[49] FRANCHINI S. Construction of a voice profile: an acoustic study of /l/[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 183-186.

[50] FINGERLING B. Constructing a voice profile: reconstruction of the L1 vowel set for a L2 speaker[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 197-199.

[51] RHODES R, FRENCH P, HARRISON P, et al. Which questions,propositions and ‘relevant populations’ should a speaker comparison expert assess[C]// IAFPA. Proceedings of IAFPA2017.Split, Croatia: IAFPA. 2017: 40-44.

[52] HUGHES V, WORMALD J. WikiDialects: a resource for assessing typicality in forensic voice comparison[C] // IAFPA.Proceedings of IAFPA2017. Split, Croatia: IAFPA. 2017: 154-155.

[53] HUGHES V, FOULKES P. What is the relevant population? Considerations for the computation of likelihood ratios in forensic voice comparison[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 3772-3776.

[54] AJILI M, BONASTRE J, KHEDER W, et al. Phonetic content impact on forensic voice comparison[C]// Spoken Language Technology Workshop(SLT), 2016 IEEE. IEEE, 2016:210–217.

[55] AJILI M, BONASTRE J, ROSSATTO S, et al. Inter-speaker variability in forensic voice comparison: a preliminary evaluation[C]//2016 IEEE International Conference on Acoustics,Speech and Signal Processing (ICASSP). IEEE, 2016:2114–2118.

[56] DODDINGTON G, LIGGETT W, MARTIN A, et al. Sheep,goats, lambs and wolves: A statistical analysis of speaker performance in the NIST 1998 speaker recognition evaluation[C]//Tech. Rep. DTIC Document, 1998.

[57] AJILI M, BONASTRE J, KHEDER W, et al. Homogeneity measure impact on target and non-target trials in forensic voice comparison[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 2844-2848.

[58] FRENCH P. A developmental history of forensic speaker comparison in the UK[J]. English Phonetics, 2017: 271-286.

[59] FRENCH P, HARRISON P. Position statement concerning use of impressionistic likelihood terms in forensic speaker comparison cases[J]. International Journal of Speech Language and the Law, 2007, 14(1): 137-144.

[60] Association of Forensic Science Providers. Standards for the formulation of evaluative forensic science expert opinion[J]. Science and Justice, 2009, 49: 161-164.

[61] VERMEULEN J, CAMBIER-LANGEVELD T. Outstanding cases: about case reports with a “strong” conclusion[C] // IAFPA .Proceedings of IAFPA2017. Split, Croatia: IAFPA. 2017: 31-33.

[62] NOLAN F, MCDOUGALL K, JONG G D, et al. A forensic phonetic study of 'dynamic' sources of variability in speech: the DyViS project[C]// Proceedings of the 11th Australian International Conference on Speech Science & Technology. University of Auckland, 2006: 13-18.

[63] MORRISON G S, ZHANG C, ENZINGER E, et al. Forensic voice comparison databases[DB/OL], 2015. http://www.forensic-voice-comparison.net/

[64] RAMOS D, GONZALEZ-RODRIGUEZ J, LUCENA-MOLINA J J. Addressing database mismatch in forensic speaker recognition with Ahumada III: a public real-casework database in Spanish[C]. International Speech Communication Association.2008.

[65] VLOED V D, BOUTEN J, LEEUWEN D. NFI-FRITS: a forensic speaker recognition database and some first experiments[C]// Proceedings of Odyssey: The Speaker and Language Recognition Workshop, 2014: 6-13.

[66] AJILI M, BONASTRE J, ROSSATO S. FABIOLE, a speech database for forensic speaker comparison[C]// Proceedings of LREC-Conference, Slovenia, 2016: 726-733.

[67] NAGRANI A, CHUNG J, ZISSERMAN A. VoxCeleb: a large-scale speaker identification dataset[J]. Sound, 2017.

[68] PARK S J, YEUNG G, KREIMAN J, et al. Using voice quality features to improve short-utterance, text-independent speaker verification systems[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 1522-1526.

[69] SOLEWICZ Y, JESSEN M, VAN DER VLOED. Null-Hypothesis LLR: a proposal for forensic automatic speaker recognition[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 2849-2853.

[70] TSCHÄPE N. Analysis of i-vector-based false-accept trials in a dialect labelled telephone corpus[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 65-67.

[71] ALEXANDER A. Not a lone voice: automatically identifying speakers in multi-speaker recordings[C] // IAFPA .Proceedings of IAFPA2017. Split, Croatia: IAFPA. 2017: 80-82.

[72] MILOŠEVIĆ M, GLAVITSCH U. Combining Gaussian mixture models and segmental feature models for speaker recognition[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 2042-2043.

[73] FRENCH J, HARRISON P, KIRCHHÜBEL C, et al. From receipt of recordings to dispatch of report: opening the blinds on lab practices[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 29-30.

[74] WAGNER I. The BKA standard operation procedure of forensic speaker comparison and examples of case work[C] // IAFPA .Proceedings of IAFPA2017. Split, Croatia: IAFPA. 2017: 34-36.

[75] 韓文靜,李海峰,阮華斌,等 .語(yǔ)音情感識(shí)別研究進(jìn)展綜述[J].軟件學(xué)報(bào),2014, 25(1):37-50.

[76] KOVAČIĆ D. Voice gender identification in cochlear implant users[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 23-25.

[77] GEORG A. The effect of dialect on age estimation[C] // IAFPA.Proceedings of IAFPA2017. Split, Croatia: IAFPA. 2017: 118-121.

[78] TOMIĆ K. Cross-language accent analysis for determination of origin[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 171-173.

[79] JONG-LENDLE G, KEHREIN R, URKE F, et al. Language identification from a foreign accent in German[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 135-138.

[80] SCHWAB S, AMATO M, DELLWO V, et al. Can we hear nicotine craving[C] // IAFPA .Proceedings of IAFPA2017. Split,Croatia: IAFPA. 2017: 115-117.

[81] RODMONGA P, TATIANA A, NIKOLAY B, et al. Perceptual auditory speech features of drug-intoxicated female speakers(preliminary results)[C] // IAFPA .Proceedings of IAFPA2017.Split, Croatia: IAFPA. 2017: 118-121.

[82] KELLY F, FORTH O, ATREYA A, et al. What your voice says about you: automatic speaker profiling using i-vectors[C] //IAFPA .Proceedings of IAFPA2017. Split, Croatia: IAFPA.2017: 72-75.

[83] WATT D, JENKINS M, BROWN G. Performance of human listeners vs. the Y-ACCDIST automatic accent classifier in an accent authentication task[C]// IAFPA. Proceedings of IAFPA2017. Split, Croatia: IAFPA, 2017: 139-141.

[84] KATHIRESAN T, DELLWO V. Cepstral dynamics in MFCCs using conventional deltas for emotion and speaker recognition[C] // IAFPA .Proceedings of IAFPA2017. Split, Croatia: IAFPA.2017: 105-108.

[85] HIPPEY F, GOLD E. Detecting remorse in the voice: A preliminary investigation into the perception of remorse using a voice line-up methodology[C] // IAFPA .Proceedings of IAFPA2017.Split, Croatia: IAFPA. 2017: 179-182.

[86] BIZOZZERO S, NETZSCHWITZ N, LEEMANN A. The effect of fundamental frequency f0, syllable rate and pitch range on listeners’ perception of fear in a female speaker’s voice[C]// IAFPA .Proceedings of IAFPA2017. Split, Croatia: IAFPA.2017: 174-178.

[87] SATT A, ROZENBERG S, HOORY R. Efficient emotion recognition from speech using deep learning on spectrograms[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 1089-1093.

[88] ZHANG R, ATSUSHI A, KOBASHIKAWA S, et al. Interaction and transition model for speech emotion recognition in dialogue[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 1094-1097.

[89] PARTHASARATHY S, BUSSO C. Jointly predicting arousal, valence and dominance with multi-task learning[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 1103-1107.

[90] LE D, ALDENEH Z, PROVOST E. Discretized continuous speech emotion recognition with multi-task deep recurrent neural network[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 1108-1112.

[91] SCHRODER A, STONE S, BIRKHOLZ P. The sound of deception - what makes a speaker credible[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 1467-1471.

[92] MENDELS G, LEVITAN S, LEE K. Hybrid acoustic-lexical deep learning approach for deception detection[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 1472-1476.

[93] ALI Z, IMRAN M, ALSULAIMAN M. An automatic digital audio authentication/forensics system[J]. Digital Object Identifier, 2017(5): 2994-3007.

[94] GRIGORAS C, SMITH J. Large scale test of digital audio file structure and format for forensic analysis[C]. 2017 AES International Conference on Audio Forensics, 2017.

[95] SMITH J, LACEY D, KOENIG B, et al. Triage approach for the forensic analysis of Apple iOS audio files recorded using the "Voice Memos" app[C]. 2017 AES International Conference on Audio Forensics, 2017.

[96] PATOLE R, KORE G, REGE P. Reverberation based tampering detection in audio recordings[C]. 2017 AES International Conference on Audio Forensics,2017.

[97] Advisory Panel of White House Tapes. The EOB Tape of June 20, 1972: Report on a Technical Investigation Conducted for the U.S. District Court for the District of Columbia[R]. 1974.

[98] GRIGORAS C. Application of ENF Analysis Method in Forensic Authentication of Digital Audio and Video Recordings[J]. Journal of the Audio Engineering Society, 2007, 57 (9) :643-661.

[99] GRIGORAS C. Statistical Tools for Multimedia Forensics[C].39th International Conference: Audio Forensics: Practices and Challenges, 2010.

[100] HUA G, THING V. On practical issues of electric network frequency based audio forensics[J]. IEEE Transactions on Information Forensics & Security,2017(5): 20640-20651.

[101] JAMES Z, GRIGORAS C, SMITH J. A low cost, cloud based, portable, remote ENF system[C]. 2017 AES International Conference on Audio Forensics ,2017.

[102] HUA G, ZHANG Y, GOH J. Audio authentication by exploring the absolute-error-map of ENF signals[J]. IEEE Transactions on Information Forensics & Security,2016(5):1003-1016.

[103] REIS P M G, MIRANDA R, GALDO G. ESPRIT-Hilbert based audio tampering detection with SVM classifier for forensic analysis via electrical network frequency[J]. IEEE Transactions on Information Forensics & Security, 2017(4): 853-864.

[104] 操文成.語(yǔ)音偽造盲檢測(cè)技術(shù)研究[D].成都:西南交通大學(xué), 2017.

[105] 孫蒙蒙.錄音真實(shí)性辨識(shí)和重翻錄檢測(cè)[D].深圳:深圳大學(xué), 2017.

[106] 申小虎,金恬,張長(zhǎng)珍,等.錄音資料真實(shí)性鑒定的頻譜檢驗(yàn)技術(shù)研究[J]. 刑事技術(shù) , 2017,42(3):173-177.

[107] GUZEWICH P, ZAHORIAN S. Improving speaker verification for reverberant conditions with deep neural network dereverberation processing[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 171-175.

[108] HAN K, WANG Y, WANG D. Learning spectral mapping for speech dereverberation[C]// 2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP),2014:4661-4665.

[109] HAN K, WANG Y, WANG D, et al. Learning spectral mapping for speech dereverberation and denoising[J]. IEEE/ACM Transactions on Audio, Speech,and Language Processing,2015,23 (6) :982-992.

[110] WU B, LI K, YANG M, et al. A study on target feature activation and normalization and their impacts on the performance of DNN based speech dereverberation systems[C]. 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2016.

[111] WU B, LI K, YANG M, et al. A reverberation-time-aware approach to speech dereverberation based on deep neural networks[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017, 25(1): 102-111.

[112] BULLING P, LINHARD K, WOLF A, et al. Stepsize control for acoustic feedback cancellation based on the detection of reverberant signal periods and the estimated system distance[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 176-180.

[113] WU Y C, HWANG H, WANG S, et al. A post-filtering approach based on locally linear embedding difference compensation for speech enhancement[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 1953-1957.

[114] OGAWA A, KINOSHITA K, DELCROIX M, et al. Improved example-based speech enhancement by using deep neural network acoustic model for noise robust example search[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 1963-1967.

[115] GELDERBLOM F B, GRONSTAD T, VIGGEN E. Subjective intelligibility of deep neural network-based speech enhancement[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 1968-1972.

[116] QIAN K, ZHANG Y, CHANG S, et al. Speech enhancement using Bayesian WaveNet[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 2013-2017.

[117] PASCUAL S, BONAFONTE A, SERRA J. SEGAN: speech enhancement generative adversarial network[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 3642-3646.

[118] MAITI S, MANDEL M. Concatenative resynthesis using twin networks[C]// ISCA. Proceedings of Interspeech 2017. Stockholm, Sweden: ISCA, 2017: 3647-3651.
