Journal of University of Chinese Academy of Sciences ›› 2014, Vol. 31 ›› Issue (5): 714-719. DOI: 10.7523/j.issn.2095-6134.2014.05.019

• Brief Report •

Data selection method in speaker recognition based on classification of phonemes

WU Weilan1,2, ZHANG Weiqiang3, LIU Weiwei3, TIAN Yao3, CHEN Zhenfeng1,2, LIU Jia3, XIA Shanhong1

  1. State Key Laboratory on Transducing Technology, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China;
  2. University of Chinese Academy of Sciences, Beijing 100190, China;
  3. Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
  • Received: 2013-06-14 Revised: 2013-11-04 Published: 2014-09-15
  • Corresponding author: XIA Shanhong, E-mail: shxia@mail.ie.ac.cn
  • Supported by: National Natural Science Foundation of China (61005019, 61273268, 90920302) and Beijing Natural Science Foundation (KZ201110005005)

Abstract:

In speaker recognition, selecting the useful speech data is an important pre-processing step. Common data selection methods extract useful data according to signal energy, but in practice the energy level has no necessary connection with whether the data is speech. After analyzing and comparing the traditional methods, we introduce linguistic knowledge and propose a data selection method based on consonant information, built on a phoneme decoder. The method analyzes the phoneme recognition results within the output of voice activity detection, keeps all vowels, and screens the consonants, removing the interfering consonant phonemes that do not help speaker recognition. Experiments show that the speaker recognition results obtained with the proposed method are clearly better than those of traditional energy-based data selection algorithms, such as the voice activity detection algorithm of the G.723.1 standard and the recently proposed endpoint detection algorithm based on cross-entropy order statistics filtering.

Key words: speaker recognition, useful information, phoneme decoder, consonant
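The selection rule described in the abstract (keep every vowel, drop only the consonants judged unhelpful) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Segment` type, the phoneme labels, the contents of `VOWELS` and `DISCARDED_CONSONANTS`, and the 10 ms frame shift are all assumptions made for illustration; the abstract does not say which consonants are removed. The input is assumed to be phoneme decoder output already restricted to regions marked active by voice activity detection.

```python
from typing import List, NamedTuple, Set


class Segment(NamedTuple):
    label: str    # phoneme label from a (hypothetical) phoneme decoder
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds


# Hypothetical phoneme inventory, for illustration only.
VOWELS: Set[str] = {"a", "e", "i", "o", "u"}
# Hypothetical set of "interfering" consonants; the abstract does not list them.
DISCARDED_CONSONANTS: Set[str] = {"s", "f", "h"}


def select_segments(segments: List[Segment],
                    vowels: Set[str] = VOWELS,
                    discarded: Set[str] = DISCARDED_CONSONANTS) -> List[Segment]:
    """Keep all vowels, and every consonant that is not in the discarded set."""
    return [s for s in segments
            if s.label in vowels or s.label not in discarded]


def frame_mask(segments: List[Segment], n_frames: int,
               frame_shift: float = 0.01) -> List[bool]:
    """Mark the feature frames covered by the given segments as True."""
    mask = [False] * n_frames
    for s in segments:
        first = max(0, int(round(s.start / frame_shift)))
        last = min(n_frames, int(round(s.end / frame_shift)))
        for i in range(first, last):
            mask[i] = True
    return mask


decoded = [Segment("a", 0.0, 0.1), Segment("s", 0.1, 0.2), Segment("m", 0.2, 0.3)]
kept = select_segments(decoded)
mask = frame_mask(kept, n_frames=30)
```

In this sketch the resulting mask would be applied to the frame-level features before speaker modeling, so that only frames from retained phonemes contribute.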

CLC number: