Journal of University of Chinese Academy of Sciences

2005, Vol. 22, Issue (2): 140-146. DOI: 10.7523/j.issn.2095-6134.2005.2.003

An Improved Incremental Approach to Speech Corpus Selection

NING Zhen-Jiang, DU Li-Min   

  1. Labs for Speech Interaction Technology Research, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100080, China
  • Received: 2004-04-12  Revised: 2004-06-08  Online: 2005-03-15

Abstract:

In this paper, a novel incremental corpus selection approach is proposed that controls the balance of phone units in the selected corpus more effectively by erasing redundant sentences at each selection phase. In our experiments, we employ a huge original data source consisting of about 20 million sentences and 847 phone contexts. The corpus generated from this data source is made up of 17,865 sentences and achieves a coverage of 94.3%, counting only phone contexts that appear more than 10 times in the selected corpus. In addition, it achieves a relatively low distribution variance of 0.18 × 10⁻³. Experimental results show that our approach is much better than traditional algorithms not only in phonetic unit coverage but also in phonetic unit distribution variance. Moreover, our algorithm has low computational complexity and memory cost.
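
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of what a greedy incremental selection with an erase step after each phase might look like. It is not the authors' algorithm. The function name `select_corpus`, the input format (each sentence as a list of unit labels), the gain function, and the redundancy test (a sentence is redundant when every unit it contains stays at or above `min_count` without it) are all assumptions made for illustration.

```python
from collections import Counter


def select_corpus(sentences, min_count=2):
    """Greedy incremental selection with a pruning pass after every phase.

    sentences: list of lists of unit labels (hypothetical input format).
    Returns the sorted indices of the sentences kept in the corpus.
    """
    units = [Counter(s) for s in sentences]
    counts = Counter()                  # unit -> occurrences in the selection
    selected = []                       # indices, in selection order
    remaining = set(range(len(sentences)))

    def gain(i):
        # Occurrences that move an under-covered unit toward min_count
        # (assumed scoring rule, not taken from the paper).
        return sum(min(n, min_count - counts[u])
                   for u, n in units[i].items() if counts[u] < min_count)

    while remaining:
        # Ties are broken by the lowest index to keep the result deterministic.
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break                       # nothing left improves coverage
        remaining.remove(best)
        selected.append(best)
        counts.update(units[best])

        # Erase step: drop earlier picks that have become redundant, i.e.
        # every unit they contain stays at or above min_count without them.
        for i in selected[:-1]:
            if all(counts[u] - n >= min_count for u, n in units[i].items()):
                counts.subtract(units[i])
                selected.remove(i)
    return sorted(selected)
```

For example, with the three toy sentences `list("abcd")`, `list("abef")`, and `list("cdgh")` and `min_count=1`, the first sentence is picked first but becomes redundant once the other two are selected, so the sketch returns `[1, 2]`. Pruning in this way keeps frequent units from piling up, which is the kind of balance control the abstract describes.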

Key words: speech recognition, acoustic model, corpus selection
