Journal of University of Chinese Academy of Sciences

2015, Vol. 32, Issue (1): 97-102. DOI: 10.7523/j.issn.2095-6134.2015.01.016


Bottleneck features and subspace Gaussian mixture models for low-resource speech recognition

WU Weilan1,2, CAI Meng3, TIAN Yao3, YANG Xiaohao3, CHEN Zhenfeng1,2, LIU Jia3, XIA Shanhong2   

  1. University of Chinese Academy of Sciences, Beijing 100190, China;
  2. State Key Laboratory of Transducer Technology, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China;
  3. Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
  • Received: 2014-02-27; Revised: 2014-03-07; Online: 2015-01-15

Abstract:

State-of-the-art speech recognition systems typically depend on large amounts of training data and perform poorly when only limited data is available. In this paper, we study speech recognition under low-resource conditions. The subspace Gaussian mixture model (SGMM) is first applied to reduce the number of model parameters. The model is then further enhanced by discriminative training based on the maximum mutual information (MMI) criterion. Bottleneck features extracted from deep neural networks are next studied as a means of robust feature extraction. Finally, the SGMM and the bottleneck features are combined to build a novel speech recognition system for low-resource conditions. On the standard OpenKWS 2013 evaluation corpus, experimental results show that combining the two techniques yields a substantial relative improvement of about 12% over the baseline system.
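To make the parameter-reduction idea concrete, the following is a minimal sketch of the standard SGMM parameterisation, in which each state's Gaussian means and mixture weights are derived from shared projections and a low-dimensional state vector (mu_ji = M_i v_j, with weights given by a softmax over w_i^T v_j). It is not the authors' implementation. The function names and all sizes used here (number of states, Gaussians, feature and subspace dimensions) are hypothetical and chosen only for illustration; none of them come from the paper.

```python
import numpy as np

def sgmm_state_params(M, w, v_j):
    """Derive one state's means and weights from shared SGMM parameters.

    M   : (I, D, S) shared mean-projection matrices, one per Gaussian index i
    w   : (I, S)    shared weight-projection vectors
    v_j : (S,)      low-dimensional vector for state j
    """
    means = M @ v_j                  # (I, D): mu_ji = M_i v_j
    logits = w @ v_j                 # (I,):  w_i^T v_j
    logits = logits - logits.max()   # stabilise the softmax numerically
    weights = np.exp(logits) / np.exp(logits).sum()
    return means, weights

def count_params(num_states, gauss_per_state, I, D, S):
    """Rough parameter counts for illustrative (assumed) model sizes."""
    # Conventional GMM-HMM with diagonal covariances:
    # each Gaussian has a mean (D), a variance (D), and a weight (1).
    gmm = num_states * gauss_per_state * (2 * D + 1)
    # SGMM: I shared projections M_i (D*S), weight vectors w_i (S), and
    # full covariances (D*(D+1)/2), plus one S-dim vector per state.
    shared = I * (D * S + S + D * (D + 1) // 2)
    sgmm = shared + num_states * S
    return gmm, sgmm

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    I, D, S = 4, 3, 2                # tiny hypothetical sizes
    M = rng.normal(size=(I, D, S))
    w = rng.normal(size=(I, S))
    v_j = rng.normal(size=S)
    means, weights = sgmm_state_params(M, w, v_j)
    print(means.shape, weights.sum())
    print(count_params(num_states=2000, gauss_per_state=16, I=400, D=39, S=40))
```

Under these assumed sizes, most SGMM parameters are shared across states, so adding a state costs only S extra values rather than a full set of Gaussians, which is the property that helps when training data is scarce.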

Key words: speech recognition, low-resource, acoustic model, acoustic feature
