[1] Rabiner L R, Sambur M R. An algorithm for determining the endpoints of isolated utterances[J]. The Bell System Technical Journal, 1975, 54(2): 297-315.
[2] Lu L, Jiang H, Zhang H J. A robust audio classification and segmentation method[C]//Proceedings of the Ninth ACM International Conference on Multimedia. ACM, 2001: 203-211.
[3] Shen J, Hung J, Lee L. Robust entropy-based endpoint detection for speech recognition in noisy environments[C]//ICSLP. 1998, 98: 232-235.
[4] Huang L, Yang C. A novel approach to robust speech endpoint detection in car environments[C]//Acoustics, Speech, and Signal Processing. IEEE International Conference on. IEEE, 2000, 3: 1751-1754.
[5] Haigh J A, Mason J S. Robust voice activity detection using cepstral features[C]//TENCON'93 Proceedings. Computer, Communication, Control and Power Engineering. 1993 IEEE Region 10 Conference on. IEEE, 1993: 321-324.
[6] Martin A, Charlet D, Mauuary L. Robust speech/non-speech detection using LDA applied to MFCC[C]//Acoustics, Speech, and Signal Processing. IEEE International Conference on. IEEE, 2001, 1: 237-240.
[7] Kinnunen T, Chernenko E, Tuononen M, et al. Voice activity detection using MFCC features and support vector machine[C]//Int Conf on Speech and Computer. Moscow, Russia, 2007, 2: 556-561.
[8] Wang H, Xu Y, Li M. Study on the MFCC similarity-based voice activity detection algorithm[C]//Artificial Intelligence, Management Science and Electronic Commerce, 2011 2nd International Conference on. IEEE, 2011: 4391-4394.
[9] Wang H Z, Xu Y C, Li M J. Voice activity detection algorithm based on Mel frequency cepstrum coefficient (MFCC) similarity[J]. Journal of Jilin University: Engineering and Technology Edition, 2012, 42(10): 1331-1335 (in Chinese). 王宏志, 徐玉超, 李美静. 基于Mel频率倒谱参数相似度的语音端点检测算法[J]. 吉林大学学报: 工学版, 2012, 42(10): 1331-1335.
[10] Cho N, Kim E K. Enhanced voice activity detection using acoustic event detection and classification[J]. Consumer Electronics, IEEE Transactions on, 2011, 57(1): 196-202.
[11] Ramirez J, Segura J C, Benitez C, et al. Efficient voice activity detection algorithms using long-term speech information[J]. Speech Communication, 2004, 42(3): 271-287.
[12] Ishizuka K, Nakatani T, Fujimoto M, et al. Noise robust voice activity detection based on periodic to aperiodic component ratio[J]. Speech Communication, 2010, 52(1): 41-60.
[13] Ramirez J, Segura J C, Benitez C, et al. An effective subband OSF-based VAD with noise reduction for robust speech recognition[J]. Speech and Audio Processing, IEEE Transactions on, 2005, 13(6): 1119-1129.
[14] Davis S, Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences[J]. Acoustics, Speech and Signal Processing, IEEE Transactions on, 1980, 28(4): 357-366.
[15] Restrepo A, Hincapie G, Parra A. On the detection of edges using order statistic filters[C]//Image Processing, 1994. IEEE International Conference. IEEE, 1994, 1: 308-312.
[16] Oten R, de Figueiredo R J P. An efficient method for L-filter design[J]. Signal Processing, IEEE Transactions on, 2003, 51(1): 193-203.
[17] Garofolo J. DARPA TIMIT: Acoustic-phonetic continuous speech corpus CD-ROM[CD]. US Dept of Commerce, National Institute of Standards and Technology, 1993.
[18] Varga A H, Steeneken H, Tomlinson M, et al. The NOISEX-92 CD-ROMs[CD]. The NOISEX-92 study on the effect of additive noise on automatic speech recognition, 1992.