[1] Laffitte P, Sodoyer D, Tatkeu C, et al. Deep neural networks for automatic detection of screams and shouted speech in subway trains[C]//Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016:6460-6464.
[2] Parascandolo G, Huttunen H, Virtanen T. Recurrent neural networks for polyphonic sound event detection in real life recordings[C]//Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016:6440-6444.
[3] Schröder J, Anemiiller J, Goetze S. Classification of human cough signals using spectro-temporal Gabor filterbank features[C]//Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016:6455-6459.
[4] Xu M, Duan L Y, Xu C, et al. Event detection in basketball video using multiple modalities[C]//Information, Communications and Signal Processing, 2003 and Fourth Pacific Rim Conference on Multimedia. IEEE, 2003, 3:1526-1530.
[5] Knox M T, Morgan N, Mirghafori N. Getting the last laugh:automatic laughter segmentation in meetings[C]//INTERSPEECH. ISCA, 2008:797-800.
[6] Atrey P K, Maddage N C, Kankanhalli M S. Audio based event detection for multimedia surveillance[C]//Acoustics, Speech and Signal Processing(ICASSP).IEEE, 2006, 5:813-816.
[7] Smaragdis P. Non-negative matrix factor deconvolution; extraction of multiple sound sources from monophonic inputs[C]//ICA. Berlin:Springer, 2004,3195:494-499.
[8] Takahashi N, Gygli M, Pfister B, et al. Deep convolutional neural networks and data augmentation for acoustic event detection[C]//INTERSPEECH. ISCA, 2016,805:2982-2986.
[9] Aytar Y, Vondrick C, Torralba A. Soundnet:learning sound representations from unlabeled video[C]//Advances in Neural Information Processing Systems(NIPS). MIT Press, 2016:892-900.
[10] Zhuang X, Zhou X, Hasegawa-Johnson M A, et al. Real-world acoustic event detection[J]. Pattern Recognition Letters, 2010, 31(12):1543-1551.
[11] Hayashi T, Watanabe S, Toda T, et al. BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic sound event detection[C]//Acoustics, Speech and Signal Processing (ICASSP).IEEE, 2017:766-770.
[12] Hayashi T, Watanabe S, Toda T, et al. Convolutional bidirectional long short-term memory hidden Markov model hybrid system for polyphonic sound event detection[J]. Journal of the Acoustical Society of America, 2016, 140(4):3404.
[13] Mesaros A, Heittola T, Diment A, et al. DCASE 2017 challenge setup:tasks, datasets and baseline system[C]//Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop(DCASE2017). IEEE, 2017:85-92.
[14] Hinton G, Deng L, Yu D, et al. Deep neural networks for acoustic modeling in speech recognition:the shared views of four research groups[J]. IEEE Signal Processing Magazine, 2012, 29(6):82-97.
[15] Hinton G E. Training products of experts by minimizing contrastive divergence[J]. Neural Computation, 2014, 14(8):1771-1800.
[16] Srivastava N, Hinton G, Krizhevsky A, et al. Dropout:a simple way to prevent neural networks from overfitting[J]. Journal of Machine Learning Research, 2014, 15(1):1929-1958.
[17] Ioffe S, Szegedy C. Batch normalization:accelerating deep network training by reducing internal covariate shift[C]//International Conference on Machine Learning. JMLR, 2015:448-456.
[18] Kingma D P, Ba J. Adam:a method for stochastic optimization[C]//International Conference for Learning Representations (ICLR). arXiv preprint. arXiv:1412.6980, 2014,6:1-13.
[19] Pan S J, Yang Q. A survey on transfer learning[J]. IEEE Transactions on knowledge and data engineering, 2010, 22(10):1345-1359.
[20] Mesaros A, Heittola T, Virtanen T. Metrics for polyphonic sound event detection[J]. Applied Sciences, 2016, 6(6):162.
[21] Zhou Q, Feng Z. Robust sound event detection through noise estimation and source separation using NMF[C]//Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop(DCASE2017). IEEE, 2017:138-142.
[22] Cakir E, Virtanen T. Convolutional recurrent neural networks for rare sound event detection[C]//Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop(DCASE2017). IEEE, 2017:27-31. |