Robust Voice Activity Detection with Deep Maxout Neural Networks

Valentin Sergeyevich Mendelev, Tatiana Nikolaevna Prisyach, Alexey Alexandrovich Prudnikov

Abstract


Voice activity detection (VAD) in non-stationary noise is an important problem for real-world automatic speech recognition systems, especially when a remote microphone is used. Many existing methods perform poorly when the noise changes over time or when the signal-to-noise ratio (SNR) is very low. This paper proposes a method based on deep maxout neural networks with dropout regularization. The method remains effective at very low SNR (down to -5 dB). Its robustness is demonstrated by low FR/FA error rates on a test dataset recorded under conditions different from those of the training dataset.
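
For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal sketch of a maxout layer and of dropout in plain Python. It is not the authors' implementation: the abstract does not give the network's input features, layer sizes, or number of pieces per unit, so the function names (`maxout_layer`, `dropout`), the toy dimensions, and the weights below are all hypothetical and chosen only for illustration.

```python
import random

def maxout_layer(x, weights, biases):
    """Maxout layer: each output unit is the max over several affine pieces.

    x:       list of d input floats
    weights: weights[j][p] is the d-length weight vector of piece p of unit j
    biases:  biases[j][p] is the bias of piece p of unit j
    (All shapes here are illustrative assumptions, not taken from the paper.)
    """
    out = []
    for unit_w, unit_b in zip(weights, biases):
        # Evaluate every affine piece w . x + b of this unit.
        pieces = [sum(wi * xi for wi, xi in zip(w, x)) + b
                  for w, b in zip(unit_w, unit_b)]
        # The unit's activation is the largest piece.
        out.append(max(pieces))
    return out

def dropout(h, rate, rng, training=True):
    """Inverted dropout: during training, zero each activation with
    probability `rate` and rescale the survivors by 1 / (1 - rate)."""
    if not training or rate == 0.0:
        return list(h)
    keep = 1.0 - rate
    return [hi / keep if rng.random() < keep else 0.0 for hi in h]

# Toy example (hypothetical numbers): 2 inputs, 2 maxout units, 2 pieces each.
x = [1.0, -2.0]
weights = [[[1.0, 0.0], [0.0, 1.0]],
           [[-1.0, 0.0], [0.5, 0.5]]]
biases = [[0.0, 0.0], [0.0, 1.0]]

h = maxout_layer(x, weights, biases)
h_train = dropout(h, rate=0.5, rng=random.Random(0))
```

In a frame-level VAD setting, a stack of such layers would typically end in a speech/non-speech output unit; that final stage is omitted here because the abstract does not describe it.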


DOI: http://dx.doi.org/10.5539/mas.v9n8p153


Modern Applied Science   ISSN 1913-1844 (Print)   ISSN 1913-1852 (Online)  Email: mas@ccsenet.org

Copyright © Canadian Center of Science and Education
