Robust Voice Activity Detection with Deep Maxout Neural Networks

Valentin Sergeyevich Mendelev, Tatiana Nikolaevna Prisyach, Alexey Alexandrovich Prudnikov


Voice activity detection (VAD) in non-stationary noise is an important problem for real-world automatic speech recognition systems, especially when a remote microphone is used. Many existing methods perform poorly when the noise changes over time or when the signal-to-noise ratio (SNR) is very low. This paper proposes a method based on deep maxout neural networks with dropout regularization. The method remains effective at very low SNR (down to -5 dB). Its robustness is demonstrated by low FR/FA error rates on a test dataset recorded under conditions different from those of the training dataset.
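
As a rough illustration of the two building blocks the abstract names, the sketch below shows a maxout layer (each output unit takes the maximum over several linear pieces) followed by dropout. It is not the authors' implementation: the function names, the toy weights, the 0.5 dropout rate, and the choice of the "inverted" dropout variant are all assumptions made for this example, and the abstract does not specify the network's actual sizes or input features.

```python
import random

def maxout_layer(x, weights, biases):
    """Maxout layer for one input vector (hypothetical helper).

    x:       list of d_in floats
    weights: weights[k][j] is the d_in-long weight vector of linear piece k
             for output unit j
    biases:  biases[k][j] is the matching bias
    Each output unit is the maximum over its linear pieces.
    """
    n_pieces = len(weights)
    n_out = len(weights[0])
    out = []
    for j in range(n_out):
        pieces = [
            sum(w * xi for w, xi in zip(weights[k][j], x)) + biases[k][j]
            for k in range(n_pieces)
        ]
        out.append(max(pieces))
    return out

def dropout(h, rate, rng, training=True):
    """Inverted dropout (an assumed variant): during training, zero each
    unit with probability `rate` and rescale the survivors by 1/(1 - rate);
    at inference time, return the activations unchanged."""
    if not training or rate == 0.0:
        return list(h)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in h]

# Toy usage: 2 inputs, 2 output units, 2 linear pieces per unit.
x = [1.0, -2.0]
weights = [
    [[1.0, 0.0], [0.0, 1.0]],    # piece 0: identity
    [[-1.0, 0.0], [0.0, -1.0]],  # piece 1: negated identity
]
biases = [[0.0, 0.0], [0.0, 0.0]]
hidden = maxout_layer(x, weights, biases)  # max(x, -x) per unit, i.e. |x|
dropped = dropout(hidden, rate=0.5, rng=random.Random(0))
```

With these toy weights the maxout layer reproduces the absolute-value function, which shows how a maxout unit can learn its own piecewise-linear activation instead of using a fixed one.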


Modern Applied Science   ISSN 1913-1844 (Print)   ISSN 1913-1852 (Online)

Copyright © Canadian Center of Science and Education
