Robust Voice Activity Detection with Deep Maxout Neural Networks

  •  Valentin Sergeyevich Mendelev    
  •  Tatiana Nikolaevna Prisyach    
  •  Alexey Alexandrovich Prudnikov    


Voice activity detection (VAD) under non-stationary noise is an important problem for real-world automatic speech recognition systems, especially when a remote microphone is used. Many existing methods perform poorly when the noise changes over time or when the signal-to-noise ratio (SNR) is very low. This paper proposes a method based on deep maxout neural networks with dropout regularization. The method remains effective at very low SNR (down to -5 dB). Its robustness is demonstrated by low false rejection (FR) and false alarm (FA) error rates on a test dataset recorded under conditions different from those of the training dataset.
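
As a rough illustration of the two building blocks named in the abstract, the sketch below shows a generic maxout layer (each unit outputs the maximum over several linear "pieces") followed by inverted dropout. It is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the function names `maxout_layer` and `dropout`, the toy weights, and the layer sizes are all hypothetical, and the abstract does not specify the network's input features, depth, or hyperparameters.

```python
import random


def maxout_layer(x, weights, biases):
    """Generic maxout layer (illustrative, not the paper's code).

    weights[i][j] is the weight vector of linear piece j of unit i,
    biases[i][j] is the matching bias. Each unit returns the maximum
    of its linear pieces.
    """
    outputs = []
    for unit_w, unit_b in zip(weights, biases):
        pieces = [
            sum(w * xi for w, xi in zip(w_vec, x)) + b
            for w_vec, b in zip(unit_w, unit_b)
        ]
        outputs.append(max(pieces))
    return outputs


def dropout(h, p_drop, rng, training=True):
    """Inverted dropout: zero each activation with probability p_drop
    during training and rescale the survivors by 1 / (1 - p_drop)."""
    if not training or p_drop == 0.0:
        return list(h)
    keep = 1.0 - p_drop
    return [v / keep if rng.random() < keep else 0.0 for v in h]


# Toy example: 2 inputs, 2 maxout units with 2 linear pieces each.
x = [1.0, -2.0]
weights = [
    [[1.0, 0.0], [0.0, 1.0]],     # unit 0
    [[-1.0, -1.0], [0.5, 0.5]],   # unit 1
]
biases = [
    [0.0, 0.0],
    [0.5, 0.0],
]

hidden = maxout_layer(x, weights, biases)
print(hidden)  # [1.0, 1.5]

rng = random.Random(0)  # fixed seed so the dropout mask is repeatable
print(dropout(hidden, p_drop=0.5, rng=rng))
```

At inference time, `dropout(..., training=False)` returns the activations unchanged, which is why the survivors are rescaled during training.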

This work is licensed under a Creative Commons Attribution 4.0 License.
  • ISSN (Print): 1913-1844
  • ISSN (Online): 1913-1852
  • Started: 2007
  • Frequency: monthly
