Blind Room Volume Estimation from Single-channel Noisy Speech

Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Organized by IEEE

Publication

Recent work on acoustic parameter estimation indicates that geometric room volume can be useful for modeling the character of an acoustic environment. However, estimating volume from audio signals remains a challenging problem. Here we propose using a convolutional neural network model to estimate room volume blindly from reverberant single-channel speech signals in the presence of noise. The model is shown to produce estimates within approximately a factor of two of the true value, for rooms ranging in size from small offices to large concert halls.
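
To make the "within a factor of two" criterion concrete, the following is a minimal sketch of how such a check could be computed on a log scale. The function names (`within_factor`, `fraction_within_factor`) and the volumes in the usage note are hypothetical illustrations, not the evaluation code or data from the paper.

```python
import math


def within_factor(estimate_m3, true_m3, factor=2.0):
    """Return True if the estimate lies within a multiplicative `factor`
    of the true volume, i.e. true/factor <= estimate <= true*factor.

    Assumption: volumes are positive and in the same unit (e.g. cubic metres).
    """
    if estimate_m3 <= 0 or true_m3 <= 0:
        raise ValueError("volumes must be positive")
    # Comparing log-ratios treats over- and under-estimates symmetrically.
    return abs(math.log(estimate_m3 / true_m3)) <= math.log(factor)


def fraction_within_factor(estimates, truths, factor=2.0):
    """Fraction of (estimate, truth) pairs that satisfy within_factor."""
    pairs = list(zip(estimates, truths))
    if not pairs:
        raise ValueError("need at least one estimate")
    hits = sum(within_factor(e, t, factor) for e, t in pairs)
    return hits / len(pairs)
```

For example, with a hypothetical true volume of 100 cubic metres, estimates of 150 and 60 would count as within a factor of two, while 250 and 40 would not.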


Figure: Confusion matrices of the training set (left), test set (center), and the ACE corpus (right).