Modelling and estimation of the spatial impulse response in reverberant conditions

Acoustical Society of America

Publication

Modern audio signal processing and speech enhancement rely increasingly on machine learning approaches, which require vast amounts of training data. One way to create a training dataset is to convolve a measured impulse response between the sound source and the device with clean speech and then add noise. This approach is tied to the particular pair of sound source and microphone used, because the measured response incorporates not only the reverberation of the room but also the radiation pattern of the sound source (typically a mouth simulator or a head and torso simulator) and the directivity patterns of the microphones in the device under test. In this paper we propose using a spherical loudspeaker array as the transmitter and a spherical microphone array as the receiver to create an impulse response that is independent of both the sound source and the receiver. During dataset synthesis, this spatial impulse response is modified to model the impulse response between a transmitter and a receiver with given directivity patterns.
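
The conventional synthesis step mentioned above (convolve a measured impulse response with clean speech, then add noise) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method proposed in the paper: the function name `synthesize_noisy_reverberant`, the toy exponentially decaying impulse response, the random placeholder signals, and the choice to scale the noise to a target signal-to-noise ratio are all hypothetical and introduced only for this example.

```python
import numpy as np


def synthesize_noisy_reverberant(clean, rir, noise, snr_db):
    """Convolve clean speech with an impulse response and add scaled noise.

    Hypothetical helper for illustration; the noise is scaled so that the
    reverberant-speech-to-noise power ratio equals snr_db.
    """
    # Full linear convolution: output length is len(clean) + len(rir) - 1.
    reverberant = np.convolve(clean, rir)
    # Repeat or truncate the noise so it matches the reverberant signal length.
    noise = np.resize(noise, reverberant.shape)
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain chosen so that 10*log10(speech_power / scaled_noise_power) == snr_db.
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = gain * noise
    return reverberant + scaled_noise, reverberant, scaled_noise


# Placeholder signals (assumptions): white noise stands in for clean speech,
# and a decaying noise burst stands in for a measured room impulse response.
rng = np.random.default_rng(0)
fs = 16000
clean = rng.standard_normal(fs)                         # 1 s "speech" placeholder
t = np.arange(int(0.3 * fs)) / fs
rir = rng.standard_normal(t.size) * np.exp(-t / 0.05)   # toy 0.3 s impulse response
noise = rng.standard_normal(fs)

mixture, reverberant, scaled_noise = synthesize_noisy_reverberant(
    clean, rir, noise, snr_db=20.0
)
achieved_snr_db = 10 * np.log10(
    np.mean(reverberant ** 2) / np.mean(scaled_noise ** 2)
)
```

Because `rir` here is a single measured response, the result carries the source radiation pattern and microphone directivity of that one measurement, which is the limitation the abstract describes.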