In this paper, we propose and evaluate a novel modular neural network model for sound event localization and detection (SELD). Each SELD module extracts arrival time differences and detects sound events from phase and magnitude spectrograms. The assembled neural network model consists of nC2 = n(n-1)/2 SELD modules, where n is the number of channels in the microphone array. The modular structure makes the proposed model easier to train and reduces the amount of training data required. It also makes the model scalable to changes in the shape of the microphone array.
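
As a rough illustration of how the module count grows with array size, the sketch below enumerates unordered channel pairs and counts them. It assumes that each SELD module is associated with one pair of channels, which the nC2 count suggests but the text does not state explicitly. The function names `channel_pairs` and `module_count` are hypothetical and are not taken from the proposed model.

```python
from itertools import combinations
from math import comb


def channel_pairs(n_channels):
    """Return all unordered channel index pairs (i, j) with i < j.

    Assumption: one SELD module would be assigned to each pair.
    """
    return list(combinations(range(n_channels), 2))


def module_count(n_channels):
    """Number of modules for an n-channel array: nC2 = n(n-1)/2."""
    return comb(n_channels, 2)


if __name__ == "__main__":
    # Print the module count and the pair list for a few array sizes.
    for n in (2, 4, 8):
        print(n, module_count(n), channel_pairs(n))
```

Under this assumption, adding or removing a microphone changes only the set of pairs, and therefore the set of modules, rather than the internal structure of any single module.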