AI

Sound source separation algorithm using phase difference and angle distribution modeling near the target

Abstract

In this paper we present a novel two-microphone sound source separation algorithm, which selects the signal from the target direction while suppressing signals from other directions. In this algorithm, which is referred to as Power Angle Information Near Target (PAINT), we first calculate phase difference for each time-frequency bin. From the phase difference, the angle of a sound source is estimated. For each frame, we represent the source angle distribution near the expected target location as a mixture of a Gaussian and a uniform distributions and obtain binary masks using hypothesis testing. Continuous masks are calculated from the binary masks using the Channel Weighting (CW) technique, and processed speech is synthesized using IFFT and the OverLap-Add (OLA) method. We demonstrate that the algorithm described in this paper shows better speech recognition accuracy compared to conventional approaches and our previous approaches