Abstract
It has recently been shown that, for coherence-based speech enhancement methods, cross-spectral subtraction is an efficient technique for reducing correlated noise components. The zero-phase filtering criterion used by these methods is derived from the standard coherence function, modified to incorporate the noise cross-power spectrum between the input signals. However, these methods have had only partial success when speech is processed under harsh acoustic conditions or when the channels carry data gathered by closely spaced microphones. This paper proposes an alternative method that uses a phase-based filtering criterion, obtained by replacing the cross-power spectrum of the corrupted signals with its real part. A simplified noise power spectral density (PSD) estimator is applied to the estimated speech spectrum as an adaptive post-filter to reduce the cosine-shaped power spectrum of the residual noise to a minimum spectral floor. Using this adaptive post-filter, a soft-decision procedure is implemented to control the amount of noise suppression. Experimental results show an average improvement in segmental signal-to-noise ratio (SNR) of about 4 dB and 2 dB over the Zelinski and Zhang approaches, respectively.
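As a rough illustration of what replacing the cross-power spectrum with its real part can look like, the sketch below computes a coherence-style gain for two microphone channels. It is not the method proposed here: the function name `real_part_coherence_gain`, the first-order recursive smoothing with factor `alpha`, the normalization by the geometric mean of the auto-PSDs, and the fixed `floor` are all assumptions made for the example, and the adaptive post-filter and soft-decision stage are not reproduced.

```python
import numpy as np

def real_part_coherence_gain(X1, X2, alpha=0.8, floor=0.05):
    """Coherence-style gain from the real part of the cross-PSD.

    X1, X2: complex STFT arrays of shape (frames, bins), one per microphone.
    alpha:  smoothing factor for the recursive PSD estimates (assumed value).
    floor:  lower bound on the gain (assumed value, hypothetical parameter).
    """
    eps = 1e-12  # guards against division by zero in silent bins
    n_frames, n_bins = X1.shape
    p11 = np.zeros(n_bins)                 # auto-PSD of channel 1
    p22 = np.zeros(n_bins)                 # auto-PSD of channel 2
    p12 = np.zeros(n_bins, dtype=complex)  # cross-PSD between channels
    gains = np.empty((n_frames, n_bins))

    for t in range(n_frames):
        # First-order recursive averaging of the periodograms.
        p11 = alpha * p11 + (1 - alpha) * np.abs(X1[t]) ** 2
        p22 = alpha * p22 + (1 - alpha) * np.abs(X2[t]) ** 2
        p12 = alpha * p12 + (1 - alpha) * X1[t] * np.conj(X2[t])

        # Use only the real part of the cross-PSD, so components that are
        # out of phase between the channels lower the gain.
        g = np.real(p12) / (np.sqrt(p11 * p22) + eps)
        gains[t] = np.clip(g, floor, 1.0)

    return gains
```

In this sketch, the enhanced spectrum would be obtained by multiplying the gain with, for example, the average of the two channel spectra, `gains * 0.5 * (X1 + X2)`. Identical channels give a gain close to 1, while channels in phase opposition are pushed down to the floor.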