Speech Separation using Nonnegative Features and Sparse Nonnegative Matrix Factorization 
Mikkel N. Schmidt

Abstract  This paper describes a method for separating two speakers in a single-channel recording. The separation is performed in a low-dimensional feature space optimized to represent speech. For each speaker, an overcomplete basis is estimated using sparse nonnegative matrix factorization, and a mixture is separated by mapping it onto the joint bases of the two speakers. The method is evaluated in terms of word recognition rate on the speech separation challenge data set. 
Type  Technical report 
Year  2007 
Publisher  Informatics and Mathematical Modelling, Technical University of Denmark, DTU 
Address  Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby 
Electronic version(s)  [pdf] 
BibTeX data  [bibtex] 
IMM Group(s)  Intelligent Signal Processing 
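
The separation scheme outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the report's implementation: the multiplicative updates with an L1 penalty on the activations, the unit-norm rescaling of the basis, the function names (`sparse_nmf`, `activations`, `separate`), and all parameter values are assumptions chosen for illustration, and random nonnegative matrices stand in for the speech features.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero


def sparse_nmf(V, n_basis, sparsity=0.1, n_iter=200, seed=0):
    """Approximate V by W @ H with W, H >= 0 and an L1 penalty on H.

    Assumed variant: Euclidean-cost multiplicative updates; the report's
    exact algorithm may differ.
    """
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    W = rng.random((n_features, n_basis)) + EPS
    H = rng.random((n_basis, n_frames)) + EPS
    for _ in range(n_iter):
        # The sparsity weight enters the denominator of the H update.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
        # Rescale basis columns to unit norm and compensate in H,
        # so the product W @ H is unchanged by the rescaling.
        norms = np.linalg.norm(W, axis=0) + EPS
        W /= norms
        H *= norms[:, None]
    return W, H


def activations(V, W, sparsity=0.1, n_iter=200, seed=0):
    """Fit nonnegative activations H for V while the basis W stays fixed."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + EPS)
    return H


def separate(V_mix, W1, W2, **kwargs):
    """Map the mixture onto the joint basis [W1, W2] and split the result."""
    W_joint = np.hstack([W1, W2])
    H = activations(V_mix, W_joint, **kwargs)
    k = W1.shape[1]
    return W1 @ H[:k], W2 @ H[k:]


# Toy data: 8 feature dimensions, 30 frames per speaker (placeholder values).
rng = np.random.default_rng(1)
V1 = rng.random((8, 30))
V2 = rng.random((8, 30))

# Overcomplete bases: 12 basis vectors for 8-dimensional features.
W1, _ = sparse_nmf(V1, n_basis=12)
W2, _ = sparse_nmf(V2, n_basis=12)

# Separate a synthetic additive mixture of the two feature matrices.
est1, est2 = separate(V1 + V2, W1, W2)
```

In this sketch each speaker's basis is learned separately from that speaker's own data, and only the activations are fitted at separation time; the two estimates are the parts of the mixture reconstruction attributed to each speaker's basis.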