Speech Separation using Non-negative Features and Sparse Non-negative Matrix Factorization

Mikkel N. Schmidt
Abstract	This paper describes a method for separating two speakers in a single-channel recording. The separation is performed in a low-dimensional feature space optimized to represent speech. For each speaker, an overcomplete basis is estimated using sparse non-negative matrix factorization, and a mixture is separated by mapping it onto the joint bases of the two speakers. The method is evaluated in terms of word recognition rate on the speech separation challenge data set.
Type	Technical report
Year	2007
Publisher	Informatics and Mathematical Modelling, Technical University of Denmark, DTU
Address	Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby
Electronic version(s)	[pdf]
BibTeX data	[bibtex]
IMM Group(s)	Intelligent Signal Processing
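
Illustrative sketch (not taken from the report): the abstract's two steps, learning a sparse non-negative basis per speaker and then mapping a mixture onto the joint bases, might look roughly like the following. This record does not give the report's feature space, cost function, or update rules, so everything here is an assumption: the sketch uses a Euclidean cost with an L1 penalty on the activations, multiplicative updates with a column-normalization heuristic, and the hypothetical names `sparse_nmf` and `separate`.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero


def sparse_nmf(V, n_basis, sparsity=0.1, n_iter=200, W=None, seed=0):
    """Approximate V (features x frames) by W @ H with W, H >= 0.

    Assumed cost: squared error plus sparsity * sum(H).
    If W is given it is kept fixed and only H is estimated
    (n_basis is then ignored).
    """
    rng = np.random.default_rng(seed)
    n_feat, n_frames = V.shape
    fixed_W = W is not None
    if not fixed_W:
        W = rng.random((n_feat, n_basis)) + EPS
        W /= np.linalg.norm(W, axis=0, keepdims=True)
    H = rng.random((W.shape[1], n_frames)) + EPS
    for _ in range(n_iter):
        # Multiplicative update for H; the sparsity term shrinks activations.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + EPS)
        if not fixed_W:
            # Standard multiplicative update for W, then rescale the
            # columns so the penalty cannot be dodged by inflating W
            # (a heuristic, not a guaranteed descent step).
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
            W /= np.linalg.norm(W, axis=0, keepdims=True) + EPS
    return W, H


def separate(V_mix, W1, W2, sparsity=0.1, n_iter=200):
    """Map a mixture onto the joint basis [W1 W2] and split the result."""
    W = np.hstack([W1, W2])
    _, H = sparse_nmf(V_mix, W.shape[1], sparsity, n_iter, W=W)
    k1 = W1.shape[1]
    # Each speaker's estimate uses only that speaker's basis and activations.
    return W1 @ H[:k1], W2 @ H[k1:]
```

In use, `sparse_nmf` would be run once per speaker on non-negative training features with `n_basis` larger than the feature dimension (an overcomplete basis), and `separate` would then be applied to the features of the mixed recording.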