Feature Space Reconstruction for Single-Channel Speech Separation

Mikkel N. Schmidt, Rasmus K. Olsson
Abstract	In this work we address the problem of separating multiple speakers from a single microphone recording. We formulate a linear regression model for estimating each speaker based on features derived from the mixture. The employed feature representation is a sparse, non-negative encoding of the speech mixture in terms of pre-learned speaker-dependent dictionaries. Previous work has shown that this feature representation by itself provides some degree of separation. We show that the performance is significantly improved when regression analysis is performed on the sparse, non-negative features.
Type	Technical report
Year	2007
Electronic version(s)	[pdf]
BibTeX data	[bibtex]
IMM Group(s)	Intelligent Signal Processing
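
The two-stage idea in the abstract (sparse, non-negative encoding of the mixture over speaker-dependent dictionaries, followed by linear regression on the resulting features) can be illustrated with a toy sketch. This is not the report's implementation: the random dictionaries stand in for pre-learned ones, the multiplicative-update encoder and plain least-squares fit are assumed choices, the data are synthetic, and all function names are hypothetical. The regression is fitted and evaluated on the same frames, so the comparison shows only the mechanics, not the reported performance.

```python
import numpy as np


def sparse_nonneg_encode(X, D, sparsity=0.1, n_iter=200, eps=1e-9):
    """Encode non-negative X as D @ H with H >= 0 (D is kept fixed).

    Assumed encoder: multiplicative updates for squared error plus an
    L1 penalty on H. The report may use a different algorithm.
    """
    rng = np.random.default_rng(0)
    H = rng.random((D.shape[1], X.shape[1]))
    DtX = D.T @ X
    DtD = D.T @ D
    for _ in range(n_iter):
        # Non-negative numerator, strictly positive denominator: H stays >= 0.
        H *= DtX / (DtD @ H + sparsity + eps)
    return H


def fit_linear_map(H, S):
    """Least-squares W such that W @ H approximates S."""
    Wt, *_ = np.linalg.lstsq(H.T, S.T, rcond=None)
    return Wt.T


def demo(seed=0):
    rng = np.random.default_rng(seed)
    n_freq, n_atoms, n_frames = 20, 8, 300

    # Hypothetical stand-ins for pre-learned speaker-dependent dictionaries.
    D1 = rng.random((n_freq, n_atoms))
    D2 = rng.random((n_freq, n_atoms))

    # Synthetic sparse activations and the resulting "speaker" spectrograms.
    A1 = rng.random((n_atoms, n_frames)) * (rng.random((n_atoms, n_frames)) < 0.3)
    A2 = rng.random((n_atoms, n_frames)) * (rng.random((n_atoms, n_frames)) < 0.3)
    S1, S2 = D1 @ A1, D2 @ A2
    X = S1 + S2  # single-channel mixture

    # Stage 1: sparse, non-negative features of the mixture.
    D = np.hstack([D1, D2])
    H = sparse_nonneg_encode(X, D)

    # Baseline: reconstruct speaker 1 directly from its own dictionary part.
    S1_dict = D1 @ H[:n_atoms]

    # Stage 2: linear regression from all features to speaker 1.
    W1 = fit_linear_map(H, S1)
    S1_reg = W1 @ H

    err_dict = np.linalg.norm(S1 - S1_dict)
    err_reg = np.linalg.norm(S1 - S1_reg)
    return H, err_dict, err_reg


if __name__ == "__main__":
    H, err_dict, err_reg = demo()
    print("dictionary-only error:", err_dict)
    print("regression error:     ", err_reg)
```

Because the dictionary-only reconstruction is itself a linear function of the features, the least-squares fit cannot do worse than it on the frames used for fitting; a fair comparison would evaluate on held-out data.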