Speech Reconstruction from Binary Masked Spectrograms Using Vector Quantized Speaker Models |
|
Abstract | Several source separation techniques use binary masking on spectrograms to separate two or more speakers from each other. In this thesis, the possibilities for obtaining the best-quality signal, reconstructed from masked spectrograms through vector quantized models of speakers, are investigated. The advantages and disadvantages of such an approach are examined. Additionally, the task of signal reestimation from a spectrogram is investigated using several algorithms.
Vector quantization of speakers can be used to improve on binary masked spectrograms, but the approach is not shown to produce high-quality speech. It is also concluded that phase information is very important for high-quality speech reconstruction, and parameters for optimal phase reestimation are suggested. |
Keywords | |
Type | Master's thesis [Academic thesis] |
Year | 2006 |
Publisher | Informatics and Mathematical Modelling, Technical University of Denmark, DTU |
Address | Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby |
Series | IMM-Thesis-2006-68 |
Note | Supervised by Lars Kai Hansen, IMM. |
Electronic version(s) | [pdf] [ps] |
BibTeX data | [bibtex] |
IMM Group(s) | Intelligent Signal Processing |