Speaker Recognition

Ling Feng

Abstract: The work leading to this thesis focused on establishing a text-independent, closed-set speaker recognition system. Unlike other recognition systems, this one was built in two parts to improve recognition accuracy. The first part is speaker pruning, performed by the KNN algorithm. To reduce gender misclassification in KNN, a novel technique was used in which pitch and MFCC features were combined. This technique not only reduces gender misclassification but also improves the overall performance of the pruning. The second part is DDHMM speaker recognition, performed on the speakers that survive pruning. Adding the speaker pruning part increased the system's recognition accuracy by 9.3%.

During the project period, an English Language Speech Database for Speaker Recognition (ELSDSR) was built. The system was trained and tested on both the TIMIT and ELSDSR databases.
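
The pruning stage described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`combine_features`, `prune_speakers`), the pitch weighting, the values of `k` and `keep`, and the synthetic data are all assumptions made for illustration. Each test frame votes for the speakers of its k nearest training frames, and only the top-ranked speakers would be passed on to the second (DDHMM) stage, which is not shown here.

```python
import numpy as np


def combine_features(mfcc, pitch, pitch_weight=1.0):
    """Append one (weighted) pitch value to each frame's MFCC vector.

    The weighting scheme is an assumption, not taken from the thesis.
    """
    return np.hstack([mfcc, pitch_weight * pitch[:, None]])


def prune_speakers(train_feats, train_labels, test_feats, k=3, keep=2):
    """Return the `keep` speaker ids that collect the most KNN votes."""
    # Start every known speaker at zero so the ranking covers all of them.
    votes = {spk: 0 for spk in set(train_labels)}
    for frame in test_feats:
        # Euclidean distance from this test frame to every training frame.
        dists = np.linalg.norm(train_feats - frame, axis=1)
        for idx in np.argsort(dists)[:k]:
            votes[train_labels[idx]] += 1
    # Most votes first; ties are broken by speaker id.
    ranked = sorted(votes, key=lambda spk: (-votes[spk], spk))
    return ranked[:keep]


# Synthetic stand-in data: three hypothetical speakers, each with a
# 3-dimensional "MFCC" centre and a scalar "pitch" centre.
rng = np.random.default_rng(0)
centers = {
    0: ([0.0, 0.0, 0.0], 1.0),
    1: ([5.0, 5.0, 5.0], 2.0),
    2: ([10.0, 10.0, 10.0], 1.1),
}


def fake_frames(spk, n):
    """Draw n noisy feature frames around one speaker's centre."""
    mfcc_center, pitch_center = centers[spk]
    mfcc = rng.normal(mfcc_center, 0.1, size=(n, 3))
    pitch = rng.normal(pitch_center, 0.05, size=n)
    return combine_features(mfcc, pitch)


train_feats = np.vstack([fake_frames(spk, 20) for spk in centers])
train_labels = [spk for spk in centers for _ in range(20)]
test_feats = fake_frames(1, 5)  # unseen frames from speaker 1

survivors = prune_speakers(train_feats, train_labels, test_feats)
print(survivors)
```

In this toy setting the true speaker should rank first among the survivors. With real speech the features overlap far more, which is why a second, more discriminative stage on the surviving speakers is useful.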
Keywords: feature extraction, MFCC, KNN, speaker pruning, DDHMM, speaker recognition, ELSDSR
Type: Master's thesis [Academic thesis]
Year: 2004
Publisher: Informatics and Mathematical Modelling, Technical University of Denmark, DTU
Address: Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby
Series: IMM-Thesis-2004-73
Note: Supervised by Prof. Lars Kai Hansen
IMM Group(s): Intelligent Signal Processing