Probabilistic Speech Detection

Daniel J. Jacobsen

Abstract: This thesis deals with the detection of speech in signals that may contain very different noise types, referred to as the 'Voice Activity Detection' (VAD) problem. The signals consist of sections of noise only and sections in which speech and noise form an additive mixture; convolutive mixtures are not addressed. Two different probabilistic methods are developed to solve the VAD problem. One is a discriminant-function-based method in which a linear network with a single logistic output is trained to output the probability of speech presence for a given sound signal. The other is based on modelling class-conditional probability densities using Independent Component Analysis (ICA) methods. The algorithms are tested extensively and compared with each other. They are also compared with an industry-standard VAD algorithm, that of the ITU-T G.729B recommendation, and with one other VAD. The results show that considering the type of noise present with the speech is crucial for robust speech detection and that, for certain noise types, performance can be improved with the developed VAD algorithms.
Keywords: machine learning, classification, voice activity detection, linear networks, independent component analysis, receiver operating characteristics
Type: Master's thesis [Academic thesis]
Year: 2003
Publisher: Informatics and Mathematical Modelling, Technical University of Denmark, DTU
Address: Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby
Series: IMM-Thesis-2003-50
Note: Supervisor: Jan Larsen
IMM Group(s): Intelligent Signal Processing