Ling Feng, Andreas Brinch Nielsen, Lars Kai Hansen

Abstract: This paper explores the vocal and non-vocal music classification
problem within popular songs. A newly built labeled
database covering 147 popular songs is announced. It is designed
for classifying signals from 1 s time windows. Features
are selected for this particular task, in order to capture
both the temporal correlations and the dependencies among
the feature dimensions. We systematically study the performance
of a set of classifiers, including linear regression,
generalized linear model, Gaussian mixture model, reduced
kernel orthonormalized partial least squares and K-means
in a cross-validated training and test setup. The database is
divided in two different ways: with/without artist overlap
between training and test sets, so as to study the so-called
‘artist effect’. The performance and results are analyzed in
depth: from error rates to sample-to-sample error correlation.
A voting scheme is proposed to enhance the performance
under certain conditions.
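
The two database splits and the voting idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the voting rule or the split proportions, so a plain majority vote across classifiers and a 20% test fraction are assumed here, and the names `split_by_artist` and `majority_vote` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def split_by_artist(songs, test_fraction=0.2, seed=0):
    """Split (song_id, artist) pairs so that no artist is in both sets.

    This corresponds to the 'without artist overlap' setup. The test
    fraction and the seed are illustrative assumptions.
    """
    by_artist = defaultdict(list)
    for song_id, artist in songs:
        by_artist[artist].append(song_id)

    # Shuffle artists (not songs) so that whole artists move together.
    artists = sorted(by_artist)
    random.Random(seed).shuffle(artists)

    n_test = max(1, round(len(artists) * test_fraction))
    test_artists = set(artists[:n_test])
    train = [s for a in artists[n_test:] for s in by_artist[a]]
    test = [s for a in artists[:n_test] for s in by_artist[a]]
    return train, test, test_artists


def majority_vote(predictions):
    """Combine label sequences from several classifiers, window by window.

    `predictions` holds one equally long label sequence per classifier.
    With an odd number of classifiers and binary labels there are no ties.
    """
    return [Counter(window).most_common(1)[0][0] for window in zip(*predictions)]
```

For the "with artist overlap" setup, one would instead shuffle the songs themselves before splitting, which allows the same artist to appear in both training and test sets.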
Keywords: Music retrieval, vocal segment classification, pop music database
Type: Conference paper [With referee]
IMM Group(s): Intelligent Signal Processing