Exploratory Datamining in Music

Bjørn Sand Jensen

Abstract: This thesis deals with methods and techniques for music exploration, focussing mainly on the task of music retrieval. This task has become increasingly important now that music is distributed efficiently through channels such as the Internet. It calls for automatic music retrieval and general machine learning methods that provide organization and navigation capabilities.

This Master's thesis investigates and compares traditional similarity measures for audio retrieval based on density models, namely the Kullback-Leibler divergence, the Earth Mover's Distance, the Cross-Likelihood Ratio and some variations of these. The methods are evaluated on a custom data set, represented by Mel-Frequency Cepstral Coefficients and a pitch estimate. With optimal model complexity and structure, the Cross-Likelihood Ratio obtains a maximum retrieval rate of ≈74-75% in song retrieval and ≈66% in clip retrieval.
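As an illustration of a density-model similarity measure of the kind compared above, the following is a minimal sketch of a symmetrized Kullback-Leibler divergence between two clips. It assumes each clip is summarized by a single full-covariance Gaussian fitted to its feature frames (for example MFCC vectors); the thesis itself may use richer density models, and the function names `gaussian_kl` and `symmetric_kl` are hypothetical, chosen here for illustration only.

```python
import numpy as np


def gaussian_kl(mu_p, cov_p, mu_q, cov_q):
    """Closed-form KL(p || q) between two multivariate Gaussians."""
    d = mu_p.shape[0]
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mu_q - mu_p
    # slogdet returns (sign, log|det|); the log-determinant avoids overflow.
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (
        np.trace(cov_q_inv @ cov_p)
        + diff @ cov_q_inv @ diff
        - d
        + logdet_q
        - logdet_p
    )


def symmetric_kl(frames_a, frames_b):
    """Symmetrized KL between Gaussians fitted to two (n_frames, n_dims) arrays.

    Assumption: one Gaussian per clip, which is a simplification made
    for this sketch rather than the model used in the thesis.
    """
    mu_a, cov_a = frames_a.mean(axis=0), np.cov(frames_a, rowvar=False)
    mu_b, cov_b = frames_b.mean(axis=0), np.cov(frames_b, rowvar=False)
    return gaussian_kl(mu_a, cov_a, mu_b, cov_b) + gaussian_kl(mu_b, cov_b, mu_a, cov_a)
```

A smaller value indicates more similar clips under this model. For retrieval, the candidates in the collection would be ranked by increasing divergence from the query.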

An alternative method for music exploration and similarity is introduced, based on a local perspective, adaptive metrics and the objective of retaining the topology of the original feature space for explorative tasks. The method is defined on the basis of Information Geometry and Riemannian metrics. Three metrics (or distance functions) are investigated: an unsupervised locally weighted covariance-based metric, an unsupervised log-likelihood-based metric and a supervised metric formulated in terms of the Fisher Information Matrix. The Fisher Information Matrix is reformulated to capture the change in conditional probability of pre-defined auxiliary information given a distance vector in feature space. The metrics are mainly evaluated in simple clustering applications and finally applied to the music similarity task, providing initial results for such adaptive metrics. The results obtained for the supervised metric (max ≈69%) are in general superior to or comparable with those of the traditional similarity measures on the clip level, depending on the model complexity.
Keywords: Music Similarity & Retrieval, Audio Features, Clustering, Classification, Learning Metric, Information Geometry, Fisher Information Matrix, Supervised Gaussian Mixture Model
Type: Master's thesis [Academic thesis]
Publisher: Informatics and Mathematical Modelling, Technical University of Denmark, DTU
Address: Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby
Note: Supervised by Lars Kai Hansen, with co-supervisor Tue Lehn-Schiøler, IMM.
IMM Group(s): Intelligent Signal Processing
