Shift Invariant Sparse Coding of Image and Music Data



Abstract: When analyzing multimedia data such as images and music, it is useful to extract higher-level features that constitute prominent signatures of the data. We demonstrate how a 2D shift-invariant sparse coding model is capable of extracting such higher-level features, forming so-called icon alphabets for the data. For image data the model is able to find high-level prominent features, while for music the model is able to extract the harmonic structure of instruments as well as indicate the scores they play. We further demonstrate that non-negativity constraints are useful since they favor a part-based representation. The success of the model relies on finding a good value for the degree of sparsity. For this, we propose an 'L-curve'-like argument and use the sparsity parameter that maximizes the curvature in the graph of the residual sum of squares plotted against the number of non-zero elements of the sparse code. A Matlab implementation of the algorithm is available for download.
Keywords: sparse coding, part based representation, L-curve, shift invariance
Type: Technical report
Year: 2008
Electronic version(s): [pdf]
BibTeX data: [bibtex]
IMM Group(s): Intelligent Signal Processing
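
The 'L-curve'-like selection rule mentioned in the abstract can be illustrated with a small sketch. It is not the report's Matlab implementation. It assumes that one (number of non-zero elements, residual sum of squares) pair has already been computed for each candidate sparsity parameter, rescales both axes to the unit interval (an assumption made here so that the two quantities are comparable), and estimates the curvature with central finite differences. The function name max_curvature_index and the example numbers are hypothetical and chosen only for illustration.

```python
def max_curvature_index(nnz, rss):
    """Return the index of the interior point with the largest absolute
    discrete curvature on the curve (nnz[i], rss[i]).

    Assumptions of this sketch (not taken from the report): both axes are
    rescaled linearly to [0, 1], and the absolute value of the curvature
    is used so the result does not depend on the ordering of the points.
    """
    if len(nnz) != len(rss) or len(nnz) < 3:
        raise ValueError("need two equally long sequences with at least 3 points")

    def unit(values):
        # Rescale a sequence linearly to the interval [0, 1].
        lo, hi = min(values), max(values)
        if hi == lo:
            raise ValueError("sequence is constant; curvature is undefined")
        return [(v - lo) / (hi - lo) for v in values]

    x, y = unit(nnz), unit(rss)
    best_index, best_kappa = None, -1.0
    for i in range(1, len(x) - 1):
        # Central differences for first and second derivatives
        # with respect to the point index.
        dx = (x[i + 1] - x[i - 1]) / 2.0
        dy = (y[i + 1] - y[i - 1]) / 2.0
        ddx = x[i + 1] - 2.0 * x[i] + x[i - 1]
        ddy = y[i + 1] - 2.0 * y[i] + y[i - 1]
        speed = (dx * dx + dy * dy) ** 1.5
        if speed == 0.0:
            continue  # degenerate point, skip it
        kappa = abs(dx * ddy - dy * ddx) / speed
        if kappa > best_kappa:
            best_index, best_kappa = i, kappa
    if best_index is None:
        raise ValueError("curvature could not be evaluated at any point")
    return best_index


# Made-up illustration data, one entry per candidate sparsity parameter.
sparsity_params = [0.1, 0.2, 0.4, 0.8, 1.6]
nonzero_counts = [400, 300, 200, 100, 0]
residuals = [1.0, 1.0, 1.0, 3.0, 5.0]

corner = max_curvature_index(nonzero_counts, residuals)
chosen_param = sparsity_params[corner]
```

In this toy example the residual stays flat until the code becomes too sparse and then rises, so the corner of the curve marks the sparsest code that still fits the data well. Whether the axes should be linear, rescaled, or logarithmic is a design choice that this sketch does not settle.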