@CONFERENCE{IMM2006-05611,
    author       = "L. Feng and L. K. Hansen",
    title        = "Cognitive Components of Speech at Different Time Scales",
    year         = "2006",
    month        = dec,
    keywords     = "Feature stacking, Unsupervised learning, Supervised learning, Mixture of factor analyzers",
    booktitle    = "{NIPS} Workshop: Advances in Acoustic Models",
    volume       = "",
    series       = "",
    editor       = "",
    publisher    = "",
    organization = "",
    address      = "",
    url          = "http://www2.compute.dtu.dk/pubdb/pubs/5611-full.html",
    abstract     = "We discuss the cognitive components of speech at different time scales. We investigate cognitive features of speech, including phoneme, gender, height, and speaker identity. Integration is performed by feature stacking based on short-time MFCCs. Our hypothesis is basically ecological: we assume that features that are essentially independent in a reasonable ensemble can be efficiently coded using a sparse independent component representation. This means that supervised and unsupervised learning should result in similar representations. We do indeed find that supervised and unsupervised learning of a model based on identical representations have closely corresponding abilities as classifiers."
}