Knowledge discovery in neuroinformatics: coordinate-based meta-analytic search of neuroscientific literature and its expansion using semantic keyword extraction

Bartlomiej Wilkowski, Marcin Marek Szewczyk
Abstract	The growing number of increasingly sophisticated functional neuroimaging studies of human brain activity has changed our view of human brain function. It has also created demand for new tools and services that integrate research findings, widen the exchange of information between laboratories in the same research area, and make searching for related articles, reviews and other literature more efficient. Integrating multiple studies in a meta-analysis yields a more complete picture. Since functional localization in the brain is normally represented as stereotaxic coordinates, these coordinates can be used directly to retrieve related literature in a given functional context, using coordinate distance as the measure of relatedness. We are developing BredeQuery, a plugin for the SPM pipeline that offers a direct link to the Brede Database. The Brede Database and the BrainMap database are two of the very few databases that provide coordinate-based search of the neuroscientific literature. Nevertheless, these databases contain a relatively small number of publications compared with large, comprehensive and frequently updated databases such as PubMed. We aim to integrate these two kinds of databases by implementing SKEEPMED (Semantic KEyword Extraction Pipeline for MEdical Documents). The text bodies of publications retrieved via coordinate-based search are mapped to a medical ontology (currently we use the MetaMap software to obtain mappings to UMLS concepts), so that the main context of each publication can be obtained. SKEEPMED, a modular pipeline implemented in Python, is then able to select relevant keywords and construct the final PubMed query. Since MetaMap is highly configurable and has many parameters (semantic types, sources), we are now working on a metaheuristic approach for finding the best MetaMap configuration for our application.
Furthermore, in the near future we plan to test a hybrid approach to keyword extraction (an ontological approach combined with statistical/machine learning tools). During our talk we will present recent results of our research and discuss further work and future plans concerning the above-mentioned projects.
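The two steps described in the abstract, retrieval by coordinate distance and construction of a PubMed query from selected keywords, can be illustrated with a minimal Python sketch. It is not the Brede Database search or the SKEEPMED implementation: the toy records, the function names, the fixed search radius and the simple AND-joined query are all assumptions made for illustration, and the ontology mapping step (MetaMap/UMLS) is omitted entirely.

```python
import math

# Hypothetical toy records: paper id -> reported stereotaxic coordinates (x, y, z) in mm.
PAPERS = {
    "paper_a": [(-42.0, 20.0, 8.0), (40.0, 22.0, 6.0)],
    "paper_b": [(0.0, -60.0, 30.0)],
    "paper_c": [(-40.0, 18.0, 10.0)],
}


def nearest_papers(query, papers, radius_mm):
    """Return (paper_id, distance) pairs for papers whose closest reported
    coordinate lies within radius_mm of the query, sorted by distance."""
    hits = []
    for paper_id, coords in papers.items():
        # Euclidean distance from the query to the paper's closest coordinate.
        best = min(math.dist(query, c) for c in coords)
        if best <= radius_mm:
            hits.append((paper_id, best))
    return sorted(hits, key=lambda hit: hit[1])


def build_pubmed_query(keywords):
    """Join already-selected keywords into a Boolean query string,
    quoting each keyword so multi-word terms are searched as phrases."""
    return " AND ".join(f'"{keyword}"' for keyword in keywords)


if __name__ == "__main__":
    for paper_id, distance in nearest_papers((-42.0, 20.0, 8.0), PAPERS, radius_mm=5.0):
        print(paper_id, round(distance, 2))
    print(build_pubmed_query(["working memory", "prefrontal cortex"]))
```

In this sketch a paper is represented by its single closest coordinate; a real system would more likely weigh all reported coordinates and select keywords from the ontology mappings rather than take them as given.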
Type	Misc [Presentation]
Year	2009 Month June
Note	Seminar talk at the National Institutes of Health
Electronic version(s)	[pdf]
BibTeX data	[bibtex]
IMM Group(s)	Intelligent Signal Processing