Abstract
With gene set enrichment analysis, researchers aim to reduce the complexity of their gene-based biological datasets and obtain more easily interpretable findings about the functionally relevant differences between experimental conditions. Many methods exist to assess the enrichment of gene sets and to rank a collection of gene sets, but they all depend on the coherence of those gene sets in the first place. In general, gene sets synthesize knowledge from different biological or experimental conditions (tissues, diseases, phenotypes), so only a subset of the genes within a gene set may be relevant to one specific experimental condition or research question. We have developed a literature gene set mining tool that composes a gene set from genes relevant to the specific conditions and research question at hand: the user selects a corpus of documents, and the gene set is established from that corpus through text mining. The enrichment of that specific gene set can then be analyzed. Furthermore, we include an analysis for historic auditing of the gene set. Historic auditing shows researchers when, over time, a gene set became enriched (at a predefined threshold) in the research niche of their interest, and thereby indicates the novelty of their latest experimental results. We present a specific example: metastasis-related genes for neuroblastoma. Neuroblastoma is a pediatric cancer with a heavy metastasis burden in high-risk patients. However, the type of metastasis is very specific to neuroblastoma and cannot be directly compared to that of adult metastasized cancers. We show the workflow of mining the neuroblastoma-related gene set of metastasis-relevant genes and analyze its enrichment in neuroblastoma experimental data. As a comparison, we then run a similar analysis on metastatic samples from breast cancer to illustrate the added value of research-specific gene set enrichment analysis.
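The idea behind historic auditing can be illustrated with a minimal sketch. It is not the tool's actual implementation. It assumes that each mined gene carries the year of its first mention in the selected corpus, that enrichment is scored with a one-sided hypergeometric test, and that the predefined threshold is a p-value cutoff. The function names, gene identifiers, and numbers below are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from a universe of N that contains K set members."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def first_enriched_year(first_mention, hits, universe_size, alpha=0.05):
    """Return (year, p_value) for the earliest year at which the literature
    gene set, as it stood in that year, is enriched among the experimental hits.

    first_mention: dict mapping gene -> year of first mention in the corpus (assumed input)
    hits: genes flagged by the experiment, e.g. differentially expressed genes
    universe_size: number of genes in the background universe
    alpha: the predefined enrichment threshold, here a p-value cutoff (assumption)
    """
    hits = set(hits)
    for year in sorted(set(first_mention.values())):
        # The gene set grows over time: keep every gene mentioned up to this year.
        gene_set = {g for g, y in first_mention.items() if y <= year}
        overlap = len(gene_set & hits)
        p = hypergeom_sf(overlap, universe_size, len(gene_set), len(hits))
        if p <= alpha:
            return year, p
    return None  # never enriched at this threshold


# Toy data with made-up identifiers and years.
toy_first_mention = {"GENE_A": 2001, "GENE_B": 2005, "GENE_C": 2010, "GENE_D": 2012}
toy_hits = {"GENE_A", "GENE_B", "GENE_C", "GENE_X"}
print(first_enriched_year(toy_first_mention, toy_hits, universe_size=100, alpha=0.01))
```

In this sketch, a gene set that first crosses the threshold only in recent years would suggest that the experimental result could not have been anticipated from the earlier literature in that niche.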
C.V.N. is funded by the Research Foundation - Flanders (FWO) through a postdoctoral fellowship at Ghent University.