Abstract
The Rosetta data set makes it possible to compare an experimental microarray data set with a reference profile from the compendium. However, interpreting this comparison gene by gene is a daunting task because of the sheer number of genes involved. We therefore propose a new strategy of modeling microarray data in terms of functional genomic units (FGUs). A functional genomic unit is a group of genes that together carry out a certain biological function. We explored the possibility of defining functional genomic units from the Gene Ontology (GO) annotation of the yeast genome. To visualize the tree structure of the GO, we have written a yeast genomic knowledge browser in Java and integrated it with the microarray data. A limitation of using the GO is that only a portion of the genes in the genome have a known or inferred function. We therefore further investigated an unsupervised learning method to identify functional genomic units in the yeast genome, applying Independent Component Analysis (ICA), an established analysis method from digital signal processing, to the Rosetta data set. To further validate the utility of the Rosetta compendium, we designed an experiment to investigate yeast cells transfected with human Rac1, a small GTPase protein of the Rho family, and demonstrated that functional genomic units helped us to corroborate our own microarray experiment with the Rosetta data set.