Abstract
Given a newly discovered gene from a particular genome and a database of sequences whose functions are known, it is very helpful to search the database and identify the sequences that are similar to the new one. By inference from these similar database sequences, the search results may help us understand the functional role, regulation, and expression of the new gene. This is the task of methods developed for biological database searching. In this paper we present a new application of the theories of linear predictive coding, vector quantization, and hidden Markov models to the problem of DNA sequence similarity search that requires no sequence alignment. The proposed approach has been tested on real DNA and genomic datasets and compared with some existing methods. The experimental results demonstrate its potential for this purpose.