Abstract
DNA motifs are short, recurring sequence patterns that are assumed to have a biological function. Most algorithms for discovering them are computationally prohibitive. In this paper, we extend recent work on discovering identical string motifs. In the first phase of our three-phase algorithm, we report all string motifs of all sizes. In the second phase, we filter out the motifs that fail to meet our constraints, and in the last phase we rank the remaining motifs using a combination of stochastic techniques and p-values. Our method outperforms other motif discovery algorithms, including well-known ones such as MEME and Weeder, on benchmark data suites.