Abstract
An encoding method has a direct effect on the quality and representation of the knowledge discovered by data mining systems. Biological macromolecules are encoded as strings of characters, called primary structures. Because data mining systems usually take relational tables as input, these strings must be re-encoded and transformed into relational tables. In this paper, we present a comparative study of the existing static encoding methods, which are based on biologists' know-how, and our new dynamic encoding method, which is based on the construction of Discriminant and Minimal Substrings (DMS). Several classification methods are used to carry out this study. The experimental results show that our dynamic encoding method is more efficient than the static ones for encoding biological macromolecules from a data mining perspective.
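To make the re-encoding step concrete, the following minimal sketch turns a few primary structures into a binary relational table, with one row per sequence and one column per substring. The sequences, the substring list, and the function name `encode_sequences` are hypothetical and chosen only for illustration. The sketch does not reproduce the construction of DMS, which the abstract does not detail; it only shows the table shape that a substring-based encoding might produce.

```python
def encode_sequences(sequences, substrings):
    """Build a binary table: cell is 1 if the substring occurs in the sequence."""
    return [
        [1 if sub in seq else 0 for sub in substrings]
        for seq in sequences
    ]


# Hypothetical primary structures (strings over an amino-acid alphabet).
sequences = ["MKVLA", "GAVLK", "MKGAA"]

# Hypothetical attribute substrings; NOT substrings produced by the DMS method.
substrings = ["MK", "VL", "GA"]

table = encode_sequences(sequences, substrings)

# Print the table with a header row, one line per sequence.
print("sequence", *substrings)
for seq, row in zip(sequences, table):
    print(seq, *row)
```

Each row of such a table can then be given a class label and passed to a standard classifier. How the columns are chosen is what separates one encoding method from another.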