Abstract
The adaptive immune system is one of the human body's defense mechanisms; through immunologic memory, it protects against repeated infection by the same pathogen. Vaccination exploits this mechanism to protect the body from infectious diseases. Some cells of the immune response cannot recognize antigen fragments unless those fragments are attached to Major Histocompatibility Complex (MHC) molecules. Therefore, predicting which peptides can bind to MHC molecules is a key step in vaccine design. MHC class II is a type of MHC molecule characterized by its ability to bind peptides of different lengths. Machine learning techniques can help classify peptides as binders or non-binders to MHC class II molecules. However, building a classification model involves several stages, each of which may influence its final decision. In this study, we design a robust MHC class II peptide classifier using neuro-fuzzy techniques. In particular, we optimize each of the stages involved: construction of training and testing datasets to eliminate bias, mapping of variable-length peptides into fixed-length feature vectors, selection of important features through several feature selection techniques, and choice of neuro-fuzzy classifier. The experimental results demonstrate the importance of this optimization for obtaining an objective evaluation and show how bias in the results of techniques such as cross-validation can cause wide variability of outcomes for the same data. This can explain the fluctuations in performance of several techniques and suggests a more robust strategy for objectively comparing different techniques.
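As an illustration of the mapping stage, the minimal sketch below turns peptides of any length into a 20-dimensional vector of amino acid frequencies. Amino acid composition is assumed here only as one simple fixed-length encoding; it is not necessarily the mapping used in this study. The function name `composition_features` and the example sequences are hypothetical.

```python
# The 20 standard amino acids, in a fixed order that defines the feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(peptide):
    """Map a peptide of any length to a 20-dimensional frequency vector.

    Assumption: amino acid composition is used purely as an example of a
    fixed-length encoding; other encodings are equally possible.
    """
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")

    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in peptide:
        if residue not in counts:
            raise ValueError(f"unknown residue: {residue!r}")
        counts[residue] += 1

    # Dividing by the length makes vectors comparable across peptide lengths.
    return [counts[aa] / len(peptide) for aa in AMINO_ACIDS]


# Hypothetical peptides of different lengths yield vectors of the same size.
short_vec = composition_features("AAK")
long_vec = composition_features("GILGFVFTLKAAYW")
```

Both vectors have the same dimensionality, so either can be passed to a feature selection step or a classifier that expects fixed-size input. Note that composition discards residue order, which is one reason richer encodings may be preferred in practice.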