Abstract
Butyrylcholinesterase (BChE) is an important drug target for the treatment of Alzheimer's disease (AD). Computational methods significantly reduce the overhead of screening BChE inhibitors. However, some of these methods use one-hot encoding, which ignores sequential information. In this study, Term Frequency-Inverse Document Frequency (TF-IDF) is used to encode SMILES expressions, and a Long Short-Term Memory (LSTM) network is used for classification in order to preserve sequential information. In addition to the LSTM, several other models were used to evaluate the discriminative power of TF-IDF and to show the significance of sequential information. The dataset used in this study consists of 4,515 records of BChE inhibitors and non-inhibitors in SMILES form. The predictions of the machine learning models were also tested through in vitro activity assays. A molecular docking study further confirmed the binding modes inside BChE. The LSTM model achieved a testing accuracy of 98.20% for the prediction of BChE inhibitors.
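
To make the encoding step concrete, the following is a minimal sketch of how TF-IDF weights could be computed for SMILES strings. It is an illustration under stated assumptions, not the pipeline used in this study: the tokenization into overlapping character bigrams, the smoothed IDF formula, the function names `tokenize` and `tfidf_encode`, and the three toy molecules (ethanol, acetic acid, benzene) are all assumptions introduced here, and none of them is taken from the dataset.

```python
import math
from collections import Counter


def tokenize(smiles, n=2):
    """Split a SMILES string into overlapping character n-grams (assumed tokenization)."""
    return [smiles[i:i + n] for i in range(len(smiles) - n + 1)]


def tfidf_encode(smiles_list, n=2):
    """Return (vocabulary, matrix) with one row of TF-IDF weights per molecule."""
    docs = [Counter(tokenize(s, n)) for s in smiles_list]
    vocab = sorted(set().union(*docs))
    n_docs = len(docs)
    # Document frequency: number of molecules that contain each n-gram.
    df = {t: sum(1 for d in docs if t in d) for t in vocab}
    # Smoothed IDF (an assumption): n-grams found in every molecule keep a nonzero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    matrix = []
    for d in docs:
        total = sum(d.values())
        # Term frequency is the n-gram count divided by the number of n-grams in the molecule.
        matrix.append([d[t] / total * idf[t] if total else 0.0 for t in vocab])
    return vocab, matrix


# Toy molecules for illustration only: ethanol, acetic acid, benzene.
vocab, matrix = tfidf_encode(["CCO", "CC(=O)O", "c1ccccc1"])
```

Each row of `matrix` is a fixed-length vector over the shared n-gram vocabulary, which is the form a downstream classifier would consume. The choice of bigrams keeps some local ordering of the SMILES characters; a different n-gram length or a chemistry-aware tokenizer would be an equally plausible choice.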