Abstract
In this paper, we analyze the ability of two different machine learning methods, Bayesian networks and convolutional neural networks (CNNs), to recognize Arabic handwritten words. Our contribution is threefold. First, we describe the main highlights of the dynamic Bayesian network (DBN) architecture, especially in comparison with standard Bayesian networks. To that end, structural features are extracted from the word image and used as input to different Bayesian network (BN) architectures: Naive Bayes (NB), Tree Augmented Naive Bayes (TAN), Forest Augmented Naive Bayes (FAN) and the Hidden Markov Model (HMM). Features are extracted relative to the word baseline, which is estimated mainly to cope with the problems of inclination and distortion. Decisions about word classification are then inferred using multiple BN models. Second, we model a deep learning architecture: a CNN that convolves learned features with the input data and uses 2D convolutional layers, which makes it well suited to processing 2D word images. Third, we compare the behavior of the DBN and the CNN and propose to combine them to exploit the advantages of both. Experiments are carried out on the standard IFN-ENIT database. The results show that the combination of the DBN and the CNN achieves an accuracy of 95.20%, which is high relative to the remaining models.