Abstract
• The research deals with responses written in Arabic.
• The research presents a new benchmark Arabic dataset that contains 610 answers.
• The system retrieves a model response from an already built database for a specific curriculum.
• Responses are translated into English to overcome the lack of NLP resources in Arabic.
• Different methods are presented for scaling the similarity values into the same range as the manual scores.
Most research on the automatic assessment of students’ free-text answers addresses English. This paper handles the assessment task in Arabic. It applies multiple similarity measures, both separately and in combination. Several techniques are introduced, relying on translation to overcome the lack of text processing resources in Arabic; these include extracting model answers automatically from an already built database and applying K-means clustering to scale the obtained similarity values. Additionally, this research presents the first benchmark Arabic dataset, which contains 610 students’ short answers together with their English translations.
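
The abstract does not specify how K-means is used to scale similarity values, so the following is only a minimal sketch of one plausible reading: similarity scores are clustered in one dimension, the clusters are ordered by centroid, and each cluster index is taken as a grade. The function names `kmeans_1d` and `scale_to_grades`, the integer grade range `0..max_grade`, and the evenly spaced centroid initialisation are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List


def kmeans_1d(values: List[float], k: int, iters: int = 100) -> List[float]:
    """Plain Lloyd's k-means on scalar values (k >= 2); returns sorted centroids."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic start, centroids spread evenly over the value range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        clusters: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def scale_to_grades(similarities: List[float], max_grade: int = 5) -> List[int]:
    """Map each similarity to a grade in 0..max_grade via its nearest centroid.

    Hypothetical helper: one cluster per grade, ordered by centroid value.
    """
    k = max_grade + 1
    centroids = kmeans_1d(similarities, k)
    return [
        min(range(k), key=lambda j: abs(s - centroids[j]))
        for s in similarities
    ]


if __name__ == "__main__":
    sims = [0.05, 0.1, 0.5, 0.55, 0.9, 0.95]
    print(scale_to_grades(sims, max_grade=2))
```

Under these assumptions the mapping is monotone: an answer with a higher similarity never receives a lower grade than one with a lower similarity. A simple linear rescaling of the similarity range would be an alternative; clustering instead lets the grade boundaries follow the distribution of the observed similarities.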