Abstract
Computer vision, combined with deep learning techniques, has served many human activity recognition applications, especially those related to safety and security. However, the safety of autistic children has received little attention from computer vision researchers, even though these children are often exposed to danger because of their behaviors. High-grade autistic children often enter a Meltdown Crisis state, in which their behavior turns aggressive and they become out of control. In this research, we present a monitoring system that predicts the Meltdown Crisis state early and alerts the children's parents or caregivers before the situation escalates. To this end, the proposed system is built on a CNN-LSTM structure that learns the spatial and temporal characteristics of the stereotypical movements that autistic children exhibit at the onset of the Meltdown Crisis state, known as the Pre-Meltdown Crisis state. The vision part of the proposed architecture is based on a VGG16 network pre-trained on the ImageNet dataset. The final choices regarding data preparation, model configuration, and training settings were tuned empirically, yielding a recall and F1 score of 98% and a best loss value of 0.034. For evaluation, we used the MeltdownCrisis dataset, which contains realistic scenarios of autistic children's activities in the Pre-Meltdown Crisis and Normal states; the Normal-state data served as the negative class.
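The recall and F1 score mentioned in the abstract treat the Pre-Meltdown Crisis state as the positive class and the Normal state as the negative class. A minimal sketch of how these two metrics can be computed from binary predictions is given here. The function name `recall_and_f1`, the label encoding (1 = Pre-Meltdown Crisis, 0 = Normal), and the toy labels are assumptions made for illustration; they are not code or data from this work.

```python
def recall_and_f1(y_true, y_pred, positive=1):
    """Return (recall, F1) for the positive class from two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # True positives: Pre-Meltdown clips correctly flagged.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: Normal clips wrongly flagged as Pre-Meltdown.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: Pre-Meltdown clips that were missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


# Hypothetical toy labels (assumed encoding: 1 = Pre-Meltdown Crisis, 0 = Normal).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

recall, f1 = recall_and_f1(y_true, y_pred)
print(f"recall={recall:.2f}, F1={f1:.2f}")
```

In a monitoring setting of this kind, recall is a natural metric to watch, since a false negative corresponds to a Pre-Meltdown episode for which no alert would be raised.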