Abstract
Automatically segmenting groups and individuals in crowd images can be important for surveillance purposes. In this paper, we present a simplified crowd detection and segmentation system. Crowd scene images can be divided into four types of regions: group, individual, crowd and row. Mask R-CNN with a modified backbone architecture is employed for crowd image segmentation. The proposed system is tested on the BIWI dataset and a self-generated dataset, achieving an accuracy of 83%. Furthermore, a crowd dataset* (CVML** crowd dataset) of nearly four thousand images is annotated for training the system.