Abstract
Data science augments manual data understanding with machine learning, with the potential to improve performance. This paper examines data science methodology as a way to enhance the application of machine learning to smartphone-based automatic human activity recognition (HAR). A modified feature engineering step and a novel post-learning data engineering step are proposed within the machine learning framework as an alternative to manual data understanding for effective HAR. The proposed framework is evaluated on two different HAR data sets, demonstrating that data-driven machine learning can achieve near-optimal classification of activities. The framework proved effective and efficient in comparison with existing methods. The modified feature engineering reduced the number of features required by a support vector machine by 42% while yielding 97.3% correct recognition of human physical activities. Adding post-learning data engineering further improved the model to 99% classification accuracy, which is near-optimal performance.