Abstract
Tourism is one of the biggest and most competitive industries in the world. Medical and wellness tourism are developing quickly as segments of tourism focused on health and wellness care, and social networking sites have played an important role in their development. Online reviews of tourism products on social networking sites are considered rich sources of information for tourists' decision making. Machine learning techniques have proved effective in analysing tourists' online reviews. For big datasets of such reviews, these techniques must be robust enough to accurately discover the hidden relationships among tourists' preferences. In addition, scalable machine learning techniques are needed to analyse big datasets on tourism platforms so that information on tourists' preferences for products can be provided in a timely manner. This paper investigates the effectiveness of a hybrid method that combines clustering, Higher-Order Singular Value Decomposition (HOSVD) and Classification and Regression Trees (CART) in analysing tourists' online reviews on TripAdvisor. We use HOSVD to find similarities among travellers in datasets with very large sets of hotel ratings. Then, we use CART to predict travellers' preferences on the quality dimensions of spa hotels on TripAdvisor. To evaluate the method, data were collected from travellers' online reviews of Malaysian spa hotels on TripAdvisor. The results show that our method outperforms methods that rely solely on prediction machine learning techniques. We demonstrate that combining clustering and prediction machine learning techniques with HOSVD is robust in analysing tourists' online reviews to discover tourists' preferences on social networking sites.
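As a rough illustration of the CART prediction step named above, the following minimal sketch grows a small regression tree by choosing, at each node, the binary split that minimises the squared error of the two children. It is not the implementation used in this paper: the function names (`sse`, `best_split`, `fit`, `predict`), the depth limit and the toy ratings are all hypothetical, and the clustering and HOSVD stages are omitted.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, targets):
    """Return the (feature, threshold) pair that lowers total SSE most, or None."""
    best, best_cost = None, sse(targets)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            cost = sse(left) + sse(right)
            if cost < best_cost - 1e-12:
                best, best_cost = (j, t), cost
    return best


def fit(rows, targets, depth=0, max_depth=2, min_samples=2):
    """Recursively grow a regression tree stored as nested dictionaries."""
    split = None
    if depth < max_depth and len(rows) >= min_samples:
        split = best_split(rows, targets)
    if split is None:
        # No useful split (or stopping rule reached): predict the node mean.
        return {"leaf": mean(targets)}
    j, t = split
    left = [(r, y) for r, y in zip(rows, targets) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": fit([r for r, _ in left], [y for _, y in left],
                    depth + 1, max_depth, min_samples),
        "right": fit([r for r, _ in right], [y for _, y in right],
                     depth + 1, max_depth, min_samples),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's mean target."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Toy, made-up data: each row holds two hypothetical quality-dimension ratings
# (1 to 5) for one traveller, and the target is that traveller's overall rating.
ratings = [[1, 3], [2, 5], [4, 1], [5, 4]]
overall = [1.0, 1.0, 5.0, 5.0]

tree = fit(ratings, overall)
low = predict(tree, [2, 2])
high = predict(tree, [5, 3])
```

In this toy data the overall rating tracks the first quality dimension only, so the tree splits on that dimension at the root. A practical analysis would more likely rely on an established library implementation and would apply the tree within each cluster of similar travellers rather than to the raw ratings.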