Abstract
Network traffic classification underpins many network management tasks, from security monitoring and Intrusion Detection Systems (IDS) to Quality of Service (QoS). Many approaches have been proposed on this theme, and most of them rely on predefined class labels provided by an expert. However, acquiring consistent, adequate and up-to-date ground truth for classifying network flows effectively remains a serious problem. This work addresses the issue with multi-view stacking, which fuses the metadata produced by heterogeneous semi-supervised classifiers to achieve accurate classification, in three steps. First, the original data is represented as multiple views using dimensionality reduction methods, with the aim of obtaining strong discrimination capability. Second, different semi-supervised learning algorithms are integrated from an ensemble learning perspective to deliver more stable and higher-quality classification output. Finally, N-fold cross validation on the metadata is used to train the meta-classifier, which produces the final class decision and predicts the unlabelled traffic data. Experimental results on four existing network traffic datasets show that the best average accuracy exceeds 96% on all datasets and that the best stability reaches 99.50%.
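The three steps can be illustrated with a minimal sketch, given here only as one plausible reading of the pipeline and not as the implementation evaluated in this work. Everything specific in it is an assumption: scikit-learn as the toolkit, PCA and truncated SVD as the dimensionality reduction methods, self-training wrappers around a decision tree and Gaussian naive Bayes as the heterogeneous semi-supervised learners, logistic regression as the meta-classifier, and synthetic data in place of real traffic flows. The helper names `make_views`, `make_learners` and `multi_view_stacking` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import GaussianNB
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.tree import DecisionTreeClassifier


def make_views(X, n_components=5, seed=0):
    """Step 1: project the raw features into several low-dimensional views."""
    reducers = [PCA(n_components=n_components, random_state=seed),
                TruncatedSVD(n_components=n_components, random_state=seed)]
    return [reducer.fit_transform(X) for reducer in reducers]


def make_learners(seed=0):
    """Step 2: fresh semi-supervised learners (assumed: self-training wrappers)."""
    tree = DecisionTreeClassifier(max_depth=5, random_state=seed)
    return [SelfTrainingClassifier(tree), SelfTrainingClassifier(GaussianNB())]


def multi_view_stacking(X, y, n_folds=5, seed=0):
    """y marks unlabelled flows with -1; returns their indices and predictions."""
    labelled = np.flatnonzero(y != -1)
    unlabelled = np.flatnonzero(y == -1)
    n_classes = len(np.unique(y[labelled]))
    views = make_views(X, seed=seed)
    n_learners = len(make_learners(seed))
    width = len(views) * n_learners * n_classes
    meta_train = np.zeros((len(labelled), width))
    meta_test = np.zeros((len(unlabelled), width))
    folds = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)

    col = 0
    for view in views:
        for k in range(n_learners):
            cols = slice(col, col + n_classes)
            # Step 3a: out-of-fold class probabilities for the labelled flows.
            for _, held in folds.split(labelled, y[labelled]):
                held_rows = labelled[held]
                keep = np.setdiff1d(np.arange(len(y)), held_rows)
                model = make_learners(seed)[k].fit(view[keep], y[keep])
                meta_train[held, cols] = model.predict_proba(view[held_rows])
            # Step 3b: metadata for unlabelled flows from a model fit on all data.
            model = make_learners(seed)[k].fit(view, y)
            meta_test[:, cols] = model.predict_proba(view[unlabelled])
            col += n_classes

    # Step 3c: the meta-classifier learns from out-of-fold metadata only.
    meta = LogisticRegression(max_iter=1000).fit(meta_train, y[labelled])
    return unlabelled, meta.predict(meta_test)


# Synthetic stand-in for traffic flows; roughly 20% of the labels are kept.
X, y_true = make_classification(n_samples=600, n_features=20, n_informative=8,
                                n_classes=3, random_state=0)
rng = np.random.default_rng(0)
y = y_true.copy()
y[rng.random(len(y)) > 0.2] = -1
idx, pred = multi_view_stacking(X, y)
accuracy = float(np.mean(pred == y_true[idx]))
```

Training the meta-classifier only on out-of-fold probabilities keeps it from seeing predictions that a base learner made on flows whose labels it was fitted on, which is the usual motivation for cross validation in stacking.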