Abstract
Correlation-based exploration of time series data is a key ingredient of many analysis tasks. However, such exploration entails massive CPU and I/O costs due to the quadratic nature of the exploration space. One such task is searching for a time sub-interval in which the correlation of every time series pair falls within specified values, a task with applications in many domains. In this paper, we formulate the Targeted Correlation Matrix Search problem, where the goal is to find an optimal sub-interval whose correlation matrix maximises closeness and similarity to targeted pairwise correlation values. We show the computational hardness of this problem and propose the RELATE scheme, which addresses the associated challenges by utilising the incremental property of correlation. Further, we propose two-level pruning techniques for RELATE to minimise the associated computational and I/O costs. These techniques enable RELATE to avoid exhaustively traversing the search space by pruning unqualified candidate queries, and to avoid computing the correlation of every time series pair wherever possible. Experiments on real and synthetic data sets demonstrate the performance gains of RELATE over state-of-the-art algorithms.
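To make the search space concrete, the following is a minimal sketch of an exhaustive baseline; it is not the RELATE scheme and applies no pruning. It assumes Pearson correlation and a sum-of-squared-differences distance to the targeted values, neither of which is confirmed by the text above, and the function names (`build_prefix_sums`, `window_correlation`, `exhaustive_search`) are hypothetical. Prefix sums are used here as one simple way to illustrate how correlation over a sub-interval can be obtained incrementally rather than recomputed from scratch.

```python
from itertools import combinations
from math import sqrt


def build_prefix_sums(series):
    """Precompute prefix sums of x, x*x, and x*y for every series and pair."""
    n = len(series[0])
    m = len(series)
    s1 = [[0.0] * (n + 1) for _ in range(m)]  # running sums of x
    s2 = [[0.0] * (n + 1) for _ in range(m)]  # running sums of x*x
    for i in range(m):
        for t, x in enumerate(series[i]):
            s1[i][t + 1] = s1[i][t] + x
            s2[i][t + 1] = s2[i][t] + x * x
    sxy = {}  # running sums of x*y, keyed by pair (i, j) with i < j
    for i, j in combinations(range(m), 2):
        acc = [0.0] * (n + 1)
        for t in range(n):
            acc[t + 1] = acc[t] + series[i][t] * series[j][t]
        sxy[(i, j)] = acc
    return s1, s2, sxy


def window_correlation(sums, i, j, start, end):
    """Pearson correlation of series i and j on the half-open window [start, end)."""
    s1, s2, sxy = sums
    w = end - start
    sx = s1[i][end] - s1[i][start]
    sy = s1[j][end] - s1[j][start]
    sxx = s2[i][end] - s2[i][start]
    syy = s2[j][end] - s2[j][start]
    sp = sxy[(i, j)][end] - sxy[(i, j)][start]
    var_x = w * sxx - sx * sx
    var_y = w * syy - sy * sy
    if var_x <= 0 or var_y <= 0:
        return 0.0  # assumption: a constant window is treated as uncorrelated
    return (w * sp - sx * sy) / sqrt(var_x * var_y)


def exhaustive_search(series, target, min_len=3):
    """Scan every sub-interval and return (distance, start, end) of the best one.

    `target` maps each pair (i, j), i < j, to its targeted correlation value.
    The distance is an assumed sum of squared differences over all pairs.
    """
    sums = build_prefix_sums(series)
    n = len(series[0])
    pairs = list(combinations(range(len(series)), 2))
    best = None
    for start in range(n - min_len + 1):
        for end in range(start + min_len, n + 1):
            dist = sum(
                (window_correlation(sums, i, j, start, end) - target[(i, j)]) ** 2
                for i, j in pairs
            )
            if best is None or dist < best[0]:
                best = (dist, start, end)
    return best


if __name__ == "__main__":
    a = [1, 2, 3, 3, 2, 1]
    b = [1, 2, 3, 1, 2, 3]
    # Look for the sub-interval where the two series are most anti-correlated.
    print(exhaustive_search([a, b], {(0, 1): -1.0}))
```

Even with constant-time correlation per pair, this baseline visits a number of sub-intervals quadratic in the series length and evaluates every pair in each one, which is the cost that pruning of candidate queries and of pairwise computations is meant to reduce.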