Abstract
Conference Title: 2014 IEEE Congress on Evolutionary Computation (CEC). Conference Start Date: 2014, July 6. Conference End Date: 2014, July 11. Conference Location: Beijing, China.
Standard regression models are presented with n samples from an input space X, composed of observational data of the form (xi, y(xi)), i = 1...n, where each xi denotes a k-dimensional input vector of design variables and y is the response. When k ≫ n, high variance and over-fitting become a major concern. In this paper we propose a novel approach to mitigate this problem by transforming the input vectors into new, smaller vectors (called the Z set) using only a set of simple statistical moments. A Genetic Algorithm (GA) is used to evolve the transformation procedure: it searches for an optimal sequence of statistical moments and their input parameters. We use Linear Regression (LR) as an example to quantify the quality of the evolved transformation procedure. Empirical evidence, collected from benchmark functions and real-world problems, demonstrates that the proposed transformation approach substantially improves LR generalisation and allows it to outperform other state-of-the-art regression models such as Genetic Programming, Kriging, and Radial Basis Function Networks. In addition, we present an analysis to shed light on the statistical moments that are most useful for the transformation process.
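As an illustration only, the transformation step described above can be sketched as follows. The abstract does not list the statistical moments or define their input parameters, so the `MOMENTS` catalogue, the `(name, start, end)` encoding, and the `transform` helper are all hypothetical. The procedure here is written by hand, whereas in the paper it is evolved by a GA.

```python
import statistics

# Hypothetical catalogue of simple statistics. The set actually used in the
# paper is not given in the abstract.
MOMENTS = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "pstdev": statistics.pstdev,
    "min": min,
    "max": max,
}


def transform(x, procedure):
    """Map a k-dimensional input vector x to a shorter vector z.

    `procedure` is a list of (moment_name, start, end) triples. Each triple
    applies one statistic to the slice x[start:end]. The slice bounds stand in
    for the "input parameters" mentioned in the abstract and are an assumption.
    """
    return [MOMENTS[name](x[start:end]) for name, start, end in procedure]


# Hand-written procedure (assumption); a GA would search over such sequences.
procedure = [("mean", 0, 6), ("pstdev", 0, 6), ("max", 2, 6)]

# Two toy input vectors with k = 6.
X = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
]

# The Z set: each 6-dimensional input becomes a 3-dimensional vector.
Z = [transform(x, procedure) for x in X]
```

A regression model such as LR would then be fitted on `Z` instead of `X`, so the number of coefficients depends on the length of the procedure rather than on k.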