Abstract
In this paper, we present a comparative performance study of a 2D Gaussian blur kernel mapped to a heterogeneous multi-core CPU/GPU platform. The kernel workgroup size, the Gaussian kernel size, and the image size are treated as variable parameters. We aim to gain insight into how the execution and data movement times evolve on each computing device as the values of these parameters vary. Kernel profiling information is extracted from two mappings onto the OpenCL execution model: a CPU-only mapping on a quad-core Intel CPU and a GPU-only mapping on an AMD Radeon 7700 GPU. Simulation results show that for small values of these parameters a multi-core CPU implementation is beneficial, whereas for larger values a GPU-based platform is advantageous.