Abstract
For most of the past five decades, the growing computational power of supercomputers has come primarily from a doubling of clock frequency every 18 months. Over this period, the clock rate increased by six orders of magnitude, while the number of processors increased by three orders of magnitude. The major challenge posed by the increasing scale and complexity of HPC systems is massive power consumption. Because of thermal constraints and the power requirements of today's microprocessors, vendors have shifted to placing multiple processors (cores) on a single chip. The number of cores per chip is expected to continue increasing exponentially over the next decade. One promising strategy is the appropriate use of parallel programming models that reduce power consumption and increase system performance through massive parallelism (concurrency). In this study, we propose a Hybrid MVAPICH-2 + CUDA (HMC) parallel programming model. We evaluated the HMC model by implementing a matrix multiplication benchmark, and it outperformed other state-of-the-art dual- and tri-level hierarchy approaches in both power consumption and execution time. Consequently, it can be considered a leading model for emerging Exascale computing systems.