Abstract
Conference Title: 2017 International Conference on Engineering & MIS (ICEMIS)
Conference Start Date: 2017, May 8
Conference End Date: 2017, May 10
Conference Location: Monastir, Tunisia

With the rapid development of multimedia technologies and network communication, parallel architectures such as the Graphics Processing Unit (GPU) have been introduced into high-performance computing. However, programming a GPU and obtaining the best execution time remain largely an art. In this paper, we search for the number of threads and blocks best suited to computing a 64×64 Prediction Unit (PU64) in High Efficiency Video Coding (HEVC). The study is carried out with the Compute Unified Device Architecture (CUDA), with the aim of minimizing GPU execution time. Experimental results show that the best grid topology for running the GPU kernel is 128 blocks of 32 threads. This configuration gives the minimum GPU execution time, with a speed-up of around 50% relative to the CPU execution.
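
As a rough illustration of the kind of search space such a study covers, the sketch below enumerates candidate (blocks, threads) pairs for one PU64. It assumes one thread per sample and power-of-two thread counts up to 1024 threads per block; the abstract does not state how the search was defined, so these choices and the name `candidate_topologies` are hypothetical. Python is used only for the arithmetic, and no CUDA kernel is launched or timed.

```python
# Hypothetical sketch: list launch configurations for one 64x64 Prediction Unit.
# Assumption (not stated in the abstract): one GPU thread per PU sample.

PU_SIZE = 64
SAMPLES = PU_SIZE * PU_SIZE          # 4096 samples in a PU64
MAX_THREADS_PER_BLOCK = 1024         # per-block limit on current CUDA devices


def candidate_topologies(samples=SAMPLES, max_threads=MAX_THREADS_PER_BLOCK):
    """Return (blocks, threads) pairs whose product equals `samples`,
    restricted to power-of-two thread counts up to `max_threads`."""
    pairs = []
    threads = 1
    while threads <= max_threads:
        if samples % threads == 0:
            pairs.append((samples // threads, threads))
        threads *= 2
    return pairs


topologies = candidate_topologies()

for blocks, threads in topologies:
    # Each pair covers the PU exactly: blocks * threads == 4096.
    print(f"{blocks:5d} blocks x {threads:4d} threads")
```

Under these assumptions, the reported topology of 128 blocks and 32 threads appears among the candidates, since 128 × 32 = 4096 = 64 × 64. Choosing among the candidates still requires timing the kernel on the target GPU, which this sketch does not attempt.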