Parallelization and Optimization of Continuously Data-Independent, Memory-Access-Intensive Functions Using OpenCL

Keywords: GPU, OpenCL, Vectorization, ROI


Abstract:

A function operates on continuously independent data when contiguous elements of the destination matrix are computed from contiguous elements of the source matrices and there is no dependence among those elements. A memory-access-intensive function is one that performs little computation but many data transfer operations. Taking the bitwise function as an example, this paper studies and implements parallelization and optimization methods for continuously data-independent, memory-access-intensive functions on GPU platforms. Based on the OpenCL framework, the paper studies and compares several optimization methods, such as vectorization, thread organization, and instruction selection, and uses them to port the bitwise function across different platforms. The function's execution time, excluding data transfer, was measured on both AMD and NVIDIA GPU platforms. Relative to the CPU version in the OpenCV library, the function runs 40 times faster on the AMD Radeon HD 5850, 90 times faster on the AMD Radeon HD 7970, and 60 times faster on the NVIDIA Tesla C2050. On the NVIDIA Tesla C2050, the speedup over the CUDA version in the OpenCV library is 1.5.
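
The idea of continuously independent data and vectorized access can be illustrated with a minimal CPU-side sketch in plain C. This is not the paper's OpenCL kernel; the function names `bitwise_and_scalar` and `bitwise_and_wide` are hypothetical, and the 4-byte word is only an assumed stand-in for a vector type such as OpenCL's `uchar4`. Because each output byte depends only on the input bytes at the same index, several adjacent bytes can be loaded, combined, and stored in a single step.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar baseline: one byte loaded from each source and one byte stored
 * per iteration. dst[i] depends only on a[i] and b[i]. */
void bitwise_and_scalar(const uint8_t *a, const uint8_t *b,
                        uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] & b[i];
}

/* Word-wide variant (illustrative assumption, not the paper's code):
 * four adjacent bytes are processed per iteration, so the number of
 * memory operations drops by roughly a factor of four. This is valid
 * only because the elements are contiguous and mutually independent. */
void bitwise_and_wide(const uint8_t *a, const uint8_t *b,
                      uint8_t *dst, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        uint32_t wa, wb, wd;
        memcpy(&wa, a + i, sizeof wa);   /* memcpy avoids unaligned access */
        memcpy(&wb, b + i, sizeof wb);
        wd = wa & wb;                    /* four byte-wise ANDs in one op */
        memcpy(dst + i, &wd, sizeof wd);
    }
    /* Tail: remaining 0 to 3 bytes handled one at a time. */
    for (; i < n; ++i)
        dst[i] = a[i] & b[i];
}
```

Both functions produce identical output for any length `n`; the wide variant simply issues fewer, larger memory accesses, which is the property that matters for a function dominated by data movement rather than arithmetic.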
