Application Research of Computers (计算机应用研究), 2011
High performance FFT computation based on CUDA
Abstract:
The Fourier transform is essential to many image processing and scientific computing techniques. This paper presents a CUDA-based implementation that accelerates FFT computation. Based on an analysis of the GPU architecture and the parallelism of the algorithm, a multithreaded mapping strategy is proposed and optimizations of the memory hierarchy are explored. Experimental results show an average speedup of 2-6x over CUFFT, the FFT library supplied by NVIDIA.