A continuously independent data type is one in which, when contiguous elements of the destination matrix are computed, the source-matrix elements used are also contiguous and have no dependencies among them. A memory-access-intensive function is one that performs little computation but many data transfer operations. Taking the bitwise function as an example, this paper studies and implements parallelization and optimization methods for continuously independent data and memory-access-intensive functions on GPU platforms. Based on the OpenCL framework, it studies and compares several optimization methods, including vectorization, thread organization, and instruction selection, and uses them to port the bitwise function across different platforms. The function's execution time, excluding data transfer, was tested on both AMD and NVIDIA GPU platforms. The implementation runs 40 times faster than the CPU version in the OpenCV library on the AMD Radeon HD 5850, 90 times faster on the AMD Radeon HD 7970, and 60 times faster on the NVIDIA Tesla C2050. On the NVIDIA Tesla C2050, the speedup over the CUDA version in the OpenCV library is 1.5.
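
The data-independence property and the vectorization idea described in this abstract can be illustrated with a minimal CPU-side sketch. It is not the paper's OpenCL kernel: the function names and the 16-byte width (chosen to resemble an OpenCL `uchar16` load) are assumptions made only for illustration. Each loop iteration stands in for one GPU work-item; because every destination element depends only on the source elements at the same position, the iterations could run in any order or in parallel.

```python
def bitwise_and_scalar(src1: bytes, src2: bytes) -> bytes:
    """One hypothetical work-item per byte: dst[i] uses only src1[i] and src2[i]."""
    if len(src1) != len(src2):
        raise ValueError("source buffers must have the same length")
    return bytes(a & b for a, b in zip(src1, src2))


def bitwise_and_vectorized(src1: bytes, src2: bytes, width: int = 16) -> bytes:
    """One hypothetical work-item per `width` contiguous bytes.

    Each iteration loads a contiguous chunk from both sources, combines the
    chunks with a single wide AND, and stores a contiguous chunk, which
    mimics a vector load/store. The width of 16 is an assumption.
    """
    if len(src1) != len(src2):
        raise ValueError("source buffers must have the same length")
    dst = bytearray(len(src1))
    for start in range(0, len(src1), width):
        end = min(start + width, len(src1))  # the last chunk may be shorter
        a = int.from_bytes(src1[start:end], "little")
        b = int.from_bytes(src2[start:end], "little")
        dst[start:end] = (a & b).to_bytes(end - start, "little")
    return bytes(dst)
```

Both functions return the same result for any pair of equal-length buffers. The vectorized form issues fewer and wider memory operations per element, which is the kind of saving that matters for a function dominated by data movement rather than arithmetic.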