Too Long; Didn't Read
CUDA is built for C and C++, so the most practical way to use it from Go is cgo: call an external function that wraps your CUDA kernel. I created a simple CUDA source file with two functions: vecmul(), the kernel, and a helper function that is called externally. The helper allocates memory on the GPU, copies the parameters to it, invokes the kernel, and copies the result back. Values are passed by reference. If you want to know more about CUDA programming, read my article.
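The four steps the helper performs can be sketched roughly as follows. This is a minimal illustration, not the article's actual code: the helper name run_vecmul, its signature, and the element-wise multiply inside vecmul() are assumptions made for the example. Only the CUDA runtime calls (cudaMalloc, cudaMemcpy, cudaGetLastError, cudaFree) are real API.

```cuda
#include <cuda_runtime.h>
#include <stddef.h>

// Kernel: each thread multiplies one pair of elements.
// Element-wise semantics are an assumption for this sketch.
__global__ void vecmul(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] * b[i];
    }
}

// Helper with C linkage so that cgo can find it by its plain name.
// The name run_vecmul is hypothetical. All buffers are passed by
// reference (pointers into host memory owned by the caller).
// Returns 0 on success, otherwise the CUDA error code.
extern "C" int run_vecmul(const float *a, const float *b, float *c, int n) {
    if (n <= 0) {
        return 0; // nothing to do; avoids a zero-block launch
    }

    float *d_a = NULL, *d_b = NULL, *d_c = NULL;
    size_t bytes = (size_t)n * sizeof(float);
    int threads = 256;
    int blocks = (n + threads - 1) / threads; // round up to cover all n

    // 1. Allocate memory on the GPU.
    cudaError_t err = cudaMalloc((void **)&d_a, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_b, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_c, bytes);

    // 2. Copy the input parameters from host to device.
    if (err == cudaSuccess) err = cudaMemcpy(d_a, a, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(d_b, b, bytes, cudaMemcpyHostToDevice);

    // 3. Invoke the kernel and check for launch errors.
    if (err == cudaSuccess) {
        vecmul<<<blocks, threads>>>(d_a, d_b, d_c, n);
        err = cudaGetLastError();
    }

    // 4. Copy the result back; this cudaMemcpy waits for the kernel to finish.
    if (err == cudaSuccess) err = cudaMemcpy(c, d_c, bytes, cudaMemcpyDeviceToHost);

    // cudaFree(NULL) is a no-op, so this is safe after a partial failure.
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_c);
    return (int)err;
}
```

One way to use a file like this is to compile it with nvcc into a shared library, declare the helper's prototype in the cgo preamble of a Go file, and link the library through a #cgo LDFLAGS directive. On the Go side, the address of each slice's first element would be passed to the helper, which is what "passed by reference" amounts to in practice.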