c++ - CUDA statically allocating data on device
I've been trying to allocate a variable that can be accessed by each kernel function. My attempt is the code attached below, but it won't compile because darray can't be viewed or accessed by the kernel. In C++ you would place the variable at the top or declare it static so that it can be accessed in every scope throughout the program.
__global__ void storethreadnumber()
{
    darray[threadIdx.x] = threadIdx.x;
}

int main( int argc, char** argv)
{
    unsigned __int8 array[16] = { 0 };
    unsigned __int8 darray[16];

    for( __int8 position = 0; position < 16; position++)
        cout << array[position] << " ";
    cout << endl;

    cudaMalloc((void**) darray, 16*sizeof(__int8));
    cudaMemcpy( darray, array, 16*sizeof(__int8), cudaMemcpyHostToDevice);

    storethreadnumber<<<1, 16>>>();

    cudaMemcpy( array, darray, 16*sizeof(__int8), cudaMemcpyDeviceToHost);

    for( __int8 position = 0; position < 16; position++)
        cout << array[position] << " ";
    cout << endl;

    cudaFree(darray);
}
You can have global variables in CUDA, of type __device__ or __constant__. So, for example, if you initialize a __constant__ pointer variable to the address of a device pointer using cudaMemcpyToSymbol(), you can then access that pointer via the __constant__ variable:
__constant__ int* darrayptr;

__global__ void storethreadnumber()
{
    darrayptr[threadIdx.x] = threadIdx.x;
}
Just make sure you correctly initialize darrayptr from host code before you run the kernel.
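For reference, here is a minimal sketch of what that host-side initialization might look like. It is not the asker's original program: it assumes int elements rather than __int8, uses printf instead of cout, and the array size of 16 is taken from the question. The key step is that cudaMemcpyToSymbol() copies the pointer value itself (the address returned by cudaMalloc) into the __constant__ symbol, not the data it points to.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Pointer stored in constant memory; visible to every kernel in this file.
__constant__ int* darrayptr;

__global__ void storethreadnumber()
{
    // The pointer is constant, but the memory it points to is ordinary
    // global memory, so writing through it is allowed.
    darrayptr[threadIdx.x] = threadIdx.x;
}

int main()
{
    const int n = 16;          // size taken from the question
    int array[n] = { 0 };
    int* darray = nullptr;     // ordinary device pointer held on the host

    // Note the address-of operator: cudaMalloc writes the device address into darray.
    cudaMalloc((void**)&darray, n * sizeof(int));
    cudaMemcpy(darray, array, n * sizeof(int), cudaMemcpyHostToDevice);

    // Copy the pointer value into the __constant__ symbol before launching.
    cudaMemcpyToSymbol(darrayptr, &darray, sizeof(darray));

    storethreadnumber<<<1, n>>>();

    // Wait for the kernel and report any launch or execution error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(darray);
        return 1;
    }

    cudaMemcpy(array, darray, n * sizeof(int), cudaMemcpyDeviceToHost);

    for (int position = 0; position < n; position++)
        printf("%d ", array[position]);
    printf("\n");

    cudaFree(darray);
    return 0;
}
```

If the size is fixed at compile time, as it is here, a statically sized __device__ array declared at file scope would be another option, and it avoids the cudaMalloc and cudaMemcpyToSymbol steps entirely.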