Hello,
I am trying to do matrix multiplication on a 2D array allocated with pitch.
I am able to load the 2D array onto the GPU using cudaMallocPitch() and cudaMemcpy2D(), but I am not able to write the multiplication code correctly.
The output I am getting is wrong.
Can anyone help me with the code?
Here is the kernel code I have written:
//---code for matrix multiplication using pitch---
float Pvalue = 0;
xid = blockIdx.x * blockDim.x + threadIdx.x;
yid = blockIdx.y * blockDim.y + threadIdx.y;
for (int k = 0; k < N; ++k) {  // D = T * M
    float Melement = T[yid * pitch + k];
    float Nelement = M[k + xid * pitch];
    Pvalue += Melement * Nelement;
}
D[yid * pitch + xid] = Pvalue;
__syncthreads();
//---------
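One likely cause, assuming `pitch` is the value returned by cudaMallocPitch(): that value is in bytes, not in elements, so `T[yid * pitch + k]` on a `float*` steps too far. The row address has to be computed on a `char*` first and then cast back. The following is only a sketch under that assumption, for square N x N row-major matrices. The names `rowPtr`, `matMulPitched` and `multiplyPitched` are made up for illustration, and each buffer gets its own pitch argument because the three allocations are not guaranteed to share one.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: start of row `row` in a pitched allocation.
// `pitch` is in bytes, so the offset is applied through a char pointer.
__device__ const float* rowPtr(const float* base, size_t pitch, int row) {
    return (const float*)((const char*)base + (size_t)row * pitch);
}

// D = T * M for N x N row-major matrices stored in pitched memory.
__global__ void matMulPitched(const float* T, size_t pitchT,
                              const float* M, size_t pitchM,
                              float* D, size_t pitchD, int N) {
    int xid = blockIdx.x * blockDim.x + threadIdx.x;  // column of D
    int yid = blockIdx.y * blockDim.y + threadIdx.y;  // row of D
    if (xid >= N || yid >= N) return;                 // guard partial blocks

    const float* Trow = rowPtr(T, pitchT, yid);
    float Pvalue = 0.0f;
    for (int k = 0; k < N; ++k) {
        // T[yid][k] * M[k][xid]: row k of M, column xid
        Pvalue += Trow[k] * rowPtr(M, pitchM, k)[xid];
    }
    float* Drow = (float*)((char*)D + (size_t)yid * pitchD);
    Drow[xid] = Pvalue;
}

// Hypothetical host wrapper: hT, hM, hD are densely packed N x N host arrays.
cudaError_t multiplyPitched(const float* hT, const float* hM, float* hD, int N) {
    float *dT = nullptr, *dM = nullptr, *dD = nullptr;
    size_t pitchT = 0, pitchM = 0, pitchD = 0;
    size_t rowBytes = (size_t)N * sizeof(float);

    cudaError_t err = cudaMallocPitch((void**)&dT, &pitchT, rowBytes, N);
    if (err == cudaSuccess) err = cudaMallocPitch((void**)&dM, &pitchM, rowBytes, N);
    if (err == cudaSuccess) err = cudaMallocPitch((void**)&dD, &pitchD, rowBytes, N);
    if (err == cudaSuccess)
        err = cudaMemcpy2D(dT, pitchT, hT, rowBytes, rowBytes, N, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy2D(dM, pitchM, hM, rowBytes, rowBytes, N, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(16, 16);
        dim3 grid((N + block.x - 1) / block.x, (N + block.y - 1) / block.y);
        matMulPitched<<<grid, block>>>(dT, pitchT, dM, pitchM, dD, pitchD, N);
        err = cudaGetLastError();  // reports launch errors
    }
    if (err == cudaSuccess)  // blocking copy, so it also waits for the kernel
        err = cudaMemcpy2D(hD, rowBytes, dD, pitchD, rowBytes, N, cudaMemcpyDeviceToHost);

    cudaFree(dT);
    cudaFree(dM);
    cudaFree(dD);
    return err;
}
```

Compared with the kernel above, the sketch changes three things: every row offset is taken in bytes; the second operand is read as row k, column xid (the original `M[k + xid * pitch]` addresses row xid, column k, which is the transpose); and the trailing __syncthreads() is dropped, since no shared memory is used and nothing needs to be synchronized.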
Any help would be appreciated.
Thanks in advance.