1 Star 1 Fork 0

Jason/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA

加入 Gitee
与超过 1400万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
文件
克隆/下载
04_gpu_shared_memory.cu 1.24 KB
一键复制 编辑 原始数据 按行查看 历史
bhaumik2450 提交于 2018-06-04 13:44 +08:00 . Add files via upload
#include <stdio.h>
__global__ void gpu_shared_memory(float *d_a)
{
// Defining local variables which are private to each thread
int i, index = threadIdx.x;
float average, sum = 0.0f;
//Define shared memory
__shared__ float sh_arr[10];
sh_arr[index] = d_a[index];
__syncthreads(); // This ensures all the writes to shared memory have completed
for (i = 0; i<= index; i++)
{
sum += sh_arr[i];
}
average = sum / (index + 1.0f);
d_a[index] = average;
sh_arr[index] = average;
}
int main(int argc, char **argv)
{
//Define Host Array
float h_a[10];
//Define Device Pointer
float *d_a;
for (int i = 0; i < 10; i++) {
h_a[i] = i;
}
// allocate global memory on the device
cudaMalloc((void **)&d_a, sizeof(float) * 10);
// now copy data from host memory to device memory
cudaMemcpy((void *)d_a, (void *)h_a, sizeof(float) * 10, cudaMemcpyHostToDevice);
gpu_shared_memory << <1, 10 >> >(d_a);
// copy the modified array back to the host memory
cudaMemcpy((void *)h_a, (void *)d_a, sizeof(float) * 10, cudaMemcpyDeviceToHost);
printf("Use of Shared Memory on GPU: \n");
//Printing result on console
for (int i = 0; i < 10; i++) {
printf("The running average after %d element is %f \n", i, h_a[i]);
}
return 0;
}
Loading...
马建仓 AI 助手
尝试更多
代码解读
代码找茬
代码优化
1
https://gitee.com/jason921121/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA.git
git@gitee.com:jason921121/Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA.git
jason921121
Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA
Hands-On-GPU-Accelerated-Computer-Vision-with-OpenCV-and-CUDA
master

搜索帮助