Monthly Archives: October 2013

Histogram on GPU using CUDA

The following sample demonstrates how to compute a histogram on a GPU. CUDA SDK has a histogram sample which works for 64 and 256 bins with code both running on GPU and CPU. SDK CPU code is using bit-manipulation which should be replaced with SSE2 instructions. This article demonstrates 3 simple, entry-level, examples showing the […]

Matrix Transpose on GPU using CUDA

The following sample demonstrates matrix transpose on GPU. It starts with sequential code on the CPU and progresses towards more advanced optimizations, first a parallel transformation on the CPU, then several transformations on the GPU. In real life, it is impractical to do just a single matrix operation on the GPU due to the cost […]

Red-eye Removal with CUDA

This code demonstrates basic steps for red-eye correction in pictures. It requires a picture of someone with red eyes and a small template file which is a picture of a red eye to help us find eyes in the original picture. While the sample works with the picture that I tested it with, it requires […]