CUDA Papers

A collection of research papers and projects utilizing CUDA technology

Category Archives: FFT

High Performance Discrete Fourier Transforms on Graphics Processors

http://portal.acm.org/ft_gateway.cfm?id=1413373&type=pdf&doid2=1413370.1413373
http://www2.computer.org/portal/c/document_library/get_file?folderId=97697&name=DLFE-3346.pdf

Abstract: We present novel algorithms for computing Fourier transforms with high performance on GPUs. We present hierarchical, mixed radix FFT algorithms for both power-of-two and non-power-of-two sizes. Our hierarchical FFT algorithms efficiently exploit shared memory on GPUs using a Stockham formulation. We reduce the memory transpose overheads in hierarchical algorithms by combining the […]
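For readers unfamiliar with the Stockham formulation mentioned in the abstract, here is a minimal CPU sketch of a radix-2 Stockham autosort FFT. It is not the paper's hierarchical, mixed radix GPU algorithm: the function name `stockham_fft` and the restriction to power-of-two lengths are assumptions made for illustration. The point it shows is that each pass reads from one buffer and writes to another in an order that leaves the final result in natural order, so no separate bit-reversal permutation is needed.

```python
import cmath


def stockham_fft(signal):
    """Radix-2 Stockham autosort FFT (illustrative sketch, power-of-two lengths only)."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    x = [complex(v) for v in signal]  # buffer read by the current pass
    y = [0j] * n                      # buffer written by the current pass

    length, stride = n, 1             # sub-transform size and element stride
    while length > 1:
        half = length // 2
        for p in range(half):
            # Twiddle factor for butterfly p of a size-`length` transform.
            w = cmath.exp(-2j * cmath.pi * p / length)
            for q in range(stride):
                a = x[q + stride * p]
                b = x[q + stride * (p + half)]
                # Outputs are written interleaved, which performs the
                # reordering that an in-place FFT does via bit reversal.
                y[q + stride * (2 * p)] = a + b
                y[q + stride * (2 * p + 1)] = (a - b) * w
        x, y = y, x                   # ping-pong the two buffers
        length, stride = half, stride * 2
    return x
```

Calling `stockham_fft([1, 0, 0, 0])` transforms a unit impulse, whose spectrum is flat. On a GPU the two ping-pong buffers would plausibly live in shared memory, which fits the abstract's description, but that mapping is not shown here.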

Bandwidth Intensive 3-D FFT kernel for GPUs using CUDA

http://portal.acm.org/ft_gateway.cfm?id=1413376&type=pdf&doid2=1413370.1413376
http://www2.computer.org/portal/c/document_library/get_file?folderId=97697&name=DLFE-3317.pdf

Abstract: Most GPU performance “hypes” have focused around tightly-coupled applications with small memory bandwidth requirements, e.g., N-body, but GPUs are also commodity vector machines sporting substantial memory bandwidth; however, effective programming methodologies thereof have been poorly studied. Our new 3-D FFT kernel, written in NVIDIA CUDA, achieves nearly 80 GFLOPS on a top-end […]