Data Layout Transformation for Structured-Grid Codes on GPU
November 24, 2010
We present data layout transformation as an effective performance optimization for memory-bound structured grid applications for GPUs. Structured grid applications are a class of applications that compute grid cell values on a regular 2D, 3D, or higher-dimensional grid. Each output point is computed as a function of itself and its nearest neighbors. Stencil code is an instance of this application class. Examples of structured grid applications include fluid dynamics and heat distribution codes that solve partial differential equations with an iterative solver on a dense multidimensional array. Using the information available through variable-length array syntax, standardized in C99 and other modern languages, we have enabled automatic data layout transformations for structured grid codes with dynamic array sizes. We first present a formulation that enables automatic data layout transformations for structured grid code in CUDA. We then model the DRAM banking and interleaving scheme of the GTX280 GPU through microbenchmarking. We developed a layout transformation methodology that guides layout transformations to statically choose a good layout given a model of the memory system. The transformation, which distributes concurrent memory requests evenly across DRAM channels and banks, provides substantial speedup for structured grid applications by improving their memory-level parallelism.
I-Jui Sung, Wen-Mei Hwu, University of Illinois at Urbana-Champaign
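
To make the idea of a layout transformation concrete, here is a minimal host-side sketch in C99. It is not the authors' formulation: the tile width TILE, the chosen physical ordering, and every function name below are hypothetical, picked only to show how a logical grid[nz][ny][nx] declared with variable-length array syntax can be remapped to a different physical layout while the stencil arithmetic stays the same.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile width. In practice this would be chosen from a
 * model of the memory system, not hard-coded. */
enum { TILE = 4 };

/* Default C layout for a logical grid[nz][ny][nx]: x varies fastest. */
size_t offset_row_major(size_t nz, size_t ny, size_t nx,
                        size_t z, size_t y, size_t x)
{
    (void)nz;
    return (z * ny + y) * nx + x;
}

/* Illustrative transformed layout (an assumption, not the paper's):
 * x is split into (x / TILE, x % TILE) and the physical order becomes
 * [x / TILE][z][y][x % TILE]. Requires nx to be a multiple of TILE. */
size_t offset_tiled(size_t nz, size_t ny, size_t nx,
                    size_t z, size_t y, size_t x)
{
    assert(nx % TILE == 0);
    return (((x / TILE) * nz + z) * ny + y) * TILE + x % TILE;
}

/* Copy a grid declared with C99 variable-length array syntax into a
 * flat buffer that uses the transformed layout. */
void to_tiled(size_t nz, size_t ny, size_t nx,
              float src[nz][ny][nx], float *dst)
{
    for (size_t z = 0; z < nz; ++z)
        for (size_t y = 0; y < ny; ++y)
            for (size_t x = 0; x < nx; ++x)
                dst[offset_tiled(nz, ny, nx, z, y, x)] = src[z][y][x];
}

/* 7-point stencil over interior points, reading and writing buffers in
 * the transformed layout. Only the index mapping differs from a
 * row-major version; boundary points of out are left untouched. */
void stencil7_tiled(size_t nz, size_t ny, size_t nx,
                    const float *in, float *out)
{
    for (size_t z = 1; z + 1 < nz; ++z)
        for (size_t y = 1; y + 1 < ny; ++y)
            for (size_t x = 1; x + 1 < nx; ++x) {
                float sum = in[offset_tiled(nz, ny, nx, z, y, x)]
                          + in[offset_tiled(nz, ny, nx, z, y, x - 1)]
                          + in[offset_tiled(nz, ny, nx, z, y, x + 1)]
                          + in[offset_tiled(nz, ny, nx, z, y - 1, x)]
                          + in[offset_tiled(nz, ny, nx, z, y + 1, x)]
                          + in[offset_tiled(nz, ny, nx, z - 1, y, x)]
                          + in[offset_tiled(nz, ny, nx, z + 1, y, x)];
                out[offset_tiled(nz, ny, nx, z, y, x)] = sum / 7.0f;
            }
}
```

In this sketch, elements within one x tile remain adjacent in memory, while consecutive x tiles sit nz * ny * TILE elements apart. Whether such a mapping actually spreads concurrent requests across DRAM channels and banks depends on the hardware's address interleaving, which is what the microbenchmark-derived memory model described above is meant to capture.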