Monday 3 August 2015

Week 10 - 2D image writing

Last week I implemented 2D image writing. As I mentioned earlier, the Catalyst driver compiles the write_image* functions to MEM_RAT STORE_TYPED, since this instruction performs format conversion if the RAT is configured correctly. Previously I couldn't configure the RATs correctly, but last week I managed to make it work.
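To illustrate what "format conversion" means here, the following is a minimal sketch of the work that would otherwise have to be done in software when writing a float pixel to an image with 8-bit normalized (UNORM) channels. The function names are hypothetical, and the rounding is a simplification (half-up); the exact rule used by the hardware or required by OpenCL may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Convert one normalized float channel to an 8-bit UNORM value.
 * Input is clamped to [0, 1]; rounding is half-up for simplicity
 * (an assumption, not necessarily what the hardware does). */
static uint8_t float_to_unorm8(float v)
{
    if (!(v > 0.0f))        /* negative values, zero and NaN map to 0 */
        return 0;
    if (v > 1.0f)
        return 255;
    return (uint8_t)(v * 255.0f + 0.5f);
}

/* Pack an RGBA float pixel into four bytes, R first. */
static void pack_rgba_unorm8(const float rgba[4], uint8_t out[4])
{
    assert(rgba && out);
    for (int i = 0; i < 4; i++)
        out[i] = float_to_unorm8(rgba[i]);
}
```

With a typed store, this per-channel conversion is left to the hardware, so the kernel only has to supply the float values and the coordinates.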

On the LLVM side, the STORE_TYPED instruction has been added to the AMDGPU backend along with the new llvm.r600.rat.write.typed intrinsic (commit). The write_image* functions in libclc can simply use the new intrinsic (commit).

The RAT configuration in r600g consists of setting up the RAT and RESOURCE fields of the CB_COLOR*_INFO registers. For some reason, the CB_COLOR*_DIM registers weren't set correctly, so this had to be added too. See this commit.

There was one unexpected problem though: on my hardware, the LINEAR_ALIGNED array mode doesn't work well with the TEXTURE_2D resource type for RATs, again for an unknown reason. More precisely, the written data appeared at the wrong locations. My previous attempt to use STORE_TYPED did not work because the driver always chose the LINEAR_ALIGNED array mode, even for images. My workaround is to force a tiled array mode on texture compute resources for r600g hardware (r600, r700, evergreen, northern islands). See this commit.

Along with the RAT configuration in the r600g driver, a few minor changes had to be made to clover too. One such change concerns mapping GPU resources to a CPU-accessible location. The transfer region is a potentially multi-dimensional (2D or 3D) box that was previously flattened to a linear offset and size. This information is insufficient for tiled textures: the driver has to know the region's dimensions. See this commit for details.
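A minimal sketch of why the flattened form loses information, assuming a simple linear layout with known row and slice pitches. The struct and function names are hypothetical and are not taken from clover or gallium.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 3D transfer region: origin and extent in pixels. */
struct box {
    size_t x, y, z;
    size_t width, height, depth;
};

/* Flattened description of the same region: a plain byte range. */
struct linear_range {
    size_t offset;
    size_t size;
};

/* Flatten a box to the byte range it spans in a linear layout. */
static struct linear_range flatten_box(const struct box *b, size_t pixel_size,
                                       size_t row_pitch, size_t slice_pitch)
{
    assert(b->width > 0 && b->height > 0 && b->depth > 0);
    struct linear_range r;
    r.offset = b->x * pixel_size + b->y * row_pitch + b->z * slice_pitch;
    /* Distance from the first byte to one past the last byte of the box. */
    r.size = (b->depth - 1) * slice_pitch + (b->height - 1) * row_pitch +
             b->width * pixel_size;
    return r;
}

/* Number of bytes that actually belong to the box. */
static size_t box_bytes(const struct box *b, size_t pixel_size)
{
    return b->width * b->height * b->depth * pixel_size;
}
```

For a 2x2 box at (1, 1) in an image with 4-byte pixels and a 32-byte row pitch, the flattened range is 40 bytes long although only 16 bytes belong to the box. The range alone says nothing about width and height, and in a tiled layout the pixels of the box are not laid out row by row inside such a range.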

Another problem was that, upon transfer, the driver may force a specific row and slice pitch for the mapped data, and clover ignored this information. Ignoring it was harmless for linear buffers, but caused problems for tiled ones. See this commit.
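The pitch issue can be sketched as follows: when the mapped data has a row pitch larger than the tightly packed row size, a copy has to step through the source row by row rather than treat it as one contiguous block. This is a generic 2D illustration with a hypothetical helper name, not the actual clover code; a 3D copy would add an outer loop over slices using the slice pitch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `height` rows of `row_bytes` bytes each, honouring separate
 * source and destination pitches (distance in bytes between the
 * starts of two consecutive rows). */
static void copy_rows(void *dst, size_t dst_pitch,
                      const void *src, size_t src_pitch,
                      size_t row_bytes, size_t height)
{
    assert(row_bytes <= dst_pitch && row_bytes <= src_pitch);
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t y = 0; y < height; y++)
        memcpy(d + y * dst_pitch, s + y * src_pitch, row_bytes);
}
```

If the source pitch were ignored and assumed equal to the packed row size, every row after the first would be read from the wrong offset.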

One remaining TODO is to add piglit tests that check the image writing functionality.