Monday, 27 July 2015

Week 9 - A different approach for image attributes

Previously image attributes were passed as implicit parameters located at the end of the kernel input vector: clover appended the image metadata regardless of the target on which the kernel would run. It would be beneficial to allow more flexibility to clover, since some kind of targets may already have means of querying image dimensions or format, and uploading the attributes as implicit parameters is unnecessary for them. To this end the implicit arguments holding the image attributes are now added to the function signature at IR level immediately after the image argument to which they belong. Calls to attribute getter functions are replaced by reads from these new arguments. This transformation is implemented as an IR pass in the AMDGPU target. Since llvm function signatures can't be changed directly, the kernel functions are recreated with the new signature. The pass also handles image and (partially) sampler resource IDs.

Let's look at the implementation of the pass in a bit more detail. The pass is based on the opencl.kernels named metadata node; no changes are made to modules without proper metadata, and only the kernels listed under the opencl.kernels node are transformed.

For each function found in the node, the kernel_arg_type metadata is scanned for image2d_t and image3d_t argument types. For each of those kernel arguments, two new arguments are added immediately following the image argument: one of type <3 x i32> for the image size (width, height and depth), and one of type <2 x i32> for the image format (channel data type and channel order). The kernel function is recreated with the new signature and the body is copied over. Metadata is added to the new arguments; the implicit args are marked by the types __llvm_image_size and __llvm_image_format.

After the function with implicit arguments is in place, the uses of image and sampler arguments are scanned for the following function calls:

  • llvm.OpenCL.image.get.size*
  • llvm.OpenCL.image.get.format*
  • llvm.OpenCL.image.get.resource.id*
  • llvm.OpenCL.sampler.get.resource.id

The stars in the list above indicate that different image getter function has to be used for 2d and 3d images because of the type difference.

The resource IDs are determined by the index of the argument value within the kernel signature. Image arguments are grouped by access qualifier to read_only and write_only groups; the resource ID is the argument index within the group. E.g. in case of
__kernel void foo(read_only image2d_t a, read_only image3d_t b, write_only image2d_t c, write_only image3d_t d)
the resource ID of a and c is 0, and for b and d is 1. The resource IDs of sampler_t arguments are calculated similarly. Samplers declared as module globals or kernel locals are not handled yet.

Now, that the implicit arguments are present at IR level with proper metadata, clover is able to look for them, and upload the image attributes only if necessary. See this commit.

Note: because of the fact that kernels are recreated during the pass described above, this change is necessary in clover.

The implementation of the OpenCL builtins in libclc looks like this.

Sunday, 12 July 2015

Week 7 - Image reading

This week the main focus was on image reading, while also reworking previously sent patches about image attributes. Currently I have a working prototype for writing CL_FLOAT images, although I'm not ready to push it upstream yet. The current status is summarized in the following list.

  • Mesa is now able to set up compute texture and sampler resources (code). Most of the code was already present, only initialization and emission of the atoms responsible for sampler and texture resource (sampler view) state setup had to be added. I also had my mentor's code as a starting point.
  • The prototype libclc implementation uses the llvm.R600.tex intrinsic with hardcoded texture id, sampler and coord types (code).
  • No modification of llvm is required.

The image attribute getters have undergone a few modifications. The mechanism which assigns a compile-time constant ID to the kernel image arguments, which was part of the image attribute intrinsics replacer pass, has been factored out into a separate pass (patch). Simultaneously, the image attribute replacer pass has been deleted, since the reason of the pass was the compile-time constant ID generation. This way the attribute getters can be implement in libclc roughly as follows (see this patch for additional details):

get_image_attribute(get_image_id(image), attribute);

An additional benefit of this approach is that the image ID pass can be extended with different kind of IDs, e.g. resource IDs (like RAT ID) and sampler IDs.