OpenCL And Function Pointers

Introduction

Quasar has a nice mechanism to compose algorithms in a generic way, based on function types.

For example, we can define a function that reads an RGB color value from an image as follows:

    RGB_color_image = __device__ (pos : ivec2) -> img[pos[0], pos[1], 0..2]

Now suppose we are dealing with images in some imaginary Yab color space where

    Y = R+2*G+B        G = (Y +   a + b)/4
    a = G - R     or   R = (Y - 3*a + b)/4
    b = G - B          B = (Y +   a - 3*b)/4

we can define a similar read-out function that automatically converts Yab to RGB:

    RGB_Yab_image = __device__ (pos : ivec2) -> 
        [dotprod(img[pos[0],pos[1],0..2], [1,-3/4,1/4]),
         dotprod(img[pos[0],pos[1],0..2], [1/4,1/4,1/4]),
         dotprod(img[pos[0],pos[1],0..2], [1,1,-3/4])]

Consequently, both functions have the same type:

    RGB_color_image : [__device__ ivec2 -> vec3]
    RGB_Yab_image   : [__device__ ivec2 -> vec3]

Then we can build an algorithm that works both with RGB and Yab images:

    function [] = __kernel__ brightness_contrast_enhancer(
        brightness :    scalar, 
        contrast : scalar, 
        x : [__device__ ivec2 -> vec3], 
        y : cube,
        pos : ivec2)        

        y[pos[0],pos[1],0..2] = x(pos)*contrast + brightness
    end

    match input_fmt with
    | "RGB" ->
        fn_ptr = RGB_color_image
    | "Yab" ->
        fn_ptr = RGB_Yab_image
    | _ ->
        error("Sorry, input format is currently not defined")
    end

    y = zeros(512,512,3)
    parallel_do(size(y),brightness,contrast,fn_ptr,y,brightness_contrast_enhancer)

Although this approach is very convenient, and allows also different algorithms to be constructed easily (for example for YCbCr, Lab color spaces), there are a number of disadvantages:

  1. The C-implementation typically requires making use of function pointers. However, OpenCL currently does not support function pointers, so this kind of programs can not be executed on OpenCL-capable hardware.
  2. Although CUDA supports function pointers, in some circumstances they result in an internal compiler error (NVCC bug). These cases are very hard to fix.
  3. In CUDA, kernel functions that use function pointers may be 2x slower than the same code without function pointers (e.g. by inlining the function).

Manual solution

The (manual) solution is to use function specialization:

    match input_fmt with
    | "RGB" ->
        kernel_fn = $specialize(brightness_contrast_enhancer, fn_ptr==RGB_color_image)
    | "Yab" ->
        kernel_fn = $specialize(brightness_contrast_enhancer, fn_ptr==RGB_Yab_image)
    | _ ->
        error("Sorry, input format is currently not defined")
    end

    y = zeros(512,512,3)
    parallel_do(size(y),brightness,contrast,y,kernel_fn)

Here, the function brightness_contrast_enhancer is specialized using both RGB_color_image and RGB_Yab_image respectively. These functions are then simply substituted into the kernel function code, effectively eliminating the function pointers.

Automatic solution

The Quasar compiler now has an option UseFunctionPointers, which can have the following values:

  • Always: function pointers are always used (causes more compact code to be generated)
  • SmartlyAvoid: avoid function pointers where possible (less compact code)
  • Error: generate an error if a function pointer cannot be avoided.

In the example of the manual solution, the function pointer cannot be avoided. However, when rewriting the code block as follows:

    y = zeros(512,512,3)
    match input_fmt with
    | "RGB" ->
        parallel_do(size(y),brightness,contrast,RGB_color_image,y,brightness_contrast_enhancer)
    | "Yab" ->
        parallel_do(size(y),brightness,contrast,RGB_Yab_image,y,brightness_contrast_enhancer)
    | _ ->
        error("Sorry, input format is currently not defined")
    end

The compiler is able to automatically specialize the function brightness_contrast_enhancer for RGB_color_image and RGB_Yab_image (Avoid and Error modes).