Functions in Quasar

Hurray! Today I found some of my old notes about Quasar, written about one year ago. Since I forget everything, I thought it could be useful to put it here.

FunctionsDiagram

This diagram is quite essential, if there are some elements you don’t fully understand, please have a look at the reference manual.

Summarized:

  • Both __kernel__ and __device__ functions are low-level functions, they are natively compiled for CPU and/or GPU. This has the practical consequence that the functionality available for these functions is restricted. It is for example not possible to print, load or save information inside kernel or device functions.
  • Host functions are high-level functions, typically they are interpreted (or Quasar EXE’s, compiled using the just-in-time compiler).
  • A kernel function is normally repeated for every element of a matrix. Kernel functions can only be called from host code (although in future support for CUDA 5.0 dynamic parallelism, this may change).
  • A device function can be called from host code, in which case it is normally interpreted (if not inlined), or from other device/kernel functions, in which case it is natively compiled.

The distinction between these three types of functions is necessary to allow GPU programming. Furthermore, it provides a mechanism (to some extent) to balance the work between CPU/GPU. As programmer, you know whether the code inside the function will be run on GPU/CPU.