Functions in Quasar
Hurray! Today I found some of my old notes about Quasar, written about a year ago. Since I forget everything, I thought it could be useful to put them here.
This diagram is quite essential. If there are some elements you don’t fully understand, please have a look at the reference manual.
Summarized:
- Both `__kernel__` and `__device__` functions are low-level functions; they are natively compiled for CPU and/or GPU. This has the practical consequence that the functionality available for these functions is restricted. It is for example not possible to `print`, `load` or `save` information inside kernel or device functions.
- Host functions are high-level functions; typically they are interpreted (or, for Quasar EXEs, compiled using the just-in-time compiler).
- A kernel function is normally repeated for every element of a matrix. Kernel functions can only be called from host code (although this may change with future support for CUDA 5.0 dynamic parallelism).
- A device function can be called from host code, in which case it is normally interpreted (if not inlined), or from other device/kernel functions, in which case it is natively compiled.
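The points above can be sketched with one function of each type. This is a minimal sketch written from memory of the Quasar syntax, so please check the details against the reference manual. The names `square`, `square_all` and `main`, and the sample data, are made up for illustration.

```quasar
% Device function: low-level. It is natively compiled when called from
% a kernel or another device function (hypothetical example name).
function y = __device__ square(x : scalar)
    y = x * x
end

% Kernel function: low-level. It is executed once for every element;
% pos is the index of the element handled by this invocation.
% No print/load/save is allowed in here.
function [] = __kernel__ square_all(x : vec, y : vec, pos : int)
    y[pos] = square(x[pos])
end

% Host function: high-level code. Only host code can launch a kernel.
function [] = main()
    x = [1, 2, 3, 4]
    y = zeros(size(x))
    % Run square_all for each of the numel(x) elements of x
    parallel_do(numel(x), x, y, square_all)
    % Printing is fine here, because this is host code
    print y
end
```

Calling `square` directly from `main` would also be allowed. In that case it would normally be interpreted (if not inlined) rather than natively compiled.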
The distinction between these three types of functions is necessary to allow GPU programming. Furthermore, it provides a mechanism (to some extent) to balance the work between CPU and GPU. As a programmer, you know whether the code inside a function will run on the GPU or the CPU.