Or, more probably, of more than 32 KB.
I have the following code:
    function [memory] = __device__ memtest1()
        memory = zeros(2047,4)
        for i = 0..size(memory,0)-1
            for d = 0..3
                memory[i,d] = tuple_count(max(i,d),d)
            end
        end
    end

    function [memory] = __device__ memtest2()
        memory = zeros(2049,4)
        for i = 0..size(memory,0)-1
            for d = 0..3
                memory[i,d] = tuple_count(max(i,d),d)
            end
        end
    end

    function [] = main()
        memtest1()
        memtest2()
    end
The first call works just fine (tested by commenting out the second one), but the second one crashes to the desktop for me. Weird, as I would expect Quasar to simply execute this on the CPU when the memory footprint grows too large (which I assume happens anyway, given the relative simplicity of the functions).
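For what it's worth, the two matrix sizes straddle the 32 KB mark almost exactly (assuming the default 32-bit floating-point storage, i.e. 4 bytes per element). A quick back-of-the-envelope check, written here as a small Python sketch for illustration only:

```python
# Rough memory footprint of the zeros(rows, cols) allocations in
# memtest1/memtest2, assuming 4 bytes per element (32-bit float).
def footprint_bytes(rows, cols, elem_bytes=4):
    return rows * cols * elem_bytes

print(footprint_bytes(2047, 4))  # 32752 bytes -> just under 32 KB (32768)
print(footprint_bytes(2049, 4))  # 32784 bytes -> just over 32 KB
```

So memtest1 stays just below a 32 KB per-allocation limit, while memtest2 just exceeds it, which would explain why only the second call fails.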
On a related note, it should probably not crash to the desktop, but give a more graceful warning message that the memory exceeded a certain bound.
In the latest release of Quasar, it is possible to configure the maximum amount of memory that can be allocated from a kernel function: go to Redshift/Program settings/Runtime. I suggest reserving, for example, 128 MB of dynamic kernel memory with a maximum block size of 128 KB (corresponding to roughly 32766 elements in 32-bit floating-point precision, or roughly 16382 elements in 64-bit precision; note that there are also some overhead bytes per allocation).
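The element counts above follow directly from the block size once the per-allocation overhead is subtracted. A minimal sketch of that arithmetic, where the exact overhead value is an assumption (the thread only says "some overhead bytes"):

```python
# Estimate how many elements fit in a dynamic kernel memory block,
# given a maximum block size and an assumed per-allocation overhead.
BLOCK_SIZE = 128 * 1024  # 128 KB maximum block size
OVERHEAD = 8             # overhead bytes per allocation (assumed, not exact)

def max_elements(elem_bytes, block_size=BLOCK_SIZE, overhead=OVERHEAD):
    return (block_size - overhead) // elem_bytes

print(max_elements(4))   # 32-bit float: about 32766 elements
print(max_elements(8))   # 64-bit float: about 16383 elements
```

With an 8-byte overhead this gives 32766 elements in single precision, matching the figure quoted above; the double-precision count depends on the actual overhead, which is why the numbers are approximate.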
I will investigate the crash issue on Linux further; the error message should be: ((native function call) memtest1 – error from user-code: No allocator for object of the specified size!)