Illegal address error

tostoop asked 8 years ago

The following code gives an IllegalAddressError when executing on a GPU. The commented code is my workaround. I’d love to know what the specific reason behind this error is as the description doesn’t give much explanation.


max_GPU = 2048
blk_size = 128

function [] = __device__ fsub_col_unit(A: mat'const'unchecked, b: vec'unchecked, i_start:int, j_start: int, n: int)
    for j = 0..n-2
        for i = j+1..n-1
            b[i_start+i] -= A[i_start+i,j_start+j]*b[i_start+j]
        end    
    end
end

function [] = __kernel__ trsv(A: mat'unchecked, i_start:int, j_start: int, k: int, jb: int, pos: ivec1)
    temp = A[:, j_start+k+jb+pos]
    fsub_col_unit(A, temp, i_start, j_start, jb)
    A[:, j_start+k+jb+pos] = temp
end

function [] = f(A: mat'unchecked, i_start:int, j_start: int, cols: int)
%    if cols-blk_size > max_GPU
%        for k = 0..max_GPU..cols-blk_size-1
%            jb2 = min(max_GPU, cols-blk_size-k)
%            parallel_do(jb2, A, i_start, j_start,k, blk_size, trsv)
%        end
%    else
        parallel_do(cols-blk_size, A, 0, 0, 0, blk_size, trsv)
%    endif      
end

dim = 2200
A = rand(dim,dim)

f(A, 0,0,dim)
1 Answer
bgoossen answered 8 years ago

Most likely, because each thread copies a vector of length 2200 into the variable temp, and because of memory alignment, the kernel uses up all of the available dynamic kernel memory, so a memory allocation fails inside the kernel function. See section 8.3 of the quick reference manual.
In this case, you can easily refactor the code to not use any dynamic kernel memory at all (which is also much faster).
I will check if I can get a more informative error message, although for these things the CUDA error handling mechanism does not allow a lot of flexibility…
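
The refactoring suggested above could look roughly like the sketch below. It is an illustration, not code from the original answer. It assumes that pos[0] yields the scalar thread index of the ivec1 position, and it relies on the fact that each thread writes only its own column (j_start+k+jb+pos), while the columns it reads (j_start up to j_start+jb-2) are never written by any thread. Under those assumptions the forward substitution can run in place on A, so no per-thread copy (temp) is needed.

```
% Sketch: in-place variant of trsv without dynamic kernel memory.
% Assumption: pos[0] gives the scalar index of this thread.
function [] = __kernel__ trsv(A: mat'unchecked, i_start:int, j_start: int, k: int, jb: int, pos: ivec1)
    col = j_start+k+jb+pos[0]    % column owned by this thread
    for j = 0..jb-2
        for i = j+1..jb-1
            % same update as in fsub_col_unit, applied directly to column col of A
            A[i_start+i,col] -= A[i_start+i,j_start+j]*A[i_start+j,col]
        end
    end
end
```

The call in f stays the same: parallel_do(cols-blk_size, A, 0, 0, 0, blk_size, trsv). Because the kernel no longer allocates a length-2200 vector per thread, the chunked workaround with max_GPU should not be needed for this particular failure mode, though that has not been verified here.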