SCHED_FIFO CUDA issue

Hi,

I am encountering a bug when I use the SCHED_FIFO scheduling policy for a forked process and/or a thread, pin that process to a specific CPU, and then try to allocate memory for CUDA. The following program reproduces the bug:

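#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <sched.h>      // g++ defines _GNU_SOURCE by default, which the CPU_* macros need
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cuda_runtime.h>

using namespace std;

// The sched_* calls return -1 and set errno on failure, so errno is saved and restored instead of being overwritten with the return value.
#define tryExit(var, msg) if(var != 0) { int saved = errno; printf("-> %d <-\n",var); errno = saved; perror(msg); exit(EXIT_FAILURE); }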

void child_func() {
    int status = 0;
    pid_t self = getpid();
    struct sched_param sp;
    sp.sched_priority = sched_get_priority_min(SCHED_FIFO);
    status = sched_setscheduler(self, SCHED_FIFO, &sp);
    tryExit(status, "child - sched_setscheduler ");
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(0, &cpuset); // whatever number between 0-3
    status = sched_setaffinity(self, sizeof(cpu_set_t), &cpuset); 
    tryExit(status, "child - sched_setaffinity ");
    
    cout << "Start" << endl;
    char *ptr = NULL;
    cudaError_t cstatus = cudaHostAlloc((void**)&ptr, 12, cudaHostAllocMapped);
    cout << cudaGetErrorString(cstatus) << endl;
    if(cstatus == cudaSuccess) cudaFreeHost(ptr); // only free what was actually allocated
}
int main(int argc, char** argv) {
    pid_t child = fork();
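    if(child == -1) { perror("fork"); return EXIT_FAILURE; }
    if(child == 0) { child_func(); return 0; } // the child must not fall through into the wait loop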
    
    int status = 0;
    pid_t wpid;
    while ((wpid = wait(&status)) > 0 || (wpid == -1 && errno == EINTR));
    if(wpid == -1 && errno != ECHILD) // ECHILD only means there are no children left to wait for
        perror("wait");
    
    return 0;
}

When I run it, it hangs in cudaHostAlloc. In htop I can see that the program is still running and that its memory usage keeps growing until there is no more memory available on the device.
If I remove the sched_setaffinity call, it works. It also works if I use another real-time scheduling policy, e.g. SCHED_RR.
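
To double-check that the policy and the affinity really are in effect at the moment of the allocation, a small helper along these lines could be called just before cudaHostAlloc. This is only a sketch: policy_name and affinity_list are names made up for illustration, not part of any library, and they only use the standard sched_getscheduler / sched_getaffinity calls.

```cpp
#include <sched.h>
#include <sys/types.h>
#include <cstdio>
#include <string>

// Hypothetical helper: turn a scheduling policy constant into a readable name.
const char* policy_name(int policy) {
    switch (policy) {
        case SCHED_OTHER: return "SCHED_OTHER";
        case SCHED_FIFO:  return "SCHED_FIFO";
        case SCHED_RR:    return "SCHED_RR";
        default:          return "unknown";
    }
}

// Hypothetical helper: comma-separated list of the CPUs that `pid` may run on
// (pid 0 means the caller). Returns an empty string if the query fails.
std::string affinity_list(pid_t pid) {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(pid, sizeof(set), &set) != 0) return "";
    std::string out;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &set)) {
            if (!out.empty()) out += ",";
            out += std::to_string(cpu);
        }
    }
    return out;
}

// Print the caller's current policy and affinity on one line.
void print_sched_state() {
    std::printf("policy=%s cpus=%s\n",
                policy_name(sched_getscheduler(0)),
                affinity_list(0).c_str());
}
```

Calling print_sched_state() right after the sched_setaffinity call in child_func would show whether the child is really SCHED_FIFO and restricted to the chosen CPU when the hang occurs.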

Could it be a driver bug? A CUDA bug? A kernel bug?
Any ideas?

What exact software versions of things are you talking about?

Oops, sorry, I forgot to include that in the environment description.
I am using the Ubuntu image provided by Toradex with JetPack installed.

$ uname -a
Linux tegra-ubuntu 3.10.40-2.8.6+g2c7a3c3af726 #1 SMP PREEMPT Mon Apr 1 09:54:31 UTC 2019 armv7l armv7l armv7l GNU/Linux

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2014 NVIDIA Corporation
Built on Tue_Feb_17_22:53:16_CST_2015
Cuda compilation tools, release 6.5, V6.5.45

$ gcc --version
gcc (Ubuntu/Linaro 4.8.4-2ubuntu1~14.04.4) 4.8.4

Benjamin.

Most likely what you are seeing is related to the behaviour discussed here:

https://stackoverflow.com/questions/13596337/cudadevicesynchronise-spawns-new-thread-even-when-set-to-blocking

I don’t think there is any easy way around it, sorry.
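
One experiment that might still be worth a try, untested on the TK1 and purely an assumption on my side: Linux (since 2.6.32) accepts a SCHED_RESET_ON_FORK flag ORed into the policy passed to sched_setscheduler, so that children do not inherit the real-time policy. The flag is documented for fork(2) children; whether it also helps with the threads the CUDA runtime creates internally is a guess. The sketch below only shows the flag itself and checks its effect on a forked child; the function names are made up for illustration.

```cpp
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>
#include <cstdio>

// Hypothetical helper: request SCHED_FIFO for the caller, but ask the kernel
// not to pass the real-time policy on to children.
// Returns 0 on success, -1 with errno set otherwise (e.g. EPERM without privileges).
int set_fifo_reset_on_fork() {
    struct sched_param sp;
    sp.sched_priority = sched_get_priority_min(SCHED_FIFO);
    return sched_setscheduler(0, SCHED_FIFO | SCHED_RESET_ON_FORK, &sp);
}

// Hypothetical helper: fork a child and return the scheduling policy the child
// observes for itself, or -1 if fork/waitpid fails.
int policy_seen_by_forked_child() {
    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) {
        // Child: report its own policy through the exit status.
        _exit(sched_getscheduler(0) & 0xff);
    }
    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r == -1 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

If set_fifo_reset_on_fork() succeeds, policy_seen_by_forked_child() should report SCHED_OTHER rather than SCHED_FIFO. Whether that is enough to avoid the hang in cudaHostAlloc would have to be verified on the actual board.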

Thanks for the pointer. So it seems that the problem lies in the CPU thread management of the CUDA runtime. Do you know whether it is possible to update the CUDA runtime/driver or JetPack for the TK1?

I believe with CUDA 6.5.53 from JetPack 3.1 we are already using the latest available for TK1.