Commit bb170f4f, authored by Arnaud Bergeron

Fix random segfault on exit when the new backend is in use.

The old backend assumes it is alone in the universe and, even when not in use, will interact with the GPU. Most notably, it forcibly destroys any primary context on exit. We also use the primary context, and our cleanup handlers run after those of the old backend. This is a major problem because cuBLAS uses the runtime API, which will create a context whenever it thinks it needs one (for cudaFree, for example). All of our allocations, however, belong to the context that the old backend destroyed, so when cuBLAS tries to clean up its resources the low-level handlers get confused and we end up with a segmentation fault.

As for why it doesn't always happen, my guess is that sometimes we get lucky and the addresses line up. In any case, if we stop the old backend from calling cudaThreadExit(), the segfaults go away. This may leak a very small amount of resources for the few seconds we keep running before exit, but I don't think this will be a problem.
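The ordering hazard described above can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions: it makes no CUDA calls, every name in it (`live_context`, `ModelBuffer`, `model_alloc`, `old_backend_shutdown`, `model_free`, `run_exit_sequence`) is hypothetical, and a mismatch that would crash in the real program is reported here as a return code.

```c
#include <assert.h>

/* Toy model only: no CUDA calls are made, and all names are hypothetical. */

/* Generation number of the live "primary context"; 0 means none exists. */
static int live_context = 0;
static int next_generation = 1;

/* An allocation remembers which context generation produced it. */
typedef struct {
    int owner_context;
} ModelBuffer;

/* Mimics the runtime API lazily creating a context when it needs one. */
static int ensure_context(void) {
    if (live_context == 0) {
        live_context = next_generation++;
    }
    return live_context;
}

/* Mimics an allocation made in the shared primary context. */
static ModelBuffer model_alloc(void) {
    ModelBuffer buf;
    buf.owner_context = ensure_context();
    return buf;
}

/* Mimics the old backend's shutdown hook. With destroy_context != 0 it
 * behaves like the code before this commit (the context is torn down);
 * with destroy_context == 0 it behaves like the patched code. */
static void old_backend_shutdown(int destroy_context) {
    if (destroy_context) {
        live_context = 0;
    }
}

/* Mimics a cleanup handler that runs later and frees its buffer.
 * Returns 0 on success, -1 if the buffer belongs to a dead context. */
static int model_free(const ModelBuffer *buf) {
    int ctx = ensure_context();  /* may silently create a fresh context */
    assert(ctx != 0);
    return (buf->owner_context == ctx) ? 0 : -1;
}

/* Runs one exit sequence and reports the result of the late free. */
int run_exit_sequence(int destroy_context) {
    live_context = 0;  /* start each run with no context */
    ModelBuffer buf = model_alloc();
    old_backend_shutdown(destroy_context);
    return model_free(&buf);
}
```

In this model, `run_exit_sequence(1)` returns -1 because the late free sees a freshly created context that does not own the buffer, while `run_exit_sequence(0)` returns 0 because the original context is still alive.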
Parent e8f9541d
...
@@ -3310,7 +3310,6 @@ CudaNdarray_gpu_shutdown(PyObject* _unused, PyObject* _unused_args) {
 }
 }
 }
-cudaThreadExit();
 Py_INCREF(Py_None);
 return Py_None;
...