Commit 666cf404 authored by Frederic

New doc info from @mrocklin

Parent 035ca639
@@ -610,16 +610,11 @@ Modify and execute to support *stride* (i.e. so as not constrain the input to be
 GPU Async capabilities
 ----------------------
 
-Since Theano 0.6, we started to use the asynchone capability of
-GPU. This allow to be faster, but some errors are raised later, at the
-wrong place. This mess with the profiling of Theano apply node.
-In both case, you can use the NVIDIA driver feature that when
-environment variable CUDA_LAUNCH_BLOCKING=1 is set, all kernal call
-get automatically syncronized. This will restore to the old beavior
-that provide good profiling and error message.
-
-This feature interact with Theano garbage collector of intermediate
-results. To get the most of this feature, you need to disable the gc
-as it insert synchronization point in the graph. Set the Theano flag
-allow_gc=False to get event faster speed! This will raise the memory
-usage.
+Ever since Theano 0.6 we started to use the asynchronous capability of
+GPUs. This allows us to be faster but with the possibility that some
+errors may be raised later than when they should occur. This can cause
+difficulties when profiling Theano apply nodes. There is a NVIDIA
+driver feature to help with these issues. If you set the environment
+variable CUDA_LAUNCH_BLOCKING=1 then all kernel calls will be
+automatically synchronized. This reduces performance but provides good
+profiling and appropriately placed error messages.
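For anyone trying the two settings mentioned in this hunk, the following is a minimal sketch of building the environment for a Theano run from Python. It assumes the Theano script is started as a child process so that the variables are set before the CUDA driver and Theano initialise. The helper names `profiling_env` and `fast_env` and the script name `train.py` are hypothetical. `CUDA_LAUNCH_BLOCKING` and `allow_gc` come from the documentation text, and `THEANO_FLAGS` is the environment variable Theano reads its flags from.

```python
import os


def profiling_env(base=None):
    """Return a copy of the environment with synchronous kernel launches.

    Slower, but errors and profiling results are attributed to the
    kernel call that actually caused them.
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def fast_env(base=None):
    """Return a copy of the environment tuned for asynchronous execution.

    Removes CUDA_LAUNCH_BLOCKING and sets allow_gc=False in THEANO_FLAGS,
    replacing any allow_gc entry that is already present. Memory usage
    is expected to rise because intermediate results are kept.
    """
    env = dict(os.environ if base is None else base)
    env.pop("CUDA_LAUNCH_BLOCKING", None)
    # THEANO_FLAGS is a comma-separated list of key=value pairs.
    flags = [
        item
        for item in env.get("THEANO_FLAGS", "").split(",")
        if item and not item.startswith("allow_gc=")
    ]
    flags.append("allow_gc=False")
    env["THEANO_FLAGS"] = ",".join(flags)
    return env


# Hypothetical usage (train.py is a placeholder script name):
#   import subprocess, sys
#   subprocess.run([sys.executable, "train.py"], env=profiling_env())
#   subprocess.run([sys.executable, "train.py"], env=fast_env())
```

Returning a modified copy rather than mutating `os.environ` keeps the debugging and fast configurations from leaking into each other when both are used from the same parent process.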