Commit c10aa585, authored by Frédéric Bastien

Merge pull request #4366 from nouiz/cudnn_init

cudnn v5 cleanup init, small test fix.
...@@ -10,7 +10,7 @@ Theano 0.7 (26th of March, 2015) ...@@ -10,7 +10,7 @@ Theano 0.7 (26th of March, 2015)
We recommand to everyone to upgrade to this version. We recommand to everyone to upgrade to this version.
Highlights: Highlights:
* Integration of CuDNN for 2D convolutions and pooling on supported GPUs * Integration of cuDNN for 2D convolutions and pooling on supported GPUs
* Too many optimizations and new features to count * Too many optimizations and new features to count
* Various fixes and improvements to scan * Various fixes and improvements to scan
* Better support for GPU on Windows * Better support for GPU on Windows
......
...@@ -10,7 +10,7 @@ We recommend that everybody update to this version. ...@@ -10,7 +10,7 @@ We recommend that everybody update to this version.
Highlights: Highlights:
- Python 2 and 3 support with the same code base - Python 2 and 3 support with the same code base
- Faster optimization - Faster optimization
- Integration of CuDNN for better GPU performance - Integration of cuDNN for better GPU performance
- Many Scan improvements (execution speed up, ...) - Many Scan improvements (execution speed up, ...)
- optimizer=fast_compile moves computation to the GPU. - optimizer=fast_compile moves computation to the GPU.
- Better convolution on CPU and GPU. (CorrMM, cudnn, 3d conv, more parameter) - Better convolution on CPU and GPU. (CorrMM, cudnn, 3d conv, more parameter)
......
...@@ -235,7 +235,7 @@ CPU and GPU memory usage. ...@@ -235,7 +235,7 @@ CPU and GPU memory usage.
Could speed up and lower memory usage: Could speed up and lower memory usage:
- :ref:`CuDNN <libdoc_cuda_dnn>` default CuDNN convolution use less - :ref:`cuDNN <libdoc_cuda_dnn>` default cuDNN convolution use less
memory then Theano version. But some flags allow it to use more memory then Theano version. But some flags allow it to use more
memory. GPU only. memory. GPU only.
- Shortly avail, multi-GPU. - Shortly avail, multi-GPU.
......
...@@ -25,7 +25,7 @@ News ...@@ -25,7 +25,7 @@ News
* Multi-GPU. * Multi-GPU.
* We added support for :ref:`CuDNN v4 <libdoc_cuda_dnn>`. * We added support for :ref:`CuDNN v5 <libdoc_cuda_dnn>`.
* We added support for :attr:`CNMeM <config.lib.cnmem>` to speed up * We added support for :attr:`CNMeM <config.lib.cnmem>` to speed up
the GPU memory allocation. the GPU memory allocation.
......
...@@ -41,22 +41,22 @@ Theano will still work if the user did not introduce them manually. ...@@ -41,22 +41,22 @@ Theano will still work if the user did not introduce them manually.
The recently added Theano flag :attr:`dnn.enabled The recently added Theano flag :attr:`dnn.enabled
<config.dnn.enabled>` allows to change the default behavior to force <config.dnn.enabled>` allows to change the default behavior to force
it or disable it. Older Theano version do not support this flag. To it or disable it. Older Theano version do not support this flag. To
get an error when CuDNN can not be used with them, use this flag: get an error when cuDNN can not be used with them, use this flag:
``optimizer_including=cudnn``. ``optimizer_including=cudnn``.
.. note:: .. note::
CuDNN v3 has now been released. CuDNN v2 remains supported but CuDNN v3 is cuDNN v3 has now been released. cuDNN v2 remains supported but cuDNN v3 is
faster and offers many more options. We recommend that everybody update to faster and offers many more options. We recommend that everybody update to
v3. v3.
.. note:: .. note::
Starting in CuDNN v3, multiple convolution implementations are offered and Starting in cuDNN v3, multiple convolution implementations are offered and
it is possible to use heuristics to automatically choose a convolution it is possible to use heuristics to automatically choose a convolution
implementation well suited to the parameters of the convolution. implementation well suited to the parameters of the convolution.
The Theano flag ``dnn.conv.algo_fwd`` allows to specify the CuDNN The Theano flag ``dnn.conv.algo_fwd`` allows to specify the cuDNN
convolution implementation that Theano should use for forward convolutions. convolution implementation that Theano should use for forward convolutions.
Possible values include : Possible values include :
...@@ -69,20 +69,20 @@ get an error when CuDNN can not be used with them, use this flag: ...@@ -69,20 +69,20 @@ get an error when CuDNN can not be used with them, use this flag:
* ``fft_tiling`` : use the Fast Fourrier Transform implementation of convolution * ``fft_tiling`` : use the Fast Fourrier Transform implementation of convolution
with tiling (high memory usage, but less then fft) with tiling (high memory usage, but less then fft)
* ``guess_once`` : the first time a convolution is executed, the * ``guess_once`` : the first time a convolution is executed, the
implementation to use is chosen according to CuDNN's heuristics and reused implementation to use is chosen according to cuDNN's heuristics and reused
for every subsequent execution of the convolution. for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution * ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution. don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution * ``time_once`` : the first time a convolution is executed, every convolution
implementation offered by CuDNN is executed and timed. The fastest is implementation offered by cuDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution. reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution * ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution. don't match the shapes from the last execution.
The Theano flag ``dnn.conv.algo_bwd_filter`` and The Theano flag ``dnn.conv.algo_bwd_filter`` and
``dnn.conv.algo_bwd_data`` allows to specify the CuDNN ``dnn.conv.algo_bwd_data`` allows to specify the cuDNN
convolution implementation that Theano should use for gradient convolution implementation that Theano should use for gradient
convolutions. Possible values include : convolutions. Possible values include :
...@@ -92,13 +92,13 @@ get an error when CuDNN can not be used with them, use this flag: ...@@ -92,13 +92,13 @@ get an error when CuDNN can not be used with them, use this flag:
* ``fft`` : use the Fast Fourrier Transform implementation of convolution * ``fft`` : use the Fast Fourrier Transform implementation of convolution
(very high memory usage) (very high memory usage)
* ``guess_once`` : the first time a convolution is executed, the * ``guess_once`` : the first time a convolution is executed, the
implementation to use is chosen according to CuDNN's heuristics and reused implementation to use is chosen according to cuDNN's heuristics and reused
for every subsequent execution of the convolution. for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution * ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution. don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution * ``time_once`` : the first time a convolution is executed, every convolution
implementation offered by CuDNN is executed and timed. The fastest is implementation offered by cuDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution. reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution * ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels implementation selected every time the shapes of the inputs and kernels
......
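A usage sketch for the flags documented in this file. Theano reads comma-separated `key=value` settings from the `THEANO_FLAGS` environment variable when it is imported, so the variable has to be set first. The helper name `build_theano_flags` is hypothetical, and the values chosen are only examples taken from the lists above.

```python
import os


def build_theano_flags(settings):
    """Join (name, value) pairs into a THEANO_FLAGS string (hypothetical helper)."""
    return ",".join("%s=%s" % (name, value) for name, value in settings)


flags = build_theano_flags([
    ("dnn.enabled", "True"),                     # raise an error if cuDNN is unusable
    ("dnn.conv.algo_fwd", "time_once"),          # time every forward algorithm once
    ("dnn.conv.algo_bwd_filter", "guess_once"),  # heuristic choice, made once
    ("dnn.conv.algo_bwd_data", "guess_once"),
])

# Set this before `import theano`; the configuration is read at import time.
os.environ["THEANO_FLAGS"] = flags
```

The same string can also be passed on the command line, for example as `THEANO_FLAGS=... python script.py`.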
...@@ -43,17 +43,17 @@ To get an error if Theano can not use cuDNN, use this Theano flag: ...@@ -43,17 +43,17 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
.. note:: .. note::
CuDNN v3 has now been released. CuDNN v2 remains supported but CuDNN v3 is cuDNN v3 has now been released. cuDNN v2 remains supported but cuDNN v3 is
faster and offers many more options. We recommend that everybody update to faster and offers many more options. We recommend that everybody update to
v3. v3.
.. note:: .. note::
Starting in CuDNN v3, multiple convolution implementations are offered and Starting in cuDNN v3, multiple convolution implementations are offered and
it is possible to use heuristics to automatically choose a convolution it is possible to use heuristics to automatically choose a convolution
implementation well suited to the parameters of the convolution. implementation well suited to the parameters of the convolution.
The Theano flag ``dnn.conv.algo_fwd`` allows to specify the CuDNN The Theano flag ``dnn.conv.algo_fwd`` allows to specify the cuDNN
convolution implementation that Theano should use for forward convolutions. convolution implementation that Theano should use for forward convolutions.
Possible values include : Possible values include :
...@@ -64,19 +64,19 @@ To get an error if Theano can not use cuDNN, use this Theano flag: ...@@ -64,19 +64,19 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
* ``fft`` : use the Fast Fourrier Transform implementation of convolution * ``fft`` : use the Fast Fourrier Transform implementation of convolution
(very high memory usage) (very high memory usage)
* ``guess_once`` : the first time a convolution is executed, the * ``guess_once`` : the first time a convolution is executed, the
implementation to use is chosen according to CuDNN's heuristics and reused implementation to use is chosen according to cuDNN's heuristics and reused
for every subsequent execution of the convolution. for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution * ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution. don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution * ``time_once`` : the first time a convolution is executed, every convolution
implementation offered by CuDNN is executed and timed. The fastest is implementation offered by cuDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution. reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution * ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution. don't match the shapes from the last execution.
The Theano flag ``dnn.conv.algo_bwd`` allows to specify the CuDNN The Theano flag ``dnn.conv.algo_bwd`` allows to specify the cuDNN
convolution implementation that Theano should use for gradient convolutions. convolution implementation that Theano should use for gradient convolutions.
Possible values include : Possible values include :
...@@ -86,13 +86,13 @@ To get an error if Theano can not use cuDNN, use this Theano flag: ...@@ -86,13 +86,13 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
* ``fft`` : use the Fast Fourrier Transform implementation of convolution * ``fft`` : use the Fast Fourrier Transform implementation of convolution
(very high memory usage) (very high memory usage)
* ``guess_once`` : the first time a convolution is executed, the * ``guess_once`` : the first time a convolution is executed, the
implementation to use is chosen according to CuDNN's heuristics and reused implementation to use is chosen according to cuDNN's heuristics and reused
for every subsequent execution of the convolution. for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution * ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution. don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution * ``time_once`` : the first time a convolution is executed, every convolution
implementation offered by CuDNN is executed and timed. The fastest is implementation offered by cuDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution. reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution * ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels implementation selected every time the shapes of the inputs and kernels
......
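The difference between the `*_once` and `*_on_shape_change` values described above can be sketched in plain Python. This is an illustration of the caching policy only, not Theano's implementation: the class `AlgoSelector` and the stand-in `fake_choose` are hypothetical, and the real selection is done by cuDNN's heuristics or timing.

```python
class AlgoSelector(object):
    """Cache a convolution algorithm choice (illustrative, not Theano's code)."""

    def __init__(self, choose, on_shape_change=False):
        self.choose = choose                    # callable(shapes) -> algorithm name
        self.on_shape_change = on_shape_change
        self.algo = None
        self.last_shapes = None

    def get(self, img_shape, kern_shape):
        shapes = (tuple(img_shape), tuple(kern_shape))
        first_call = self.algo is None
        shapes_changed = self.on_shape_change and shapes != self.last_shapes
        if first_call or shapes_changed:
            self.algo = self.choose(shapes)
            self.last_shapes = shapes
        return self.algo


calls = []


def fake_choose(shapes):
    # Stand-in for cuDNN's heuristics or timing; records each selection.
    calls.append(shapes)
    return "fft" if shapes[1][-1] > 5 else "small"


once = AlgoSelector(fake_choose)                             # *_once
per_shape = AlgoSelector(fake_choose, on_shape_change=True)  # *_on_shape_change

once.get((8, 3, 32, 32), (16, 3, 3, 3))
once.get((8, 3, 64, 64), (16, 3, 7, 7))        # reuses the first choice
per_shape.get((8, 3, 32, 32), (16, 3, 3, 3))
per_shape.get((8, 3, 64, 64), (16, 3, 7, 7))   # shapes differ: selects again
```

With `*_once` the first choice is kept even if later inputs would be better served by another implementation. `*_on_shape_change` pays the selection cost again whenever the shapes differ from the previous call.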
...@@ -309,25 +309,25 @@ AddConfigVar('dnn.conv.algo_bwd', ...@@ -309,25 +309,25 @@ AddConfigVar('dnn.conv.algo_bwd',
in_c_key=False) in_c_key=False)
AddConfigVar('dnn.conv.algo_fwd', AddConfigVar('dnn.conv.algo_fwd',
"Default implementation to use for CuDNN forward convolution.", "Default implementation to use for cuDNN forward convolution.",
EnumStr(*SUPPORTED_DNN_CONV_ALGO_FWD), EnumStr(*SUPPORTED_DNN_CONV_ALGO_FWD),
in_c_key=False) in_c_key=False)
AddConfigVar('dnn.conv.algo_bwd_data', AddConfigVar('dnn.conv.algo_bwd_data',
"Default implementation to use for CuDNN backward convolution to " "Default implementation to use for cuDNN backward convolution to "
"get the gradients of the convolution with regard to the inputs.", "get the gradients of the convolution with regard to the inputs.",
EnumStr(*SUPPORTED_DNN_CONV_ALGO_BWD_DATA), EnumStr(*SUPPORTED_DNN_CONV_ALGO_BWD_DATA),
in_c_key=False) in_c_key=False)
AddConfigVar('dnn.conv.algo_bwd_filter', AddConfigVar('dnn.conv.algo_bwd_filter',
"Default implementation to use for CuDNN backward convolution to " "Default implementation to use for cuDNN backward convolution to "
"get the gradients of the convolution with regard to the " "get the gradients of the convolution with regard to the "
"filters.", "filters.",
EnumStr(*SUPPORTED_DNN_CONV_ALGO_BWD_FILTER), EnumStr(*SUPPORTED_DNN_CONV_ALGO_BWD_FILTER),
in_c_key=False) in_c_key=False)
AddConfigVar('dnn.conv.precision', AddConfigVar('dnn.conv.precision',
"Default data precision to use for the computation in CuDNN " "Default data precision to use for the computation in cuDNN "
"convolutions (defaults to the same dtype as the inputs of the " "convolutions (defaults to the same dtype as the inputs of the "
"convolutions).", "convolutions).",
EnumStr('as_input', 'float16', 'float32', 'float64'), EnumStr('as_input', 'float16', 'float32', 'float64'),
...@@ -350,9 +350,9 @@ AddConfigVar('dnn.library_path', ...@@ -350,9 +350,9 @@ AddConfigVar('dnn.library_path',
StrParam(default_dnn_path('lib' if sys.platform == 'darwin' else 'lib64'))) StrParam(default_dnn_path('lib' if sys.platform == 'darwin' else 'lib64')))
AddConfigVar('dnn.enabled', AddConfigVar('dnn.enabled',
"'auto', use CuDNN if available, but silently fall back" "'auto', use cuDNN if available, but silently fall back"
" to not using it if not present." " to not using it if not present."
" If True and CuDNN can not be used, raise an error." " If True and cuDNN can not be used, raise an error."
" If False, disable cudnn", " If False, disable cudnn",
StrParam("auto", "True", "False"), StrParam("auto", "True", "False"),
in_c_key=False) in_c_key=False)
......
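The three values of `dnn.enabled` described in the help string above can be summarised with a small stand-alone function. This is a sketch under assumptions: `resolve_dnn` and its parameters are hypothetical names, and `usable` stands for the result of whatever availability check Theano performs.

```python
def resolve_dnn(enabled, usable, msg=""):
    """Return whether cuDNN should be used for a given dnn.enabled value."""
    if enabled == "False":
        return False                   # explicitly disabled
    if enabled == "True" and not usable:
        raise RuntimeError(
            "You enabled cuDNN, but we aren't able to use it: %s" % msg)
    return usable                      # 'auto' falls back silently
```

For example, `resolve_dnn("auto", False)` returns `False` without complaint, while `resolve_dnn("True", False, "CUDA not available")` raises.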
...@@ -270,14 +270,14 @@ from theano.sandbox.cuda.type import CudaNdarrayType ...@@ -270,14 +270,14 @@ from theano.sandbox.cuda.type import CudaNdarrayType
def dnn_available(): def dnn_available():
if config.dnn.enabled == "False": if config.dnn.enabled == "False":
dnn_available.avail = False dnn_available.avail = False
dnn_available.msg = "disabled by dnn.enabled flag" dnn_available.msg = "Disabled by dnn.enabled flag"
if dnn_available.avail is None and not cuda_available: if dnn_available.avail is None and not cuda_available:
dnn_available.msg = "CUDA not available" dnn_available.msg = "CUDA not available"
dnn_available.avail = False dnn_available.avail = False
elif dnn_available.avail is None: elif dnn_available.avail is None:
dev = active_device_number() dev = active_device_number()
if device_properties(dev)['major'] < 3: if device_properties(dev)['major'] < 3:
dnn_available.msg = "Device not supported by cuDNN" dnn_available.msg = "Device not supported"
dnn_available.avail = False dnn_available.avail = False
else: else:
preambule = """ preambule = """
...@@ -315,7 +315,7 @@ if ((err = cudnnCreate(&_handle)) != CUDNN_STATUS_SUCCESS) { ...@@ -315,7 +315,7 @@ if ((err = cudnnCreate(&_handle)) != CUDNN_STATUS_SUCCESS) {
dnn_available.avail = comp dnn_available.avail = comp
if not dnn_available.avail: if not dnn_available.avail:
dnn_available.msg = ( dnn_available.msg = (
"Theano can not compile with cuDNN. We got this error:\n" + "Can not compile with cuDNN. We got this error:\n" +
str(err)) str(err))
else: else:
# If we can compile, check that we can import and run. # If we can compile, check that we can import and run.
...@@ -326,18 +326,17 @@ if ((err = cudnnCreate(&_handle)) != CUDNN_STATUS_SUCCESS) { ...@@ -326,18 +326,17 @@ if ((err = cudnnCreate(&_handle)) != CUDNN_STATUS_SUCCESS) {
" from one version, but we link with" " from one version, but we link with"
" a different version %s" % str(v)) " a different version %s" % str(v))
raise RuntimeError(dnn_available.msg) raise RuntimeError(dnn_available.msg)
if v == -1 or v[0] < 3007: if v == -1 or v[0] < 4007:
# 3007 is the final release of cudnn v3 # 4007 is the final release of cudnn v4
dnn_available.avail = False dnn_available.avail = False
dnn_available.msg = ( dnn_available.msg = "Version is too old. Update to v5, was %d." % v[0]
"You have an old release of CuDNN (or a release "
"candidate) that isn't supported. Please update to "
"at least v3 final version.")
raise RuntimeError(dnn_available.msg) raise RuntimeError(dnn_available.msg)
else:
dnn_available.avail = comp
if config.dnn.enabled == "True": if config.dnn.enabled == "True":
if not dnn_available.avail: if not dnn_available.avail:
raise RuntimeError( raise RuntimeError(
"You enabled CuDNN, but we aren't able to use it: %s" % "You enabled cuDNN, but we aren't able to use it: %s" %
dnn_available.msg) dnn_available.msg)
return dnn_available.avail return dnn_available.avail
...@@ -582,14 +581,15 @@ def use(device, ...@@ -582,14 +581,15 @@ def use(device,
if dnn_available(): if dnn_available():
(hdr_v, runtime_v) = dnn_version() (hdr_v, runtime_v) = dnn_version()
cudnn_version = runtime_v cudnn_version = runtime_v
# 4100 should not print warning with cudnn 4 final. # 5100 should not print warning with cudnn 5 final.
if cudnn_version > 4100: if cudnn_version > 5100:
warn = ("Your CuDNN version is more recent then Theano." warn = ("Your cuDNN version is more recent than the one"
" If you see problems, try updating Theano or" " Theano officially supports."
" downgrading CuDNN to version 4.") " If you see any problems, try updating Theano or"
" downgrading cuDNN to version 5.")
except Exception: except Exception:
pass cudnn_version = dnn_available.msg
print("Using gpu device %d: %s (CNMeM is %s, CuDNN %s)" % ( print("Using gpu device %d: %s (CNMeM is %s, cuDNN %s)" % (
active_device_number(), active_device_number(),
active_device_name(), active_device_name(),
cnmem_enabled, cnmem_enabled,
......
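The two thresholds in this file (reject anything below 4007, warn above 5100) define a window of supported versions. The sketch below restates that window for a single integer version. It is a simplification: the names are hypothetical, and the diff compares `v[0]` of a version pair in one place and `runtime_v` in the other. Taking a plain integer also avoids indexing `v` when it is the integer `-1`, which would raise a `TypeError`.

```python
MIN_SUPPORTED = 4007   # final release of cuDNN v4, per the check in dnn_available()
MAX_TESTED = 5100      # versions above this trigger the warning in use()


def classify_cudnn_version(v):
    """Return 'too_old', 'ok' or 'newer_than_tested' for a version number.

    `v` is either -1 (version unknown) or an integer such as 5005.
    """
    if v == -1 or v < MIN_SUPPORTED:
        return "too_old"
    if v > MAX_TESTED:
        return "newer_than_tested"
    return "ok"
```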
...@@ -323,30 +323,30 @@ class GpuDnnConv(DnnBase, COp): ...@@ -323,30 +323,30 @@ class GpuDnnConv(DnnBase, COp):
if self.inplace: if self.inplace:
self.destroy_map = {0: [2]} self.destroy_map = {0: [2]}
# In CuDNN version older than V3, the FFT implementation and the # In cuDNN version older than V3, the FFT implementation and the
# option to time the different implementations to get the fastest # option to time the different implementations to get the fastest
# are both unavailable. # are both unavailable.
if version() < (3000, 3000): if version() < (3000, 3000):
if self.algo == 'fft': if self.algo == 'fft':
raise RuntimeError("CuDNN FFT convolution requires CuDNN v3") raise RuntimeError("cuDNN FFT convolution requires cuDNN v3")
elif self.algo in ['guess_once', 'guess_on_shape_change']: elif self.algo in ['guess_once', 'guess_on_shape_change']:
raise RuntimeError("CuDNN selection of convolution " raise RuntimeError("cuDNN selection of convolution "
"implementation based on heuristics " "implementation based on heuristics "
"requires CuDNN v3") "requires cuDNN v3")
elif self.algo in ['time_once', 'time_on_shape_change']: elif self.algo in ['time_once', 'time_on_shape_change']:
raise RuntimeError("CuDNN convolution timing requires CuDNN " raise RuntimeError("cuDNN convolution timing requires cuDNN "
"v3") "v3")
# The fft_tiling implementation is only available from CuDNN V4 onward # The fft_tiling implementation is only available from cuDNN V4 onward
if version() < (4000, 4000): if version() < (4000, 4000):
if self.algo == 'fft_tiling': if self.algo == 'fft_tiling':
raise RuntimeError("CuDNN tiled-FFT convolution requires " raise RuntimeError("cuDNN tiled-FFT convolution requires "
"CuDNN v4 or more recent") "cuDNN v4 or more recent")
if version() < (5000, 5000): if version() < (5000, 5000):
if self.algo == 'winograd': if self.algo == 'winograd':
raise RuntimeError("CuDNN winograd convolution requires " raise RuntimeError("cuDNN winograd convolution requires "
"CuDNN v5 or more recent") "cuDNN v5 or more recent")
assert self.algo in ['none', 'small', 'large', 'fft', 'fft_tiling', assert self.algo in ['none', 'small', 'large', 'fft', 'fft_tiling',
'winograd', 'guess_once', 'guess_on_shape_change', 'winograd', 'guess_once', 'guess_on_shape_change',
...@@ -517,11 +517,11 @@ class GpuDnnConv3d(GpuDnnConv): ...@@ -517,11 +517,11 @@ class GpuDnnConv3d(GpuDnnConv):
if version() < (5000, 5000): if version() < (5000, 5000):
if self.algo == 'fft_tiling': if self.algo == 'fft_tiling':
raise RuntimeError("CuDNN 3d tiled-FFT convolution requires " raise RuntimeError("cuDNN 3d tiled-FFT convolution requires "
"CuDNN v5 or more recent") "cuDNN v5 or more recent")
elif self.algo == 'winograd': elif self.algo == 'winograd':
raise RuntimeError("CuDNN 3d winograd convolution requires " raise RuntimeError("cuDNN 3d winograd convolution requires "
"CuDNN v5 or more recent") "cuDNN v5 or more recent")
def make_node(self, img, kern, output, desc, alpha=None, beta=None): def make_node(self, img, kern, output, desc, alpha=None, beta=None):
...@@ -834,17 +834,17 @@ class GpuDnnConvGradI(DnnBase, COp): ...@@ -834,17 +834,17 @@ class GpuDnnConvGradI(DnnBase, COp):
if self.inplace: if self.inplace:
self.destroy_map = {0: [2]} self.destroy_map = {0: [2]}
# The small-workspace implementation is only available from CuDNN V4 # The small-workspace implementation is only available from cuDNN V4
# onward. # onward.
if version() < (4000, 4000): if version() < (4000, 4000):
if self.algo == 'fft_tiling': if self.algo == 'fft_tiling':
raise RuntimeError("CuDNN's tiled-FFT convolution requires " raise RuntimeError("cuDNN's tiled-FFT convolution requires "
"CuDNN v4 or more recent") "cuDNN v4 or more recent")
if version() < (5000, 5000): if version() < (5000, 5000):
if self.algo == 'winograd': if self.algo == 'winograd':
raise RuntimeError("CuDNN's winograd convolution requires " raise RuntimeError("cuDNN's winograd convolution requires "
"CuDNN v5 or more recent") "cuDNN v5 or more recent")
assert self.algo in ['none', 'deterministic', 'fft', 'fft_tiling', assert self.algo in ['none', 'deterministic', 'fft', 'fft_tiling',
'winograd', 'guess_once', 'guess_on_shape_change', 'winograd', 'guess_once', 'guess_on_shape_change',
...@@ -997,11 +997,11 @@ class GpuDnnConv3dGradI(GpuDnnConvGradI): ...@@ -997,11 +997,11 @@ class GpuDnnConv3dGradI(GpuDnnConvGradI):
assert self.algo in good_algo assert self.algo in good_algo
if version() < (5000, 5000): if version() < (5000, 5000):
if self.algo == 'fft_tiling': if self.algo == 'fft_tiling':
raise RuntimeError("CuDNN 3d tiled-FFT convolution requires " raise RuntimeError("cuDNN 3d tiled-FFT convolution requires "
"CuDNN v5 or more recent") "cuDNN v5 or more recent")
elif self.algo == 'winograd': elif self.algo == 'winograd':
raise RuntimeError("CuDNN 3d winograd convolution requires " raise RuntimeError("cuDNN 3d winograd convolution requires "
"CuDNN v5 or more recent") "cuDNN v5 or more recent")
def grad(self, inp, grads): def grad(self, inp, grads):
kerns, top, output, desc, alpha, beta = inp kerns, top, output, desc, alpha, beta = inp
...@@ -1079,7 +1079,7 @@ def dnn_conv(img, kerns, border_mode='valid', subsample=(1, 1), ...@@ -1079,7 +1079,7 @@ def dnn_conv(img, kerns, border_mode='valid', subsample=(1, 1),
*deprecated*, use parameter algo instead. *deprecated*, use parameter algo instead.
algo : {'none', 'small', 'large', 'fft', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change'} algo : {'none', 'small', 'large', 'fft', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change'}
Convolution implementation to use. Some of its values may require certain Convolution implementation to use. Some of its values may require certain
versions of CuDNN to be installed. Default is the value of versions of cuDNN to be installed. Default is the value of
:attr:`config.dnn.conv.algo_fwd`. :attr:`config.dnn.conv.algo_fwd`.
precision : {'as_input', 'float16', 'float32', 'float64'} precision : {'as_input', 'float16', 'float32', 'float64'}
Description of the dtype in which the computation of the convolution Description of the dtype in which the computation of the convolution
...@@ -1365,13 +1365,13 @@ class GpuDnnPoolDesc(GpuOp): ...@@ -1365,13 +1365,13 @@ class GpuDnnPoolDesc(GpuOp):
self.pad = pad self.pad = pad
if (pad[0] != 0 or pad[1] != 0) and version() == -1: if (pad[0] != 0 or pad[1] != 0) and version() == -1:
raise RuntimeError("CuDNN pooling with padding requires CuDNN v2") raise RuntimeError("cuDNN pooling with padding requires cuDNN v2")
if self.get_ndim() == 3 and version() < (3000, 3000): if self.get_ndim() == 3 and version() < (3000, 3000):
raise RuntimeError("CuDNN 3d pooling requires CuDNN v3") raise RuntimeError("cuDNN 3d pooling requires cuDNN v3")
if (mode == 'average_exc_pad' and max(pad) > 0 and if (mode == 'average_exc_pad' and max(pad) > 0 and
version() < (4004, 4004)): version() < (4004, 4004)):
raise RuntimeError( raise RuntimeError(
"CuDNN pooling mode 'average_exc_pad' requires at least v4") "cuDNN pooling mode 'average_exc_pad' requires at least v4")
def get_ndim(self): def get_ndim(self):
return len(self.ws) return len(self.ws)
...@@ -1383,7 +1383,7 @@ class GpuDnnPoolDesc(GpuOp): ...@@ -1383,7 +1383,7 @@ class GpuDnnPoolDesc(GpuOp):
def make_node(self): def make_node(self):
if self.pad != (0, 0) and version() == -1: if self.pad != (0, 0) and version() == -1:
raise RuntimeError("CuDNN pooling with padding requires CuDNN v2") raise RuntimeError("cuDNN pooling with padding requires cuDNN v2")
node = Apply(self, [], node = Apply(self, [],
[CDataType("cudnnPoolingDescriptor_t", [CDataType("cudnnPoolingDescriptor_t",
...@@ -1983,7 +1983,7 @@ class GpuDnnSoftmaxBase(DnnBase): ...@@ -1983,7 +1983,7 @@ class GpuDnnSoftmaxBase(DnnBase):
Always set this to 'bc01'. Always set this to 'bc01'.
algo : {'fast', 'accurate', 'log'} algo : {'fast', 'accurate', 'log'}
Indicating whether, respectively, computations should be optimized for Indicating whether, respectively, computations should be optimized for
speed, for accuracy, or if CuDNN should rather compute the log-softmax instead. speed, for accuracy, or if cuDNN should rather compute the log-softmax instead.
mode : {'instance', 'channel'} mode : {'instance', 'channel'}
Indicating whether the softmax should be computed per image across 'c01' Indicating whether the softmax should be computed per image across 'c01'
or per spatial location '01' per image across 'c'. or per spatial location '01' per image across 'c'.
...@@ -2004,7 +2004,7 @@ class GpuDnnSoftmaxBase(DnnBase): ...@@ -2004,7 +2004,7 @@ class GpuDnnSoftmaxBase(DnnBase):
self.tensor_format = tensor_format self.tensor_format = tensor_format
if algo == 'log' and version() < (3000, 3000): if algo == 'log' and version() < (3000, 3000):
raise RuntimeError("CuDNN log-softmax requires CuDNN v3") raise RuntimeError("cuDNN log-softmax requires cuDNN v3")
assert(algo in ('fast', 'accurate', 'log')) assert(algo in ('fast', 'accurate', 'log'))
self.algo = algo self.algo = algo
...@@ -2526,7 +2526,7 @@ if True: ...@@ -2526,7 +2526,7 @@ if True:
@register_opt('cudnn') @register_opt('cudnn')
@local_optimizer([GpuElemwise, LogSoftmax]) @local_optimizer([GpuElemwise, LogSoftmax])
def local_log_softmax_dnn(node): def local_log_softmax_dnn(node):
# The log-softmax implementation is only available starting at CuDNN V3 # The log-softmax implementation is only available starting at cuDNN V3
if not dnn_available() or version() < (3000, 3000): if not dnn_available() or version() < (3000, 3000):
return return
......
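The per-algorithm checks in `GpuDnnConv` all follow the pattern `version() < (N, N)`, which relies on Python comparing tuples lexicographically. A table-driven sketch of the 2D forward case is shown below. The names `MIN_VERSION_FOR_ALGO` and `check_algo` are hypothetical, the thresholds are read off the checks in this diff, and the version pair is assumed to be two integers as in those comparisons.

```python
# Minimum cuDNN version per forward algorithm, read off the checks in the diff.
MIN_VERSION_FOR_ALGO = {
    "fft": 3000,
    "guess_once": 3000,
    "guess_on_shape_change": 3000,
    "time_once": 3000,
    "time_on_shape_change": 3000,
    "fft_tiling": 4000,
    "winograd": 5000,
}


def check_algo(algo, version):
    """Raise if `algo` needs a newer cuDNN than `version`, a pair of integers."""
    needed = MIN_VERSION_FOR_ALGO.get(algo)
    # Tuples compare element by element, so (4007, 4007) < (5000, 5000) is True.
    if needed is not None and version < (needed, needed):
        raise RuntimeError(
            "cuDNN %s convolution requires cuDNN v%d or more recent"
            % (algo, needed // 1000))
```

Algorithms missing from the table, such as `'small'`, are accepted for any version. The 3D ops in the diff use stricter limits (`fft_tiling` needs v5 there) and would need their own table.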
...@@ -78,7 +78,7 @@ APPLY_SPECIFIC(conv_fwd)(CudaNdarray *input, CudaNdarray *kerns, ...@@ -78,7 +78,7 @@ APPLY_SPECIFIC(conv_fwd)(CudaNdarray *input, CudaNdarray *kerns,
// Obtain a convolution algorithm appropriate for the input and kernel // Obtain a convolution algorithm appropriate for the input and kernel
// shapes. Either by choosing one according to heuristics or by making // shapes. Either by choosing one according to heuristics or by making
// CuDNN time every implementation and choose the best one. // cuDNN time every implementation and choose the best one.
if (CHOOSE_ALGO_TIME) if (CHOOSE_ALGO_TIME)
{ {
// Time the different implementations to choose the best one // Time the different implementations to choose the best one
......
...@@ -76,7 +76,7 @@ APPLY_SPECIFIC(conv_gi)(CudaNdarray *kerns, CudaNdarray *output, ...@@ -76,7 +76,7 @@ APPLY_SPECIFIC(conv_gi)(CudaNdarray *kerns, CudaNdarray *output,
{ {
// Obtain a convolution algorithm appropriate for the kernel and output // Obtain a convolution algorithm appropriate for the kernel and output
// shapes. Either by choosing one according to heuristics or by making // shapes. Either by choosing one according to heuristics or by making
// CuDNN time every implementation and choose the best one. // cuDNN time every implementation and choose the best one.
if (CHOOSE_ALGO_TIME) if (CHOOSE_ALGO_TIME)
{ {
// Time the different implementations to choose the best one // Time the different implementations to choose the best one
......
...@@ -76,7 +76,7 @@ APPLY_SPECIFIC(conv_gw)(CudaNdarray *input, CudaNdarray *output, ...@@ -76,7 +76,7 @@ APPLY_SPECIFIC(conv_gw)(CudaNdarray *input, CudaNdarray *output,
{ {
// Obtain a convolution algorithm appropriate for the input and output // Obtain a convolution algorithm appropriate for the input and output
// shapes. Either by choosing one according to heuristics or by making // shapes. Either by choosing one according to heuristics or by making
// CuDNN time every implementation and choose the best one. // cuDNN time every implementation and choose the best one.
if (CHOOSE_ALGO_TIME) if (CHOOSE_ALGO_TIME)
{ {
// Time the different implementations to choose the best one // Time the different implementations to choose the best one
......
...@@ -25,7 +25,7 @@ else: ...@@ -25,7 +25,7 @@ else:
class TestDnnConv2d(test_abstract_conv.BaseTestConv2d): class TestDnnConv2d(test_abstract_conv.BaseTestConv2d):
def setUp(self): def setUp(self):
super(TestDnnConv2d, self).setUp() super(TestDnnConv2d, self).setUp()
# provide_shape is not used by the CuDNN implementation # provide_shape is not used by the cuDNN implementation
self.provide_shape = [False] self.provide_shape = [False]
self.shared = gpu_shared self.shared = gpu_shared
......
...@@ -520,7 +520,7 @@ def _test_full(cls, mode=None, version=[-1], extra_shapes=[], ...@@ -520,7 +520,7 @@ def _test_full(cls, mode=None, version=[-1], extra_shapes=[],
def test_full(): def test_full():
# If using CuDNN version before v3, only run the tests where the # If using cuDNN version before v3, only run the tests where the
# kernels are not larger than the input in any spatial dimension. # kernels are not larger than the input in any spatial dimension.
if cuda.dnn.dnn_available() and cuda.dnn.version() < (3000, 3000): if cuda.dnn.dnn_available() and cuda.dnn.version() < (3000, 3000):
test_bigger_kernels = False test_bigger_kernels = False
...@@ -542,7 +542,7 @@ def test_dnn_full(): ...@@ -542,7 +542,7 @@ def test_dnn_full():
if not cuda.dnn.dnn_available(): if not cuda.dnn.dnn_available():
raise SkipTest(cuda.dnn.dnn_available.msg) raise SkipTest(cuda.dnn.dnn_available.msg)
# If using CuDNN version before v3, only run the tests where the # If using cuDNN version before v3, only run the tests where the
# kernels are not larger than the input in any spatial dimension. # kernels are not larger than the input in any spatial dimension.
if cuda.dnn.version() < (3000, 3000): if cuda.dnn.version() < (3000, 3000):
test_bigger_kernels = False test_bigger_kernels = False
......
...@@ -413,7 +413,7 @@ def test_old_pool_interface(): ...@@ -413,7 +413,7 @@ def test_old_pool_interface():
def test_pooling3d(): def test_pooling3d():
# CuDNN 3d pooling requires CuDNN v3. Don't test if the CuDNN version is # cuDNN 3d pooling requires cuDNN v3. Don't test if the cuDNN version is
# too old. # too old.
if not cuda.dnn.dnn_available() or cuda.dnn.version() < (3000, 3000): if not cuda.dnn.dnn_available() or cuda.dnn.version() < (3000, 3000):
raise SkipTest(cuda.dnn.dnn_available.msg) raise SkipTest(cuda.dnn.dnn_available.msg)
...@@ -641,8 +641,8 @@ class test_DnnSoftMax(test_nnet.test_SoftMax): ...@@ -641,8 +641,8 @@ class test_DnnSoftMax(test_nnet.test_SoftMax):
)]) == 0) )]) == 0)
def test_log_softmax(self): def test_log_softmax(self):
# This is a test for an optimization that depends on CuDNN v3 or # This is a test for an optimization that depends on cuDNN v3 or
# more recent. Don't test if the CuDNN version is too old. # more recent. Don't test if the cuDNN version is too old.
if cuda.dnn.version() < (3000, 3000): if cuda.dnn.version() < (3000, 3000):
raise SkipTest("Log-softmax is only in cudnn v3+") raise SkipTest("Log-softmax is only in cudnn v3+")
...@@ -826,7 +826,7 @@ class TestDnnInferShapes(utt.InferShapeTester): ...@@ -826,7 +826,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
def test_conv3d(self): def test_conv3d(self):
if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)): if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
raise SkipTest('"CuDNN 3D convolution requires CuDNN v2') raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5) ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
img = ftensor5('img') img = ftensor5('img')
kerns = ftensor5('kerns') kerns = ftensor5('kerns')
...@@ -914,7 +914,7 @@ class TestDnnInferShapes(utt.InferShapeTester): ...@@ -914,7 +914,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
def test_conv3d_gradw(self): def test_conv3d_gradw(self):
if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)): if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
raise SkipTest('"CuDNN 3D convolution requires CuDNN v2') raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5) ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
img = ftensor5('img') img = ftensor5('img')
kerns = ftensor5('kerns') kerns = ftensor5('kerns')
...@@ -1004,7 +1004,7 @@ class TestDnnInferShapes(utt.InferShapeTester): ...@@ -1004,7 +1004,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
def test_conv3d_gradi(self): def test_conv3d_gradi(self):
if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)): if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
raise SkipTest('"CuDNN 3D convolution requires CuDNN v2') raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5) ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
img = ftensor5('img') img = ftensor5('img')
kerns = ftensor5('kerns') kerns = ftensor5('kerns')
...@@ -1392,7 +1392,7 @@ def get_conv3d_test_cases(): ...@@ -1392,7 +1392,7 @@ def get_conv3d_test_cases():
itt = chain(product(test_shapes, border_modes, conv_modes), itt = chain(product(test_shapes, border_modes, conv_modes),
product(test_shapes_full, ['full'], conv_modes)) product(test_shapes_full, ['full'], conv_modes))
else: else:
# CuDNN, before V3, did not support kernels larger than the inputs, # cuDNN, before V3, did not support kernels larger than the inputs,
# even if the original inputs were padded so they would be larger than # even if the original inputs were padded so they would be larger than
# the kernels. If using a version older than V3 don't run the tests # the kernels. If using a version older than V3 don't run the tests
# with kernels larger than the unpadded inputs. # with kernels larger than the unpadded inputs.
...@@ -1404,7 +1404,7 @@ def get_conv3d_test_cases(): ...@@ -1404,7 +1404,7 @@ def get_conv3d_test_cases():
def test_conv3d_fwd(): def test_conv3d_fwd():
if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)): if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
raise SkipTest('"CuDNN 3D convolution requires CuDNN v2') raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
def run_conv3d_fwd(inputs_shape, filters_shape, subsample, def run_conv3d_fwd(inputs_shape, filters_shape, subsample,
border_mode, conv_mode): border_mode, conv_mode):
...@@ -1421,7 +1421,7 @@ def test_conv3d_fwd(): ...@@ -1421,7 +1421,7 @@ def test_conv3d_fwd():
filters = shared(filters_val) filters = shared(filters_val)
bias = shared(numpy.zeros(filters_shape[0]).astype('float32')) bias = shared(numpy.zeros(filters_shape[0]).astype('float32'))
# Compile a theano function for the CuDNN implementation # Compile a theano function for the cuDNN implementation
conv = dnn.dnn_conv3d(img=inputs, kerns=filters, conv = dnn.dnn_conv3d(img=inputs, kerns=filters,
border_mode=border_mode, subsample=subsample, border_mode=border_mode, subsample=subsample,
conv_mode=conv_mode) conv_mode=conv_mode)
...@@ -1476,7 +1476,7 @@ def test_conv3d_fwd(): ...@@ -1476,7 +1476,7 @@ def test_conv3d_fwd():
def test_conv3d_bwd(): def test_conv3d_bwd():
if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)): if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
raise SkipTest('"CuDNN 3D convolution requires CuDNN v2') raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
def run_conv3d_bwd(inputs_shape, filters_shape, subsample, def run_conv3d_bwd(inputs_shape, filters_shape, subsample,
border_mode, conv_mode): border_mode, conv_mode):
...@@ -1488,7 +1488,7 @@ def test_conv3d_bwd(): ...@@ -1488,7 +1488,7 @@ def test_conv3d_bwd():
filters = shared(filters_val) filters = shared(filters_val)
bias = shared(numpy.zeros(filters_shape[0]).astype('float32')) bias = shared(numpy.zeros(filters_shape[0]).astype('float32'))
# Compile a theano function for the CuDNN implementation # Compile a theano function for the cuDNN implementation
conv = dnn.dnn_conv3d(img=inputs, kerns=filters, conv = dnn.dnn_conv3d(img=inputs, kerns=filters,
border_mode=border_mode, subsample=subsample, border_mode=border_mode, subsample=subsample,
conv_mode=conv_mode) conv_mode=conv_mode)
......
...@@ -20,7 +20,7 @@ if theano.config.mode == 'FAST_COMPILE': ...@@ -20,7 +20,7 @@ if theano.config.mode == 'FAST_COMPILE':
mode_without_gpu = theano.compile.mode.get_mode('FAST_RUN') mode_without_gpu = theano.compile.mode.get_mode('FAST_RUN')
else: else:
mode_with_gpu = theano.compile.mode.get_default_mode().including('gpu') mode_with_gpu = theano.compile.mode.get_default_mode().including('gpu')
mode_without_gpu = theano.compile.mode.get_default_mode() mode_without_gpu = theano.compile.mode.get_default_mode().excluding('gpu')
def test_GpuCrossentropySoftmaxArgmax1HotWithBias(): def test_GpuCrossentropySoftmaxArgmax1HotWithBias():
......
...@@ -69,17 +69,17 @@ def init_dev(dev, name=None): ...@@ -69,17 +69,17 @@ def init_dev(dev, name=None):
warn = None warn = None
cudnn_version = "" cudnn_version = ""
if dev.startswith('cuda'): if dev.startswith('cuda'):
cudnn_version = " (CuDNN not available)" cudnn_version = " (cuDNN not available)"
try: try:
cudnn_version = dnn.version() cudnn_version = dnn.version()
# 4100 should not print warning with cudnn 4 final. # 5100 should not print warning with cudnn 5 final.
if cudnn_version > 4100: if cudnn_version > 5100:
warn = ("Your CuDNN version is more recent than Theano." warn = ("Your cuDNN version is more recent than Theano."
" If you see problems, try updating Theano or" " If you see problems, try updating Theano or"
" downgrading CuDNN to version 4.") " downgrading cuDNN to version 5.")
cudnn_version = " (CuDNN version %s)" % cudnn_version cudnn_version = " (cuDNN version %s)" % cudnn_version
except Exception: except Exception:
pass cudnn_version = dnn.dnn_present.msg
print("Mapped name %s to device %s: %s%s" % ( print("Mapped name %s to device %s: %s%s" % (
name, dev, context.devname, cudnn_version), name, dev, context.devname, cudnn_version),
file=sys.stderr) file=sys.stderr)
......
...@@ -18,7 +18,7 @@ class TestDnnConv2d(test_abstract_conv.BaseTestConv2d): ...@@ -18,7 +18,7 @@ class TestDnnConv2d(test_abstract_conv.BaseTestConv2d):
def setUp(self): def setUp(self):
super(TestDnnConv2d, self).setUp() super(TestDnnConv2d, self).setUp()
self.shared = gpuarray_shared_constructor self.shared = gpuarray_shared_constructor
# provide_shape is not used by the CuDNN implementation # provide_shape is not used by the cuDNN implementation
self.provide_shape = [False] self.provide_shape = [False]
def tcase(self, i, f, s, b, flip, provide_shape): def tcase(self, i, f, s, b, flip, provide_shape):
......
...@@ -171,7 +171,7 @@ def test_pooling(): ...@@ -171,7 +171,7 @@ def test_pooling():
raise SkipTest(dnn.dnn_available.msg) raise SkipTest(dnn.dnn_available.msg)
# 'average_exc_pad' is disabled for versions < 4004 # 'average_exc_pad' is disabled for versions < 4004
if dnn.version() < 4004: if dnn.version(raises=False) < 4004:
modes = ('max', 'average_inc_pad') modes = ('max', 'average_inc_pad')
else: else:
modes = ('max', 'average_inc_pad', 'average_exc_pad') modes = ('max', 'average_inc_pad', 'average_exc_pad')
...@@ -464,7 +464,7 @@ class TestDnnInferShapes(utt.InferShapeTester): ...@@ -464,7 +464,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
[conv_modes[0]])), [conv_modes[0]])),
testcase_func_name=utt.custom_name_func) testcase_func_name=utt.custom_name_func)
def test_conv(self, algo, border_mode, conv_mode): def test_conv(self, algo, border_mode, conv_mode):
if algo == 'winograd' and dnn.version() < 5000: if algo == 'winograd' and dnn.version(raises=False) < 5000:
raise SkipTest(dnn.dnn_available.msg) raise SkipTest(dnn.dnn_available.msg)
self._test_conv(T.ftensor4('img'), self._test_conv(T.ftensor4('img'),
...@@ -597,7 +597,7 @@ class TestDnnInferShapes(utt.InferShapeTester): ...@@ -597,7 +597,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
) )
# 'average_exc_pad' is disabled for versions < 4004 # 'average_exc_pad' is disabled for versions < 4004
if dnn.version() < 4004: if dnn.version(raises=False) < 4004:
modes = ['max', 'average_inc_pad'] modes = ['max', 'average_inc_pad']
else: else:
modes = ['max', 'average_inc_pad', 'average_exc_pad'] modes = ['max', 'average_inc_pad', 'average_exc_pad']
...@@ -732,6 +732,8 @@ def test_dnn_conv_alpha_output_merge(): ...@@ -732,6 +732,8 @@ def test_dnn_conv_alpha_output_merge():
def test_dnn_conv_grad(): def test_dnn_conv_grad():
if not dnn.dnn_available(test_ctx_name):
raise SkipTest(dnn.dnn_available.msg)
b = 1 b = 1
c = 4 c = 4
f = 3 f = 3
...@@ -777,6 +779,10 @@ class test_SoftMax(test_nnet.test_SoftMax): ...@@ -777,6 +779,10 @@ class test_SoftMax(test_nnet.test_SoftMax):
gpu_grad_op = dnn.GpuDnnSoftmaxGrad gpu_grad_op = dnn.GpuDnnSoftmaxGrad
mode = mode_with_gpu mode = mode_with_gpu
def setUp(self):
if not dnn.dnn_available(test_ctx_name):
raise SkipTest(dnn.dnn_available.msg)
def test_softmax_shape_0(self): def test_softmax_shape_0(self):
raise SkipTest("Cudnn doesn't support 0 shapes") raise SkipTest("Cudnn doesn't support 0 shapes")
...@@ -887,9 +893,9 @@ class test_SoftMax(test_nnet.test_SoftMax): ...@@ -887,9 +893,9 @@ class test_SoftMax(test_nnet.test_SoftMax):
]) == 0) ]) == 0)
def test_log_softmax(self): def test_log_softmax(self):
# This is a test for an optimization that depends on CuDNN v3 or # This is a test for an optimization that depends on cuDNN v3 or
# more recent. Don't test if the CuDNN version is too old. # more recent. Don't test if the cuDNN version is too old.
if dnn.version() < 3000: if dnn.version(raises=False) < 3000:
raise SkipTest("Log-softmax is only in cudnn v3+") raise SkipTest("Log-softmax is only in cudnn v3+")
x = T.ftensor4() x = T.ftensor4()
...@@ -928,9 +934,9 @@ class test_SoftMax(test_nnet.test_SoftMax): ...@@ -928,9 +934,9 @@ class test_SoftMax(test_nnet.test_SoftMax):
# Test that the op LogSoftmax is correctly replaced by the op # Test that the op LogSoftmax is correctly replaced by the op
# DnnSoftmax with the 'log' mode. # DnnSoftmax with the 'log' mode.
# This is a test for an optimization that depends on CuDNN v3 or # This is a test for an optimization that depends on cuDNN v3 or
# more recent. Don't test if the CuDNN version is too old. # more recent. Don't test if the cuDNN version is too old.
if dnn.version() < 3000: if dnn.version(raises=False) < 3000:
raise SkipTest("Log-softmax is only in cudnn v3+") raise SkipTest("Log-softmax is only in cudnn v3+")
# Compile a reference function, on the CPU, to be used to validate the # Compile a reference function, on the CPU, to be used to validate the
......
...@@ -106,7 +106,7 @@ def conv2d(input, filters, input_shape=None, filter_shape=None, ...@@ -106,7 +106,7 @@ def conv2d(input, filters, input_shape=None, filter_shape=None,
Notes Notes
----- -----
If CuDNN is available, it will be used on the If cuDNN is available, it will be used on the
GPU. Otherwise, it is the *CorrMM* convolution that will be used GPU. Otherwise, it is the *CorrMM* convolution that will be used
"caffe style convolution". "caffe style convolution".
......
...@@ -225,7 +225,7 @@ def conv2d_grad_wrt_inputs(output_grad, ...@@ -225,7 +225,7 @@ def conv2d_grad_wrt_inputs(output_grad,
Notes Notes
----- -----
:note: If CuDNN is available, it will be used on the :note: If cuDNN is available, it will be used on the
GPU. Otherwise, it is the *CorrMM* convolution that will be used GPU. Otherwise, it is the *CorrMM* convolution that will be used
"caffe style convolution". "caffe style convolution".
...@@ -348,7 +348,7 @@ def conv2d_grad_wrt_weights(input, ...@@ -348,7 +348,7 @@ def conv2d_grad_wrt_weights(input,
Notes Notes
----- -----
:note: If CuDNN is available, it will be used on the :note: If cuDNN is available, it will be used on the
GPU. Otherwise, it is the *CorrMM* convolution that will be used GPU. Otherwise, it is the *CorrMM* convolution that will be used
"caffe style convolution". "caffe style convolution".
......
...@@ -78,8 +78,8 @@ def pool_2d(input, ds, ignore_border=None, st=None, padding=(0, 0), ...@@ -78,8 +78,8 @@ def pool_2d(input, ds, ignore_border=None, st=None, padding=(0, 0),
" default value changed to True (currently" " default value changed to True (currently"
" False). To have consistent behavior with all Theano" " False). To have consistent behavior with all Theano"
" version, explicitly add the parameter ignore_border=True." " version, explicitly add the parameter ignore_border=True."
" On the GPU, using ignore_border=True is needed to use CuDNN." " On the GPU, using ignore_border=True is needed to use cuDNN."
" When using ignore_border=False and not using CuDNN, the only" " When using ignore_border=False and not using cuDNN, the only"
" GPU combination supported is when" " GPU combination supported is when"
" `ds == st and padding == (0, 0) and mode == 'max'`." " `ds == st and padding == (0, 0) and mode == 'max'`."
" Otherwise, the convolution will be executed on CPU.", " Otherwise, the convolution will be executed on CPU.",
......
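The `init_dev` hunk above raises the warning threshold from 4100 to 5100 and replaces the silent `pass` with a message explaining why cuDNN is unusable. A minimal standalone sketch of that control flow follows, assuming a hypothetical `describe_cudnn` helper and a `get_version` callable that stands in for `dnn.version()`; neither name is part of Theano, and the real code builds the message inline.

```python
def describe_cudnn(get_version, max_supported=5100,
                   unavailable_msg=" (cuDNN not available)"):
    """Return (banner_suffix, warning) for the device banner.

    get_version is a hypothetical zero-argument callable standing in for
    dnn.version(); it is expected to raise when cuDNN cannot be used.
    max_supported mirrors the 5100 threshold from the diff.
    """
    try:
        version = get_version()
    except Exception:
        # As in the diff: report a message instead of silently passing.
        return unavailable_msg, None

    warn = None
    if version > max_supported:
        # Only versions strictly above the threshold trigger the warning.
        warn = ("Your cuDNN version is more recent than Theano."
                " If you see problems, try updating Theano or"
                " downgrading cuDNN to version 5.")
    return " (cuDNN version %s)" % version, warn


if __name__ == "__main__":
    def missing():
        raise RuntimeError("cuDNN library not found")

    for probe in (lambda: 5005, lambda: 6000, missing):
        suffix, warning = describe_cudnn(probe)
        print(suffix, "| warning:", warning is not None)
```

Passing the version lookup in as a callable keeps the sketch runnable without a GPU; in the actual module the lookup and the fallback message come from `dnn.version()` and `dnn.dnn_present.msg`.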