Merge pull request #4366 from nouiz/cudnn_init

cudnn v5 cleanup init, small test fix.

c10aa585 · Frédéric Bastien · 7444fdd6 · 5f872965 · c10aa585 · c10aa585
--- a/HISTORY.txt
+++ b/HISTORY.txt
@@ -10,7 +10,7 @@ Theano 0.7 (26th of March, 2015)
 We recommand to everyone to upgrade to this version.
 Highlights:
- * Integration of CuDNN for 2D convolutions and pooling on supported GPUs
+ * Integration of cuDNN for 2D convolutions and pooling on supported GPUs
 * Too many optimizations and new features to count
 * Various fixes and improvements to scan
 * Better support for GPU on Windows

--- a/NEWS.txt
+++ b/NEWS.txt
@@ -10,7 +10,7 @@ We recommend that everybody update to this version.
 Highlights:
 - Python 2 and 3 support with the same code base
 - Faster optimization
- - Integration of CuDNN for better GPU performance
+ - Integration of cuDNN for better GPU performance
 - Many Scan improvements (execution speed up, ...)
 - optimizer=fast_compile moves computation to the GPU.
 - Better convolution on CPU and GPU. (CorrMM, cudnn, 3d conv, more parameter)

--- a/doc/faq.txt
+++ b/doc/faq.txt
@@ -235,7 +235,7 @@ CPU and GPU memory usage.
 Could speed up and lower memory usage:
-- :ref:`CuDNN <libdoc_cuda_dnn>` default CuDNN convolution use less
+- :ref:`cuDNN <libdoc_cuda_dnn>` default cuDNN convolution use less
   memory then Theano version. But some flags allow it to use more
   memory. GPU only.
 - Shortly avail, multi-GPU.

--- a/doc/index.txt
+++ b/doc/index.txt
@@ -25,7 +25,7 @@ News
 * Multi-GPU.
-* We added support for :ref:`CuDNN v4 <libdoc_cuda_dnn>`.
+* We added support for :ref:`cuDNN v5 <libdoc_cuda_dnn>`.
 * We added support for :attr:`CNMeM <config.lib.cnmem>` to speed up
  the GPU memory allocation.

--- a/doc/library/sandbox/cuda/dnn.txt
+++ b/doc/library/sandbox/cuda/dnn.txt
@@ -41,22 +41,22 @@ Theano will still work if the user did not introduce them manually.
 The recently added Theano flag :attr:`dnn.enabled
 <config.dnn.enabled>` allows to change the default behavior to force
 it or disable it. Older Theano version do not support this flag. To
-get an error when CuDNN can not be used with them, use this flag:
+get an error when cuDNN can not be used with them, use this flag:
 ``optimizer_including=cudnn``.
 .. note::
-   CuDNN v3 has now been released. CuDNN v2 remains supported but CuDNN v3 is
+   cuDNN v3 has now been released. cuDNN v2 remains supported but cuDNN v3 is
   faster and offers many more options. We recommend that everybody update to
   v3.
 .. note::
-   Starting in CuDNN v3, multiple convolution implementations are offered and
+   Starting in cuDNN v3, multiple convolution implementations are offered and
   it is possible to use heuristics to automatically choose a convolution
   implementation well suited to the parameters of the convolution.
-   The Theano flag ``dnn.conv.algo_fwd`` allows to specify the CuDNN
+   The Theano flag ``dnn.conv.algo_fwd`` allows to specify the cuDNN
   convolution implementation that Theano should use for forward convolutions.
   Possible values include :
@@ -69,20 +69,20 @@ get an error when CuDNN can not be used with them, use this flag:
   * ``fft_tiling`` : use the Fast Fourrier Transform implementation of convolution
     with tiling (high memory usage, but less then fft)
   * ``guess_once`` : the first time a convolution is executed, the
-     implementation to use is chosen according to CuDNN's heuristics and reused
+     implementation to use is chosen according to cuDNN's heuristics and reused
     for every subsequent execution of the convolution.
   * ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
     implementation selected every time the shapes of the inputs and kernels
     don't match the shapes from the last execution.
   * ``time_once`` : the first time a convolution is executed, every convolution
-     implementation offered by CuDNN is executed and timed. The fastest is
+     implementation offered by cuDNN is executed and timed. The fastest is
     reused for every subsequent execution of the convolution.
   * ``time_on_shape_change`` : like ``time_once`` but a new convolution
     implementation selected every time the shapes of the inputs and kernels
     don't match the shapes from the last execution.
   The Theano flag ``dnn.conv.algo_bwd_filter`` and
-   ``dnn.conv.algo_bwd_data`` allows to specify the CuDNN
+   ``dnn.conv.algo_bwd_data`` allows to specify the cuDNN
   convolution implementation that Theano should use for gradient
   convolutions.  Possible values include :
@@ -92,13 +92,13 @@ get an error when CuDNN can not be used with them, use this flag:
   * ``fft`` : use the Fast Fourrier Transform implementation of convolution
     (very high memory usage)
   * ``guess_once`` : the first time a convolution is executed, the
-     implementation to use is chosen according to CuDNN's heuristics and reused
+     implementation to use is chosen according to cuDNN's heuristics and reused
     for every subsequent execution of the convolution.
   * ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
     implementation selected every time the shapes of the inputs and kernels
     don't match the shapes from the last execution.
   * ``time_once`` : the first time a convolution is executed, every convolution
-     implementation offered by CuDNN is executed and timed. The fastest is
+     implementation offered by cuDNN is executed and timed. The fastest is
     reused for every subsequent execution of the convolution.
   * ``time_on_shape_change`` : like ``time_once`` but a new convolution
     implementation selected every time the shapes of the inputs and kernels
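The difference between the ``*_once`` and ``*_on_shape_change`` values documented above can be illustrated with a small pure-Python sketch. It is not Theano's implementation: ``AlgoSelector`` and the ``choose`` callable are hypothetical names that stand in for cuDNN's heuristics (``guess_*``) or its timing run (``time_*``). Only the caching policy is modelled.

```python
class AlgoSelector:
    """Cache a convolution algorithm choice according to a selection mode.

    Hypothetical helper for illustration only; not part of Theano.
    """

    def __init__(self, mode, choose):
        # mode: "once" mirrors guess_once/time_once,
        #       "on_shape_change" mirrors guess_on_shape_change/time_on_shape_change.
        # choose: callable (img_shape, kern_shape) -> algorithm name.
        if mode not in ("once", "on_shape_change"):
            raise ValueError("unknown mode: %r" % (mode,))
        self.mode = mode
        self.choose = choose
        self._algo = None
        self._shapes = None

    def get(self, img_shape, kern_shape):
        shapes = (tuple(img_shape), tuple(kern_shape))
        first_call = self._algo is None
        shape_changed = (self.mode == "on_shape_change"
                         and shapes != self._shapes)
        if first_call or shape_changed:
            # Pick (or re-pick) an implementation and remember the shapes.
            self._algo = self.choose(*shapes)
            self._shapes = shapes
        return self._algo


def fake_choose(img_shape, kern_shape):
    # Stand-in for a heuristic: small images use "small", larger ones "fft".
    return "small" if img_shape[-1] < 64 else "fft"


once = AlgoSelector("once", fake_choose)
on_change = AlgoSelector("on_shape_change", fake_choose)
for img in [(1, 3, 32, 32), (1, 3, 128, 128)]:
    print(once.get(img, (8, 3, 3, 3)), on_change.get(img, (8, 3, 3, 3)))
```

With ``once`` the first choice is reused even after the input shape changes. With ``on_shape_change`` the selection runs again, which costs more on each new shape but keeps the choice suited to the current shapes.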

--- a/doc/library/sandbox/gpuarray/dnn.txt
+++ b/doc/library/sandbox/gpuarray/dnn.txt
@@ -43,17 +43,17 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
 .. note::
-   CuDNN v3 has now been released. CuDNN v2 remains supported but CuDNN v3 is
+   cuDNN v3 has now been released. cuDNN v2 remains supported but cuDNN v3 is
   faster and offers many more options. We recommend that everybody update to
   v3.
 .. note::
-   Starting in CuDNN v3, multiple convolution implementations are offered and
+   Starting in cuDNN v3, multiple convolution implementations are offered and
   it is possible to use heuristics to automatically choose a convolution
   implementation well suited to the parameters of the convolution.
-   The Theano flag ``dnn.conv.algo_fwd`` allows to specify the CuDNN
+   The Theano flag ``dnn.conv.algo_fwd`` allows to specify the cuDNN
   convolution implementation that Theano should use for forward convolutions.
   Possible values include :
@@ -64,19 +64,19 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
   * ``fft`` : use the Fast Fourrier Transform implementation of convolution
     (very high memory usage)
   * ``guess_once`` : the first time a convolution is executed, the
-     implementation to use is chosen according to CuDNN's heuristics and reused
+     implementation to use is chosen according to cuDNN's heuristics and reused
     for every subsequent execution of the convolution.
   * ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
     implementation selected every time the shapes of the inputs and kernels
     don't match the shapes from the last execution.
   * ``time_once`` : the first time a convolution is executed, every convolution
-     implementation offered by CuDNN is executed and timed. The fastest is
+     implementation offered by cuDNN is executed and timed. The fastest is
     reused for every subsequent execution of the convolution.
   * ``time_on_shape_change`` : like ``time_once`` but a new convolution
     implementation selected every time the shapes of the inputs and kernels
     don't match the shapes from the last execution.
-   The Theano flag ``dnn.conv.algo_bwd`` allows to specify the CuDNN
+   The Theano flag ``dnn.conv.algo_bwd`` allows to specify the cuDNN
   convolution implementation that Theano should use for gradient convolutions.
   Possible values include :
@@ -86,13 +86,13 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
   * ``fft`` : use the Fast Fourrier Transform implementation of convolution
     (very high memory usage)
   * ``guess_once`` : the first time a convolution is executed, the
-     implementation to use is chosen according to CuDNN's heuristics and reused
+     implementation to use is chosen according to cuDNN's heuristics and reused
     for every subsequent execution of the convolution.
   * ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
     implementation selected every time the shapes of the inputs and kernels
     don't match the shapes from the last execution.
   * ``time_once`` : the first time a convolution is executed, every convolution
-     implementation offered by CuDNN is executed and timed. The fastest is
+     implementation offered by cuDNN is executed and timed. The fastest is
     reused for every subsequent execution of the convolution.
   * ``time_on_shape_change`` : like ``time_once`` but a new convolution
     implementation selected every time the shapes of the inputs and kernels

--- a/theano/configdefaults.py
+++ b/theano/configdefaults.py
@@ -309,25 +309,25 @@ AddConfigVar('dnn.conv.algo_bwd',
             in_c_key=False)
 AddConfigVar('dnn.conv.algo_fwd',
-             "Default implementation to use for CuDNN forward convolution.",
+             "Default implementation to use for cuDNN forward convolution.",
             EnumStr(*SUPPORTED_DNN_CONV_ALGO_FWD),
             in_c_key=False)
 AddConfigVar('dnn.conv.algo_bwd_data',
-             "Default implementation to use for CuDNN backward convolution to "
+             "Default implementation to use for cuDNN backward convolution to "
             "get the gradients of the convolution with regard to the inputs.",
             EnumStr(*SUPPORTED_DNN_CONV_ALGO_BWD_DATA),
             in_c_key=False)
 AddConfigVar('dnn.conv.algo_bwd_filter',
-             "Default implementation to use for CuDNN backward convolution to "
+             "Default implementation to use for cuDNN backward convolution to "
             "get the gradients of the convolution with regard to the "
             "filters.",
             EnumStr(*SUPPORTED_DNN_CONV_ALGO_BWD_FILTER),
             in_c_key=False)
 AddConfigVar('dnn.conv.precision',
-             "Default data precision to use for the computation in CuDNN "
+             "Default data precision to use for the computation in cuDNN "
             "convolutions (defaults to the same dtype as the inputs of the "
             "convolutions).",
             EnumStr('as_input', 'float16', 'float32', 'float64'),
@@ -350,9 +350,9 @@ AddConfigVar('dnn.library_path',
             StrParam(default_dnn_path('lib' if sys.platform == 'darwin' else 'lib64')))
 AddConfigVar('dnn.enabled',
-             "'auto', use CuDNN if available, but silently fall back"
+             "'auto', use cuDNN if available, but silently fall back"
             " to not using it if not present."
-             " If True and CuDNN can not be used, raise an error."
+             " If True and cuDNN can not be used, raise an error."
             " If False, disable cudnn",
             StrParam("auto", "True", "False"),
             in_c_key=False)
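As a usage sketch for the flags touched in this file: Theano reads its configuration from the ``THEANO_FLAGS`` environment variable as comma-separated ``key=value`` pairs. The helper ``build_theano_flags`` below is a hypothetical convenience function, not part of Theano, and the chosen flag values are only examples taken from the documented options.

```python
import os


def build_theano_flags(flags):
    """Join a dict of settings into a THEANO_FLAGS string.

    Hypothetical helper; keys are sorted so the result is deterministic.
    """
    return ",".join("%s=%s" % (key, value)
                    for key, value in sorted(flags.items()))


flags = build_theano_flags({
    "dnn.enabled": "True",                     # raise an error if cuDNN can not be used
    "dnn.conv.algo_fwd": "time_once",          # time every forward implementation once
    "dnn.conv.algo_bwd_filter": "guess_once",  # use cuDNN's heuristics once
})

# The variable has to be set before `import theano` for it to take effect.
os.environ["THEANO_FLAGS"] = flags
print(flags)
```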

--- a/theano/sandbox/cuda/__init__.py
+++ b/theano/sandbox/cuda/__init__.py
@@ -270,14 +270,14 @@ from theano.sandbox.cuda.type import CudaNdarrayType
 def dnn_available():
    if config.dnn.enabled == "False":
        dnn_available.avail = False
-        dnn_available.msg = "disabled by dnn.enabled flag"
+        dnn_available.msg = "Disabled by dnn.enabled flag"
    if dnn_available.avail is None and not cuda_available:
        dnn_available.msg = "CUDA not available"
        dnn_available.avail = False
    elif dnn_available.avail is None:
        dev = active_device_number()
        if device_properties(dev)['major'] < 3:
-            dnn_available.msg = "Device not supported by cuDNN"
+            dnn_available.msg = "Device not supported"
            dnn_available.avail = False
        else:
            preambule = """
@@ -315,7 +315,7 @@ if ((err = cudnnCreate(&_handle)) != CUDNN_STATUS_SUCCESS) {
            dnn_available.avail = comp
            if not dnn_available.avail:
                dnn_available.msg = (
-                    "Theano can not compile with cuDNN. We got this error:\n" +
+                    "Can not compile with cuDNN. We got this error:\n" +
                    str(err))
            else:
                # If we can compile, check that we can import and run.
@@ -326,18 +326,17 @@ if ((err = cudnnCreate(&_handle)) != CUDNN_STATUS_SUCCESS) {
                                         " from one version, but we link with"
                                         " a different version %s" % str(v))
                    raise RuntimeError(dnn_available.msg)
-                if v == -1 or v[0] < 3007:
+                if v == -1 or v[0] < 4007:
-                    # 3007 is the final release of cudnn v3
+                    # 4007 is the final release of cudnn v4
                    dnn_available.avail = False
-                    dnn_available.msg = (
+                    dnn_available.msg = "Version is too old. Update to v5, was %d." % v[0]
-                        "You have an old release of CuDNN (or a release "
-                        "candidate) that isn't supported.  Please update to "
-                        "at least v3 final version.")
                    raise RuntimeError(dnn_available.msg)
+                else:
+                    dnn_available.avail = comp
    if config.dnn.enabled == "True":
        if not dnn_available.avail:
            raise RuntimeError(
-                "You enabled CuDNN, but we aren't able to use it: %s" %
+                "You enabled cuDNN, but we aren't able to use it: %s" %
                dnn_available.msg)
    return dnn_available.avail
@@ -582,14 +581,15 @@ def use(device,
                    if dnn_available():
                        (hdr_v, runtime_v) = dnn_version()
                        cudnn_version = runtime_v
-                        # 4100 should not print warning with cudnn 4 final.
+                        # 5100 should not print warning with cudnn 5 final.
-                        if cudnn_version > 4100:
+                        if cudnn_version > 5100:
-                            warn = ("Your CuDNN version is more recent then Theano."
+                            warn = ("Your cuDNN version is more recent than the one"
-                                    " If you see problems, try updating Theano or"
+                                    " Theano officially supports."
-                                    " downgrading CuDNN to version 4.")
+                                    " If you see any problems, try updating Theano or"
+                                    " downgrading cuDNN to version 5.")
                except Exception:
-                    pass
+                    cudnn_version = dnn_available.msg
-                print("Using gpu device %d: %s (CNMeM is %s, CuDNN %s)" % (
+                print("Using gpu device %d: %s (CNMeM is %s, cuDNN %s)" % (
                    active_device_number(),
                    active_device_name(),
                    cnmem_enabled,
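The version thresholds introduced in this file (4007 as the minimum, 5100 as the point above which a warning is printed) can be summarised in a standalone sketch. This is an illustration under assumptions, not the code from the diff: ``check_cudnn`` and its return convention are hypothetical, and the ``-1`` sentinel is handled separately here because an integer cannot be indexed like the ``(header, runtime)`` tuple.

```python
MIN_SUPPORTED = 4007   # final release of cuDNN v4, per the comment in the diff
NEWEST_TESTED = 5100   # versions above this trigger a warning in the diff


def check_cudnn(version):
    """Return (available, message, warning) for a reported cuDNN version.

    version: -1 when the version is unknown, otherwise a
    (header_version, runtime_version) tuple of integers such as (5005, 5005).
    Hypothetical helper for illustration only.
    """
    if version == -1:
        return False, "cuDNN version could not be determined.", None
    header_v, runtime_v = version
    if header_v < MIN_SUPPORTED:
        msg = "Version is too old. Update to v5, was %d." % header_v
        return False, msg, None
    warning = None
    if runtime_v > NEWEST_TESTED:
        warning = ("Your cuDNN version is more recent than the one"
                   " Theano officially supports.")
    return True, None, warning


for v in [-1, (3007, 3007), (4007, 4007), (5005, 5005), (6000, 6000)]:
    print(v, check_cudnn(v))
```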

--- a/theano/sandbox/cuda/dnn.py
+++ b/theano/sandbox/cuda/dnn.py
@@ -323,30 +323,30 @@ class GpuDnnConv(DnnBase, COp):
        if self.inplace:
            self.destroy_map = {0: [2]}
-        # In CuDNN version older than V3, the FFT implementation and the
+        # In cuDNN version older than V3, the FFT implementation and the
        # option to time the different implementations to get the fastest
        # are both unavailable.
        if version() < (3000, 3000):
            if self.algo == 'fft':
-                raise RuntimeError("CuDNN FFT convolution requires CuDNN v3")
+                raise RuntimeError("cuDNN FFT convolution requires cuDNN v3")
            elif self.algo in ['guess_once', 'guess_on_shape_change']:
-                raise RuntimeError("CuDNN selection of convolution "
+                raise RuntimeError("cuDNN selection of convolution "
                                   "implementation based on heuristics "
-                                   "requires CuDNN v3")
+                                   "requires cuDNN v3")
            elif self.algo in ['time_once', 'time_on_shape_change']:
-                raise RuntimeError("CuDNN convolution timing requires CuDNN "
+                raise RuntimeError("cuDNN convolution timing requires cuDNN "
                                   "v3")
-        # The fft_tiling implementation is only available from CuDNN V4 onward
+        # The fft_tiling implementation is only available from cuDNN V4 onward
        if version() < (4000, 4000):
            if self.algo == 'fft_tiling':
-                raise RuntimeError("CuDNN tiled-FFT convolution requires "
+                raise RuntimeError("cuDNN tiled-FFT convolution requires "
-                                   "CuDNN v4 or more recent")
+                                   "cuDNN v4 or more recent")
        if version() < (5000, 5000):
            if self.algo == 'winograd':
-                raise RuntimeError("CuDNN winograd convolution requires "
+                raise RuntimeError("cuDNN winograd convolution requires "
-                                   "CuDNN v5 or more recent")
+                                   "cuDNN v5 or more recent")
        assert self.algo in ['none', 'small', 'large', 'fft', 'fft_tiling',
                             'winograd', 'guess_once', 'guess_on_shape_change',
@@ -517,11 +517,11 @@ class GpuDnnConv3d(GpuDnnConv):
        if version() < (5000, 5000):
            if self.algo == 'fft_tiling':
-                raise RuntimeError("CuDNN 3d tiled-FFT convolution requires "
+                raise RuntimeError("cuDNN 3d tiled-FFT convolution requires "
-                                   "CuDNN v5 or more recent")
+                                   "cuDNN v5 or more recent")
            elif self.algo == 'winograd':
-                raise RuntimeError("CuDNN 3d winograd convolution requires "
+                raise RuntimeError("cuDNN 3d winograd convolution requires "
-                                   "CuDNN v5 or more recent")
+                                   "cuDNN v5 or more recent")
    def make_node(self, img, kern, output, desc, alpha=None, beta=None):
@@ -834,17 +834,17 @@ class GpuDnnConvGradI(DnnBase, COp):
        if self.inplace:
            self.destroy_map = {0: [2]}
-        # The small-workspace implementation is only available from CuDNN V4
+        # The small-workspace implementation is only available from cuDNN V4
        # onward.
        if version() < (4000, 4000):
            if self.algo == 'fft_tiling':
-                raise RuntimeError("CuDNN's tiled-FFT convolution requires "
+                raise RuntimeError("cuDNN's tiled-FFT convolution requires "
-                                   "CuDNN v4 or more recent")
+                                   "cuDNN v4 or more recent")
        if version() < (5000, 5000):
            if self.algo == 'winograd':
-                raise RuntimeError("CuDNN's winograd convolution requires "
+                raise RuntimeError("cuDNN's winograd convolution requires "
-                                   "CuDNN v5 or more recent")
+                                   "cuDNN v5 or more recent")
        assert self.algo in ['none', 'deterministic', 'fft', 'fft_tiling',
                             'winograd', 'guess_once', 'guess_on_shape_change',
@@ -997,11 +997,11 @@ class GpuDnnConv3dGradI(GpuDnnConvGradI):
        assert self.algo in good_algo
        if version() < (5000, 5000):
            if self.algo == 'fft_tiling':
-                raise RuntimeError("CuDNN 3d tiled-FFT convolution requires "
+                raise RuntimeError("cuDNN 3d tiled-FFT convolution requires "
-                                   "CuDNN v5 or more recent")
+                                   "cuDNN v5 or more recent")
            elif self.algo == 'winograd':
-                raise RuntimeError("CuDNN 3d winograd convolution requires "
+                raise RuntimeError("cuDNN 3d winograd convolution requires "
-                                   "CuDNN v5 or more recent")
+                                   "cuDNN v5 or more recent")
    def grad(self, inp, grads):
        kerns, top, output, desc, alpha, beta = inp
@@ -1079,7 +1079,7 @@ def dnn_conv(img, kerns, border_mode='valid', subsample=(1, 1),
        *deprecated*, use parameter algo instead.
    algo : {'none', 'small', 'large', 'fft', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change'}
        Convolution implementation to use. Some of its  values may require certain
-        versions of CuDNN to be installed. Default is the value of
+        versions of cuDNN to be installed. Default is the value of
        :attr:`config.dnn.conv.algo_fwd`.
    precision : {'as_input', 'float16', 'float32', 'float64'}
        Description of the dtype in which the computation of the convolution
@@ -1365,13 +1365,13 @@ class GpuDnnPoolDesc(GpuOp):
        self.pad = pad
        if (pad[0] != 0 or pad[1] != 0) and version() == -1:
-            raise RuntimeError("CuDNN pooling with padding requires CuDNN v2")
+            raise RuntimeError("cuDNN pooling with padding requires cuDNN v2")
        if self.get_ndim() == 3 and version() < (3000, 3000):
-            raise RuntimeError("CuDNN 3d pooling requires CuDNN v3")
+            raise RuntimeError("cuDNN 3d pooling requires cuDNN v3")
        if (mode == 'average_exc_pad' and max(pad) > 0 and
                version() < (4004, 4004)):
            raise RuntimeError(
-                "CuDNN pooling mode 'average_exc_pad' requires at least v4")
+                "cuDNN pooling mode 'average_exc_pad' requires at least v4")
    def get_ndim(self):
        return len(self.ws)
@@ -1383,7 +1383,7 @@ class GpuDnnPoolDesc(GpuOp):
    def make_node(self):
        if self.pad != (0, 0) and version() == -1:
-            raise RuntimeError("CuDNN pooling with padding requires CuDNN v2")
+            raise RuntimeError("cuDNN pooling with padding requires cuDNN v2")
        node = Apply(self, [],
                     [CDataType("cudnnPoolingDescriptor_t",
@@ -1983,7 +1983,7 @@ class GpuDnnSoftmaxBase(DnnBase):
        Always set this to 'bc01'.
    algo : {'fast', 'accurate', 'log'}
        Indicating whether, respectively, computations should be optimized for
-        speed, for accuracy, or if CuDNN should rather compute the log-softmax instead.
+        speed, for accuracy, or if cuDNN should rather compute the log-softmax instead.
    mode : {'instance', 'channel'}
        Indicating whether the softmax should be computed per image across 'c01'
        or per spatial location '01' per image across 'c'.
@@ -2004,7 +2004,7 @@ class GpuDnnSoftmaxBase(DnnBase):
        self.tensor_format = tensor_format
        if algo == 'log' and version() < (3000, 3000):
-            raise RuntimeError("CuDNN log-softmax requires CuDNN v3")
+            raise RuntimeError("cuDNN log-softmax requires cuDNN v3")
        assert(algo in ('fast', 'accurate', 'log'))
        self.algo = algo
@@ -2526,7 +2526,7 @@ if True:
    @register_opt('cudnn')
    @local_optimizer([GpuElemwise, LogSoftmax])
    def local_log_softmax_dnn(node):
-        # The log-softmax implementation is only available starting at CuDNN V3
+        # The log-softmax implementation is only available starting at cuDNN V3
        if not dnn_available() or version() < (3000, 3000):
            return
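The repeated ``version() < (N000, N000)`` guards in this file compare a ``(header, runtime)`` tuple against a minimum, relying on Python's element-wise tuple ordering. A table-driven sketch of the same idea for the 2D forward convolution follows. ``MIN_VERSION_FOR_ALGO`` and ``require_algo`` are hypothetical names. The minimums are the ones stated in the error messages above for ``GpuDnnConv``. The 3D ops use different minimums, for example v5 for ``fft_tiling``.

```python
# Minimum cuDNN version per algorithm for the 2D forward convolution,
# taken from the RuntimeError messages in GpuDnnConv.
MIN_VERSION_FOR_ALGO = {
    "fft": 3000,
    "guess_once": 3000,
    "guess_on_shape_change": 3000,
    "time_once": 3000,
    "time_on_shape_change": 3000,
    "fft_tiling": 4000,
    "winograd": 5000,
}


def require_algo(algo, version):
    """Raise RuntimeError if `version` is too old for `algo`.

    version: (header_version, runtime_version) tuple, e.g. (4007, 4007).
    Algorithms without an entry in the table have no minimum here.
    """
    needed = MIN_VERSION_FOR_ALGO.get(algo)
    # Tuples compare element by element, so (4007, 4007) < (5000, 5000).
    if needed is not None and version < (needed, needed):
        raise RuntimeError(
            "cuDNN %s convolution requires cuDNN v%d or more recent"
            % (algo, needed // 1000))


require_algo("small", (2000, 2000))      # no minimum recorded: passes
require_algo("winograd", (5005, 5005))   # new enough: passes
try:
    require_algo("winograd", (4007, 4007))
except RuntimeError as exc:
    print(exc)
```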

--- a/theano/sandbox/cuda/dnn_fwd.c
+++ b/theano/sandbox/cuda/dnn_fwd.c
@@ -78,7 +78,7 @@ APPLY_SPECIFIC(conv_fwd)(CudaNdarray *input, CudaNdarray *kerns,
        // Obtain a convolution algorithm appropriate for the input and kernel
        // shapes. Either by choosing one according to heuristics or by making
-        // CuDNN time every implementation and choose the best one.
+        // cuDNN time every implementation and choose the best one.
        if (CHOOSE_ALGO_TIME)
        {
          // Time the different implementations to choose the best one

--- a/theano/sandbox/cuda/dnn_gi.c
+++ b/theano/sandbox/cuda/dnn_gi.c
@@ -76,7 +76,7 @@ APPLY_SPECIFIC(conv_gi)(CudaNdarray *kerns, CudaNdarray *output,
      {
        // Obtain a convolution algorithm appropriate for the kernel and output
        // shapes. Either by choosing one according to heuristics or by making
-        // CuDNN time every implementation and choose the best one.
+        // cuDNN time every implementation and choose the best one.
        if (CHOOSE_ALGO_TIME)
        {
          // Time the different implementations to choose the best one

--- a/theano/sandbox/cuda/dnn_gw.c
+++ b/theano/sandbox/cuda/dnn_gw.c
@@ -76,7 +76,7 @@ APPLY_SPECIFIC(conv_gw)(CudaNdarray *input, CudaNdarray *output,
      {
        // Obtain a convolution algorithm appropriate for the input and output
        // shapes. Either by choosing one according to heuristics or by making
-        // CuDNN time every implementation and choose the best one.
+        // cuDNN time every implementation and choose the best one.
        if (CHOOSE_ALGO_TIME)
        {
          // Time the different implementations to choose the best one

--- a/theano/sandbox/cuda/tests/test_abstractconv.py
+++ b/theano/sandbox/cuda/tests/test_abstractconv.py
@@ -25,7 +25,7 @@ else:
 class TestDnnConv2d(test_abstract_conv.BaseTestConv2d):
    def setUp(self):
        super(TestDnnConv2d, self).setUp()
-        # provide_shape is not used by the CuDNN impementation
+        # provide_shape is not used by the cuDNN impementation
        self.provide_shape = [False]
        self.shared = gpu_shared

--- a/theano/sandbox/cuda/tests/test_conv_cuda_ndarray.py
+++ b/theano/sandbox/cuda/tests/test_conv_cuda_ndarray.py
@@ -520,7 +520,7 @@ def _test_full(cls, mode=None, version=[-1], extra_shapes=[],
 def test_full():
-    # If using CuDNN version before v3, only run the tests where the
+    # If using cuDNN version before v3, only run the tests where the
    # kernels are not larger than the input in any spatial dimension.
    if cuda.dnn.dnn_available() and cuda.dnn.version() < (3000, 3000):
        test_bigger_kernels = False
@@ -542,7 +542,7 @@ def test_dnn_full():
    if not cuda.dnn.dnn_available():
        raise SkipTest(cuda.dnn.dnn_available.msg)
-    # If using CuDNN version before v3, only run the tests where the
+    # If using cuDNN version before v3, only run the tests where the
    # kernels are not larger than the input in any spatial dimension.
    if cuda.dnn.version() < (3000, 3000):
        test_bigger_kernels = False

--- a/theano/sandbox/cuda/tests/test_dnn.py
+++ b/theano/sandbox/cuda/tests/test_dnn.py
@@ -413,7 +413,7 @@ def test_old_pool_interface():
 def test_pooling3d():
-    # CuDNN 3d pooling requires CuDNN v3. Don't test if the CuDNN version is
+    # cuDNN 3d pooling requires cuDNN v3. Don't test if the cuDNN version is
    # too old.
    if not cuda.dnn.dnn_available() or cuda.dnn.version() < (3000, 3000):
        raise SkipTest(cuda.dnn.dnn_available.msg)
@@ -641,8 +641,8 @@ class test_DnnSoftMax(test_nnet.test_SoftMax):
                    )]) == 0)
    def test_log_softmax(self):
-        # This is a test for an optimization that depends on CuDNN v3 or
+        # This is a test for an optimization that depends on cuDNN v3 or
-        # more recent. Don't test if the CuDNN version is too old.
+        # more recent. Don't test if the cuDNN version is too old.
        if cuda.dnn.version() < (3000, 3000):
            raise SkipTest("Log-softmax is only in cudnn v3+")
@@ -826,7 +826,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
    def test_conv3d(self):
        if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
-            raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
+            raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
        ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
        img = ftensor5('img')
        kerns = ftensor5('kerns')
@@ -914,7 +914,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
    def test_conv3d_gradw(self):
        if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
-            raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
+            raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
        ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
        img = ftensor5('img')
        kerns = ftensor5('kerns')
@@ -1004,7 +1004,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
    def test_conv3d_gradi(self):
        if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
-            raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
+            raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
        ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
        img = ftensor5('img')
        kerns = ftensor5('kerns')
@@ -1392,7 +1392,7 @@ def get_conv3d_test_cases():
        itt = chain(product(test_shapes, border_modes, conv_modes),
                    product(test_shapes_full, ['full'], conv_modes))
    else:
-        # CuDNN, before V3, did not support kernels larger than the inputs,
+        # cuDNN, before V3, did not support kernels larger than the inputs,
        # even if the original inputs were padded so they would be larger than
        # the kernels. If using a version older than V3 don't run the tests
        # with kernels larger than the unpadded inputs.
@@ -1404,7 +1404,7 @@ def get_conv3d_test_cases():
 def test_conv3d_fwd():
    if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
-        raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
+        raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
    def run_conv3d_fwd(inputs_shape, filters_shape, subsample,
                       border_mode, conv_mode):
@@ -1421,7 +1421,7 @@ def test_conv3d_fwd():
        filters = shared(filters_val)
        bias = shared(numpy.zeros(filters_shape[0]).astype('float32'))
-        # Compile a theano function for the CuDNN implementation
+        # Compile a theano function for the cuDNN implementation
        conv = dnn.dnn_conv3d(img=inputs, kerns=filters,
                              border_mode=border_mode, subsample=subsample,
                              conv_mode=conv_mode)
@@ -1476,7 +1476,7 @@ def test_conv3d_fwd():
 def test_conv3d_bwd():
    if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
-        raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
+        raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
    def run_conv3d_bwd(inputs_shape, filters_shape, subsample,
                       border_mode, conv_mode):
@@ -1488,7 +1488,7 @@ def test_conv3d_bwd():
        filters = shared(filters_val)
        bias = shared(numpy.zeros(filters_shape[0]).astype('float32'))
-        # Compile a theano function for the CuDNN implementation
+        # Compile a theano function for the cuDNN implementation
        conv = dnn.dnn_conv3d(img=inputs, kerns=filters,
                              border_mode=border_mode, subsample=subsample,
                              conv_mode=conv_mode)

--- a/theano/sandbox/cuda/tests/test_nnet.py
+++ b/theano/sandbox/cuda/tests/test_nnet.py
@@ -20,7 +20,7 @@ if theano.config.mode == 'FAST_COMPILE':
    mode_without_gpu = theano.compile.mode.get_mode('FAST_RUN')
 else:
    mode_with_gpu = theano.compile.mode.get_default_mode().including('gpu')
-    mode_without_gpu = theano.compile.mode.get_default_mode()
+    mode_without_gpu = theano.compile.mode.get_default_mode().excluding('gpu')
 def test_GpuCrossentropySoftmaxArgmax1HotWithBias():

--- a/theano/sandbox/gpuarray/__init__.py
+++ b/theano/sandbox/gpuarray/__init__.py
@@ -69,17 +69,17 @@ def init_dev(dev, name=None):
        warn = None
        cudnn_version = ""
        if dev.startswith('cuda'):
-            cudnn_version = " (CuDNN not available)"
+            cudnn_version = " (cuDNN not available)"
            try:
                cudnn_version = dnn.version()
-                # 4100 should not print warning with cudnn 4 final.
+                # 5100 should not print warning with cudnn 5 final.
-                if cudnn_version > 4100:
+                if cudnn_version > 5100:
-                    warn = ("Your CuDNN version is more recent than Theano."
+                    warn = ("Your cuDNN version is more recent than Theano."
                            " If you see problems, try updating Theano or"
-                            " downgrading CuDNN to version 4.")
+                            " downgrading cuDNN to version 5.")
-                cudnn_version = " (CuDNN version %s)" % cudnn_version
+                cudnn_version = " (cuDNN version %s)" % cudnn_version
            except Exception:
-                pass
+                cudnn_version = dnn.dnn_present.msg
        print("Mapped name %s to device %s: %s%s" % (
            name, dev, context.devname, cudnn_version),
              file=sys.stderr)
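
Reviewer note: the version check in the `init_dev` hunk above can be sketched as a standalone function. This is only an illustration, not the code in `theano/sandbox/gpuarray/__init__.py`. The helper name `describe_cudnn`, its `max_supported` parameter, and the use of `None` for "not available" are assumptions made here. The patch itself falls back to `dnn.dnn_present.msg` when `dnn.version()` raises.

```python
def describe_cudnn(version, max_supported=5100):
    """Return (label, warning) for a cuDNN version number.

    Hypothetical helper: `version` is an integer such as 5005, or None
    when cuDNN could not be queried (an assumption of this sketch).
    """
    if version is None:
        return " (cuDNN not available)", None
    warn = None
    # Versions up to and including `max_supported` produce no warning.
    if version > max_supported:
        warn = ("Your cuDNN version is more recent than Theano."
                " If you see problems, try updating Theano or"
                " downgrading cuDNN to version 5.")
    return " (cuDNN version %s)" % version, warn
```

With this sketch, `describe_cudnn(5100)` yields no warning, while any larger value does, which mirrors the `cudnn_version > 5100` comparison in the patch.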

--- a/theano/sandbox/gpuarray/dnn.py
+++ b/theano/sandbox/gpuarray/dnn.py
--- a/theano/sandbox/gpuarray/tests/test_abstractconv.py
+++ b/theano/sandbox/gpuarray/tests/test_abstractconv.py
@@ -18,7 +18,7 @@ class TestDnnConv2d(test_abstract_conv.BaseTestConv2d):
    def setUp(self):
        super(TestDnnConv2d, self).setUp()
        self.shared = gpuarray_shared_constructor
-        # provide_shape is not used by the CuDNN impementation
+        # provide_shape is not used by the cuDNN impementation
        self.provide_shape = [False]
    def tcase(self, i, f, s, b, flip, provide_shape):

--- a/theano/sandbox/gpuarray/tests/test_dnn.py
+++ b/theano/sandbox/gpuarray/tests/test_dnn.py
@@ -171,7 +171,7 @@ def test_pooling():
        raise SkipTest(dnn.dnn_available.msg)
    # 'average_exc_pad' is disabled for versions < 4004
-    if dnn.version() < 4004:
+    if dnn.version(raises=False) < 4004:
        modes = ('max', 'average_inc_pad')
    else:
        modes = ('max', 'average_inc_pad', 'average_exc_pad')
@@ -464,7 +464,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
                                        [conv_modes[0]])),
                          testcase_func_name=utt.custom_name_func)
    def test_conv(self, algo, border_mode, conv_mode):
-        if algo == 'winograd' and dnn.version() < 5000:
+        if algo == 'winograd' and dnn.version(raises=False) < 5000:
            raise SkipTest(dnn.dnn_available.msg)
        self._test_conv(T.ftensor4('img'),
@@ -597,7 +597,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
        )
        # 'average_exc_pad' is disabled for versions < 4004
-        if dnn.version() < 4004:
+        if dnn.version(raises=False) < 4004:
            modes = ['max', 'average_inc_pad']
        else:
            modes = ['max', 'average_inc_pad', 'average_exc_pad']
@@ -732,6 +732,8 @@ def test_dnn_conv_alpha_output_merge():
 def test_dnn_conv_grad():
+    if not dnn.dnn_available(test_ctx_name):
+        raise SkipTest(dnn.dnn_available.msg)
    b = 1
    c = 4
    f = 3
@@ -777,6 +779,10 @@ class test_SoftMax(test_nnet.test_SoftMax):
    gpu_grad_op = dnn.GpuDnnSoftmaxGrad
    mode = mode_with_gpu
+    def setUp(self):
+        if not dnn.dnn_available(test_ctx_name):
+            raise SkipTest(dnn.dnn_available.msg)
    def test_softmax_shape_0(self):
        raise SkipTest("Cudnn doesn't support 0 shapes")
@@ -887,9 +893,9 @@ class test_SoftMax(test_nnet.test_SoftMax):
                    ]) == 0)
    def test_log_softmax(self):
-        # This is a test for an optimization that depends on CuDNN v3 or
+        # This is a test for an optimization that depends on cuDNN v3 or
-        # more recent. Don't test if the CuDNN version is too old.
+        # more recent. Don't test if the cuDNN version is too old.
-        if dnn.version() < 3000:
+        if dnn.version(raises=False) < 3000:
            raise SkipTest("Log-softmax is only in cudnn v3+")
        x = T.ftensor4()
@@ -928,9 +934,9 @@ class test_SoftMax(test_nnet.test_SoftMax):
        # Test that the op LogSoftmax is correctly replaced by the op
        # DnnSoftmax with the 'log' mode.
-        # This is a test for an optimization that depends on CuDNN v3 or
+        # This is a test for an optimization that depends on cuDNN v3 or
-        # more recent. Don't test if the CuDNN version is too old.
+        # more recent. Don't test if the cuDNN version is too old.
-        if dnn.version() < 3000:
+        if dnn.version(raises=False) < 3000:
            raise SkipTest("Log-softmax is only in cudnn v3+")
        # Compile a reference function, on the CPU, to be used to validate the
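
Reviewer note: several hunks in this file switch to `dnn.version(raises=False)` before comparing against a minimum version. A minimal sketch of that gating pattern follows. It assumes that an unavailable library is reported as a small sentinel such as `-1`; the diff does not show what `raises=False` returns, so treat that as an assumption. The helper name `pooling_modes` is hypothetical.

```python
def pooling_modes(cudnn_version):
    """Pick the pooling modes to test for a given cuDNN version.

    Hypothetical helper: `cudnn_version` is an integer such as 4004.
    A sentinel like -1 (assumed here for "cuDNN unavailable") simply
    falls into the "too old" branch instead of raising.
    """
    modes = ['max', 'average_inc_pad']
    # 'average_exc_pad' is disabled for versions < 4004.
    if cudnn_version >= 4004:
        modes.append('average_exc_pad')
    return modes
```

Under that assumption the comparison stays well defined on machines without cuDNN, so the test can go on to its own skip logic instead of failing inside the version call.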

--- a/theano/tensor/nnet/__init__.py
+++ b/theano/tensor/nnet/__init__.py
@@ -106,7 +106,7 @@ def conv2d(input, filters, input_shape=None, filter_shape=None,
    Notes
    -----
-        If CuDNN is available, it will be used on the
+        If cuDNN is available, it will be used on the
        GPU. Otherwise, it is the *CorrMM* convolution that will be used
        "caffe style convolution".

--- a/theano/tensor/nnet/abstract_conv.py
+++ b/theano/tensor/nnet/abstract_conv.py
@@ -225,7 +225,7 @@ def conv2d_grad_wrt_inputs(output_grad,
    Notes
    -----
-    :note: If CuDNN is available, it will be used on the
+    :note: If cuDNN is available, it will be used on the
        GPU. Otherwise, it is the *CorrMM* convolution that will be used
        "caffe style convolution".
@@ -348,7 +348,7 @@ def conv2d_grad_wrt_weights(input,
    Notes
    -----
-    :note: If CuDNN is available, it will be used on the
+    :note: If cuDNN is available, it will be used on the
        GPU. Otherwise, it is the *CorrMM* convolution that will be used
        "caffe style convolution".

--- a/theano/tensor/signal/pool.py
+++ b/theano/tensor/signal/pool.py
@@ -78,8 +78,8 @@ def pool_2d(input, ds, ignore_border=None, st=None, padding=(0, 0),
            " default value changed to True (currently"
            " False). To have consistent behavior with all Theano"
            " version, explicitly add the parameter ignore_border=True."
-            " On the GPU, using ignore_border=True is needed to use CuDNN."
+            " On the GPU, using ignore_border=True is needed to use cuDNN."
-            " When using ignore_border=False and not using CuDNN, the only"
+            " When using ignore_border=False and not using cuDNN, the only"
            " GPU combination supported is when"
            " `ds == st and padding == (0, 0) and mode == 'max'`."
            " Otherwise, the convolution will be executed on CPU.",