Commit 783d83e7 authored by Frederic Bastien

Change CuDNN to cuDNN at all other places

Parent fc4b5c18
......@@ -10,7 +10,7 @@ Theano 0.7 (26th of March, 2015)
We recommend that everyone upgrade to this version.
Highlights:
-* Integration of CuDNN for 2D convolutions and pooling on supported GPUs
+* Integration of cuDNN for 2D convolutions and pooling on supported GPUs
* Too many optimizations and new features to count
* Various fixes and improvements to scan
* Better support for GPU on Windows
......
......@@ -10,7 +10,7 @@ We recommend that everybody update to this version.
Highlights:
- Python 2 and 3 support with the same code base
- Faster optimization
-- Integration of CuDNN for better GPU performance
+- Integration of cuDNN for better GPU performance
- Many Scan improvements (execution speed up, ...)
- optimizer=fast_compile moves computation to the GPU.
- Better convolution on CPU and GPU. (CorrMM, cudnn, 3d conv, more parameters)
......
......@@ -235,7 +235,7 @@ CPU and GPU memory usage.
Could speed up and lower memory usage:
-- :ref:`CuDNN <libdoc_cuda_dnn>` default CuDNN convolution uses less
+- :ref:`cuDNN <libdoc_cuda_dnn>` default cuDNN convolution uses less
memory than the Theano version, but some flags allow it to use more
memory. GPU only.
- Shortly available: multi-GPU.
......
......@@ -41,22 +41,22 @@ Theano will still work if the user did not introduce them manually.
The recently added Theano flag :attr:`dnn.enabled
<config.dnn.enabled>` allows changing the default behavior to force
it or disable it. Older Theano versions do not support this flag. To
-get an error when CuDNN cannot be used with them, use this flag:
+get an error when cuDNN cannot be used with them, use this flag:
``optimizer_including=cudnn``.
.. note::
-CuDNN v3 has now been released. CuDNN v2 remains supported but CuDNN v3 is
+cuDNN v3 has now been released. cuDNN v2 remains supported but cuDNN v3 is
faster and offers many more options. We recommend that everybody update to
v3.
.. note::
-Starting in CuDNN v3, multiple convolution implementations are offered and
+Starting in cuDNN v3, multiple convolution implementations are offered and
it is possible to use heuristics to automatically choose a convolution
implementation well suited to the parameters of the convolution.
-The Theano flag ``dnn.conv.algo_fwd`` allows specifying the CuDNN
+The Theano flag ``dnn.conv.algo_fwd`` allows specifying the cuDNN
convolution implementation that Theano should use for forward convolutions.
Possible values include:
......@@ -69,20 +69,20 @@ get an error when CuDNN can not be used with them, use this flag:
* ``fft_tiling`` : use the Fast Fourier Transform implementation of convolution
with tiling (high memory usage, but less than fft)
* ``guess_once`` : the first time a convolution is executed, the
-implementation to use is chosen according to CuDNN's heuristics and reused
+implementation to use is chosen according to cuDNN's heuristics and reused
for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation is selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution
-implementation offered by CuDNN is executed and timed. The fastest is
+implementation offered by cuDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation is selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
The Theano flags ``dnn.conv.algo_bwd_filter`` and
-``dnn.conv.algo_bwd_data`` allow specifying the CuDNN
+``dnn.conv.algo_bwd_data`` allow specifying the cuDNN
convolution implementation that Theano should use for gradient
convolutions. Possible values include:
......@@ -92,13 +92,13 @@ get an error when CuDNN can not be used with them, use this flag:
* ``fft`` : use the Fast Fourier Transform implementation of convolution
(very high memory usage)
* ``guess_once`` : the first time a convolution is executed, the
-implementation to use is chosen according to CuDNN's heuristics and reused
+implementation to use is chosen according to cuDNN's heuristics and reused
for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation is selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution
-implementation offered by CuDNN is executed and timed. The fastest is
+implementation offered by cuDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation is selected every time the shapes of the inputs and kernels
......
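These flags are set like any other Theano flag, for example through the ``THEANO_FLAGS`` environment variable. A minimal sketch follows; the flag names are the ones documented above, while the script itself is illustrative and assumes a CUDA-enabled Theano build where the cuDNN config variables are registered:

```python
import os

# Theano flags must be in the environment before Theano is first imported.
os.environ["THEANO_FLAGS"] = ",".join([
    "dnn.conv.algo_fwd=time_once",         # time every forward algo once
    "dnn.conv.algo_bwd_data=guess_once",   # use cuDNN heuristics for gradients
    "dnn.conv.algo_bwd_filter=guess_once",
])

import theano
# Config values are then visible on theano.config (attribute access assumed
# to be available once the CUDA backend has registered these variables).
print(theano.config.dnn.conv.algo_fwd)  # -> 'time_once'
```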
......@@ -43,17 +43,17 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
.. note::
-CuDNN v3 has now been released. CuDNN v2 remains supported but CuDNN v3 is
+cuDNN v3 has now been released. cuDNN v2 remains supported but cuDNN v3 is
faster and offers many more options. We recommend that everybody update to
v3.
.. note::
-Starting in CuDNN v3, multiple convolution implementations are offered and
+Starting in cuDNN v3, multiple convolution implementations are offered and
it is possible to use heuristics to automatically choose a convolution
implementation well suited to the parameters of the convolution.
-The Theano flag ``dnn.conv.algo_fwd`` allows specifying the CuDNN
+The Theano flag ``dnn.conv.algo_fwd`` allows specifying the cuDNN
convolution implementation that Theano should use for forward convolutions.
Possible values include:
......@@ -64,19 +64,19 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
* ``fft`` : use the Fast Fourier Transform implementation of convolution
(very high memory usage)
* ``guess_once`` : the first time a convolution is executed, the
-implementation to use is chosen according to CuDNN's heuristics and reused
+implementation to use is chosen according to cuDNN's heuristics and reused
for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation is selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution
-implementation offered by CuDNN is executed and timed. The fastest is
+implementation offered by cuDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation is selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
-The Theano flag ``dnn.conv.algo_bwd`` allows specifying the CuDNN
+The Theano flag ``dnn.conv.algo_bwd`` allows specifying the cuDNN
convolution implementation that Theano should use for gradient convolutions.
Possible values include:
......@@ -86,13 +86,13 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
* ``fft`` : use the Fast Fourier Transform implementation of convolution
(very high memory usage)
* ``guess_once`` : the first time a convolution is executed, the
-implementation to use is chosen according to CuDNN's heuristics and reused
+implementation to use is chosen according to cuDNN's heuristics and reused
for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation is selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution
-implementation offered by CuDNN is executed and timed. The fastest is
+implementation offered by cuDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation is selected every time the shapes of the inputs and kernels
......
......@@ -309,25 +309,25 @@ AddConfigVar('dnn.conv.algo_bwd',
in_c_key=False)
AddConfigVar('dnn.conv.algo_fwd',
"Default implementation to use for CuDNN forward convolution.",
"Default implementation to use for cuDNN forward convolution.",
EnumStr(*SUPPORTED_DNN_CONV_ALGO_FWD),
in_c_key=False)
AddConfigVar('dnn.conv.algo_bwd_data',
"Default implementation to use for CuDNN backward convolution to "
"Default implementation to use for cuDNN backward convolution to "
"get the gradients of the convolution with regard to the inputs.",
EnumStr(*SUPPORTED_DNN_CONV_ALGO_BWD_DATA),
in_c_key=False)
AddConfigVar('dnn.conv.algo_bwd_filter',
"Default implementation to use for CuDNN backward convolution to "
"Default implementation to use for cuDNN backward convolution to "
"get the gradients of the convolution with regard to the "
"filters.",
EnumStr(*SUPPORTED_DNN_CONV_ALGO_BWD_FILTER),
in_c_key=False)
AddConfigVar('dnn.conv.precision',
"Default data precision to use for the computation in CuDNN "
"Default data precision to use for the computation in cuDNN "
"convolutions (defaults to the same dtype as the inputs of the "
"convolutions).",
EnumStr('as_input', 'float16', 'float32', 'float64'),
......@@ -350,9 +350,9 @@ AddConfigVar('dnn.library_path',
StrParam(default_dnn_path('lib' if sys.platform == 'darwin' else 'lib64')))
AddConfigVar('dnn.enabled',
"'auto', use CuDNN if available, but silently fall back"
"'auto', use cuDNN if available, but silently fall back"
" to not using it if not present."
" If True and CuDNN can not be used, raise an error."
" If True and cuDNN can not be used, raise an error."
" If False, disable cudnn",
StrParam("auto", "True", "False"),
in_c_key=False)
......
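For reference, the availability and version checks that gate the tests below can also be called directly; a small sketch using the same helpers that appear throughout this diff (``dnn_available`` and ``version``):

```python
# Probe for cuDNN the same way the tests in this commit do.
import theano.sandbox.cuda as cuda

if cuda.dnn.dnn_available():
    # version() returns a pair such as (3007, 3007) for cuDNN v3,
    # which is why the tests compare against (3000, 3000).
    print("cuDNN version:", cuda.dnn.version())
else:
    # dnn_available.msg explains why cuDNN could not be used.
    print("cuDNN unavailable:", cuda.dnn.dnn_available.msg)
```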
......@@ -78,7 +78,7 @@ APPLY_SPECIFIC(conv_fwd)(CudaNdarray *input, CudaNdarray *kerns,
// Obtain a convolution algorithm appropriate for the input and kernel
// shapes. Either by choosing one according to heuristics or by making
-// CuDNN time every implementation and choose the best one.
+// cuDNN time every implementation and choose the best one.
if (CHOOSE_ALGO_TIME)
{
// Time the different implementations to choose the best one
......
......@@ -76,7 +76,7 @@ APPLY_SPECIFIC(conv_gi)(CudaNdarray *kerns, CudaNdarray *output,
{
// Obtain a convolution algorithm appropriate for the kernel and output
// shapes. Either by choosing one according to heuristics or by making
-// CuDNN time every implementation and choose the best one.
+// cuDNN time every implementation and choose the best one.
if (CHOOSE_ALGO_TIME)
{
// Time the different implementations to choose the best one
......
......@@ -76,7 +76,7 @@ APPLY_SPECIFIC(conv_gw)(CudaNdarray *input, CudaNdarray *output,
{
// Obtain a convolution algorithm appropriate for the input and output
// shapes. Either by choosing one according to heuristics or by making
-// CuDNN time every implementation and choose the best one.
+// cuDNN time every implementation and choose the best one.
if (CHOOSE_ALGO_TIME)
{
// Time the different implementations to choose the best one
......
......@@ -25,7 +25,7 @@ else:
class TestDnnConv2d(test_abstract_conv.BaseTestConv2d):
def setUp(self):
super(TestDnnConv2d, self).setUp()
-# provide_shape is not used by the CuDNN implementation
+# provide_shape is not used by the cuDNN implementation
self.provide_shape = [False]
self.shared = gpu_shared
......
......@@ -520,7 +520,7 @@ def _test_full(cls, mode=None, version=[-1], extra_shapes=[],
def test_full():
-# If using CuDNN version before v3, only run the tests where the
+# If using cuDNN version before v3, only run the tests where the
# kernels are not larger than the input in any spatial dimension.
if cuda.dnn.dnn_available() and cuda.dnn.version() < (3000, 3000):
test_bigger_kernels = False
......@@ -542,7 +542,7 @@ def test_dnn_full():
if not cuda.dnn.dnn_available():
raise SkipTest(cuda.dnn.dnn_available.msg)
-# If using CuDNN version before v3, only run the tests where the
+# If using cuDNN version before v3, only run the tests where the
# kernels are not larger than the input in any spatial dimension.
if cuda.dnn.version() < (3000, 3000):
test_bigger_kernels = False
......
......@@ -413,7 +413,7 @@ def test_old_pool_interface():
def test_pooling3d():
-# CuDNN 3d pooling requires CuDNN v3. Don't test if the CuDNN version is
+# cuDNN 3d pooling requires cuDNN v3. Don't test if the cuDNN version is
# too old.
if not cuda.dnn.dnn_available() or cuda.dnn.version() < (3000, 3000):
raise SkipTest(cuda.dnn.dnn_available.msg)
......@@ -641,8 +641,8 @@ class test_DnnSoftMax(test_nnet.test_SoftMax):
)]) == 0)
def test_log_softmax(self):
-# This is a test for an optimization that depends on CuDNN v3 or
-# more recent. Don't test if the CuDNN version is too old.
+# This is a test for an optimization that depends on cuDNN v3 or
+# more recent. Don't test if the cuDNN version is too old.
if cuda.dnn.version() < (3000, 3000):
raise SkipTest("Log-softmax is only in cudnn v3+")
......@@ -826,7 +826,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
def test_conv3d(self):
if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
img = ftensor5('img')
kerns = ftensor5('kerns')
......@@ -914,7 +914,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
def test_conv3d_gradw(self):
if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
img = ftensor5('img')
kerns = ftensor5('kerns')
......@@ -1004,7 +1004,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
def test_conv3d_gradi(self):
if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
img = ftensor5('img')
kerns = ftensor5('kerns')
......@@ -1392,7 +1392,7 @@ def get_conv3d_test_cases():
itt = chain(product(test_shapes, border_modes, conv_modes),
product(test_shapes_full, ['full'], conv_modes))
else:
-# CuDNN, before V3, did not support kernels larger than the inputs,
+# cuDNN, before V3, did not support kernels larger than the inputs,
# even if the original inputs were padded so they would be larger than
# the kernels. If using a version older than V3 don't run the tests
# with kernels larger than the unpadded inputs.
......@@ -1404,7 +1404,7 @@ def get_conv3d_test_cases():
def test_conv3d_fwd():
if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
def run_conv3d_fwd(inputs_shape, filters_shape, subsample,
border_mode, conv_mode):
......@@ -1421,7 +1421,7 @@ def test_conv3d_fwd():
filters = shared(filters_val)
bias = shared(numpy.zeros(filters_shape[0]).astype('float32'))
-# Compile a theano function for the CuDNN implementation
+# Compile a theano function for the cuDNN implementation
conv = dnn.dnn_conv3d(img=inputs, kerns=filters,
border_mode=border_mode, subsample=subsample,
conv_mode=conv_mode)
......@@ -1476,7 +1476,7 @@ def test_conv3d_fwd():
def test_conv3d_bwd():
if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
def run_conv3d_bwd(inputs_shape, filters_shape, subsample,
border_mode, conv_mode):
......@@ -1488,7 +1488,7 @@ def test_conv3d_bwd():
filters = shared(filters_val)
bias = shared(numpy.zeros(filters_shape[0]).astype('float32'))
-# Compile a theano function for the CuDNN implementation
+# Compile a theano function for the cuDNN implementation
conv = dnn.dnn_conv3d(img=inputs, kerns=filters,
border_mode=border_mode, subsample=subsample,
conv_mode=conv_mode)
......
......@@ -18,7 +18,7 @@ class TestDnnConv2d(test_abstract_conv.BaseTestConv2d):
def setUp(self):
super(TestDnnConv2d, self).setUp()
self.shared = gpuarray_shared_constructor
-# provide_shape is not used by the CuDNN implementation
+# provide_shape is not used by the cuDNN implementation
self.provide_shape = [False]
def tcase(self, i, f, s, b, flip, provide_shape):
......
......@@ -893,8 +893,8 @@ class test_SoftMax(test_nnet.test_SoftMax):
]) == 0)
def test_log_softmax(self):
-# This is a test for an optimization that depends on CuDNN v3 or
-# more recent. Don't test if the CuDNN version is too old.
+# This is a test for an optimization that depends on cuDNN v3 or
+# more recent. Don't test if the cuDNN version is too old.
if dnn.version(False) < 3000:
raise SkipTest("Log-softmax is only in cudnn v3+")
......@@ -934,8 +934,8 @@ class test_SoftMax(test_nnet.test_SoftMax):
# Test that the op LogSoftmax is correctly replaced by the op
# DnnSoftmax with the 'log' mode.
-# This is a test for an optimization that depends on CuDNN v3 or
-# more recent. Don't test if the CuDNN version is too old.
+# This is a test for an optimization that depends on cuDNN v3 or
+# more recent. Don't test if the cuDNN version is too old.
if dnn.version(False) < 3000:
raise SkipTest("Log-softmax is only in cudnn v3+")
......
......@@ -106,7 +106,7 @@ def conv2d(input, filters, input_shape=None, filter_shape=None,
Notes
-----
-If CuDNN is available, it will be used on the
+If cuDNN is available, it will be used on the
GPU. Otherwise, the *CorrMM* ("caffe style") convolution
will be used.
......
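As a usage reference for the note above, a minimal sketch of the ``conv2d`` interface (variable names and shapes are illustrative):

```python
import theano
import theano.tensor as T

# Input: (batch, channels, rows, cols); filters: (n_filters, channels, kr, kc).
x = T.tensor4('x')
w = T.tensor4('w')

# When cuDNN is available, the graph optimizer substitutes its implementation;
# otherwise the CorrMM ("caffe style") convolution is used.
y = T.nnet.conv2d(x, w, border_mode='valid')
f = theano.function([x, w], y)
```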
......@@ -225,7 +225,7 @@ def conv2d_grad_wrt_inputs(output_grad,
Notes
-----
-:note: If CuDNN is available, it will be used on the
+:note: If cuDNN is available, it will be used on the
GPU. Otherwise, the *CorrMM* ("caffe style") convolution
will be used.
......@@ -348,7 +348,7 @@ def conv2d_grad_wrt_weights(input,
Notes
-----
-:note: If CuDNN is available, it will be used on the
+:note: If cuDNN is available, it will be used on the
GPU. Otherwise, the *CorrMM* ("caffe style") convolution
will be used.
......
......@@ -78,8 +78,8 @@ def pool_2d(input, ds, ignore_border=None, st=None, padding=(0, 0),
" default value changed to True (currently"
" False). To have consistent behavior with all Theano"
" version, explicitly add the parameter ignore_border=True."
" On the GPU, using ignore_border=True is needed to use CuDNN."
" When using ignore_border=False and not using CuDNN, the only"
" On the GPU, using ignore_border=True is needed to use cuDNN."
" When using ignore_border=False and not using cuDNN, the only"
" GPU combination supported is when"
" `ds == st and padding == (0, 0) and mode == 'max'`."
" Otherwise, the convolution will be executed on CPU.",
......
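A short sketch of the recommendation in this warning, passing ``ignore_border=True`` explicitly (the ``theano.tensor.signal.pool`` module path is an assumption for this Theano version):

```python
import theano
import theano.tensor as T
from theano.tensor.signal.pool import pool_2d  # module path assumed

x = T.tensor4('x')
# ignore_border=True gives consistent behavior across Theano versions and
# is required for the cuDNN pooling implementation on the GPU.
y = pool_2d(x, ds=(2, 2), ignore_border=True, mode='max')
f = theano.function([x], y)
```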