Commit 03f42b36 authored by Frederic

doc cudnn stuff

Parent 413531a2
.. _libdoc_cuda_dnn:
================================
:mod:`sandbox.cuda.dnn` -- cuDNN
================================
.. moduleauthor:: LISA
Normally you should not call these Ops directly. However, the CPU
interface currently does not expose all the options supported by these
Ops, so you may need to call them manually.
`cuDNN <https://developer.nvidia.com/cuDNN>`_ is an NVIDIA library of
primitives for deep neural networks. It provides faster
implementations of some operations, such as convolution. cuDNN is
currently not bundled with CUDA 6.5; you must download and install it
yourself.
To install it, decompress the downloaded file and make the ``*.h`` and
``*.so`` files available to the compilation environment. On Linux, this
can be done by setting the environment variables LD_LIBRARY_PATH,
LIBRARY_PATH and CPATH to include the decompressed directory; they work
the same way as PATH. Alternatively, you can copy the ``*.h`` files to
/usr/include and the ``*.so`` files to /lib64.
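For example, on Linux the three variables can be set as below. The
install location used here is hypothetical; point ``CUDNN_ROOT`` at
wherever you decompressed the cuDNN archive.

```shell
# Hypothetical location of the decompressed cuDNN files; adjust as needed.
CUDNN_ROOT=$HOME/cudnn

export LD_LIBRARY_PATH=$CUDNN_ROOT:$LD_LIBRARY_PATH  # runtime loader search path
export LIBRARY_PATH=$CUDNN_ROOT:$LIBRARY_PATH        # link-time search path
export CPATH=$CUDNN_ROOT:$CPATH                      # header (*.h) search path
```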
Then you need to tell Theano to use it. For convolution, if cuDNN is
available, it will be used by default, but not for other
operations. Also, Theano will not raise an error when it cannot use
cuDNN; it silently falls back to a slower and more memory-hungry
version. To enable all cuDNN operations and get an error when cuDNN
cannot be used, set the Theano flag ``optimizer_including=cudnn``.
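The flag can be set per process through the ``THEANO_FLAGS``
environment variable (a ``[global]`` section in ``~/.theanorc`` works
as well), a minimal sketch:

```shell
# Enable all cuDNN operations; Theano will now error out instead of
# silently falling back when cuDNN cannot be used.
export THEANO_FLAGS=optimizer_including=cudnn
```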
Functions
=========
.. automodule:: theano.sandbox.cuda.dnn
:members: dnn_conv, dnn_pool
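``dnn_conv`` follows cuDNN's convolution semantics: ``conv_mode='conv'``
flips the kernel while ``'cross'`` computes a plain cross-correlation,
``border_mode`` selects 'valid' or 'full' output, and ``subsample``
strides the result. A pure-NumPy, single-channel sketch of those
semantics (``conv2d_ref`` is an illustrative name, not part of the
module):

```python
import numpy as np

def conv2d_ref(img, kern, border_mode='valid', subsample=(1, 1),
               conv_mode='conv'):
    """Single-channel 2D reference for dnn_conv's semantics."""
    if conv_mode == 'conv':
        # True convolution flips the kernel; 'cross' leaves it as-is.
        kern = kern[::-1, ::-1]
    if border_mode == 'full':
        # Zero-pad so every partial overlap contributes an output value.
        ph, pw = kern.shape[0] - 1, kern.shape[1] - 1
        img = np.pad(img, ((ph, ph), (pw, pw)), mode='constant')
    oh = img.shape[0] - kern.shape[0] + 1
    ow = img.shape[1] - kern.shape[1] + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(
                img[i:i + kern.shape[0], j:j + kern.shape[1]] * kern)
    # subsample strides the output, like (dx, dy) in GpuDnnConvDesc.
    return out[::subsample[0], ::subsample[1]]
```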
Ops
===
.. automodule:: theano.sandbox.cuda.dnn
:members: GpuDnnConvDesc, GpuDnnConv, GpuDnnConvGradW, GpuDnnConvGradI, GpuDnnPoolDesc, GpuDnnPool, GpuDnnPoolGrad, GpuDnnSoftmax
@@ -13,6 +13,7 @@
.. toctree::
    :maxdepth: 1

    op
    var
    type
    dnn
@@ -96,6 +96,13 @@ if ((err = cudnnCreate(&_handle)) != CUDNN_STATUS_SUCCESS) {
class GpuDnnConvDesc(GpuOp):
"""
The convolution description.
:param border_mode: 'valid' or 'full'
:param subsample: The subsample, tuple like (dx, dy)
:param conv_mode: 'conv' or 'cross'
"""
__props__ = ('border_mode', 'subsample', 'conv_mode')

def c_headers(self):
@@ -354,6 +361,14 @@ if (err%(name)s != CUDNN_STATUS_SUCCESS) {
class GpuDnnConv(GpuDnnConvBase):
"""
The forward convolution.
:param image:
:param kernel:
:param descr: the convolution descriptor
"""
conv_inputs = 'input', 'kerns'
conv_output = 'output'
conv_types = 'tensor4d', 'filter', 'tensor4d'
@@ -377,6 +392,15 @@ class GpuDnnConv(GpuDnnConvBase):
class GpuDnnConvGradW(GpuDnnConvBase):
"""
The convolution gradient with respect to the weights.
:param image:
:param kernel:
:param descr: the convolution descriptor
"""
conv_inputs = 'input', 'output',
conv_output = 'kerns'
conv_types = 'tensor4d', 'tensor4d', 'filter'
@@ -385,6 +409,15 @@ class GpuDnnConvGradW(GpuDnnConvBase):
class GpuDnnConvGradI(GpuDnnConvBase):
"""
The convolution gradient with respect to the inputs.
:param image:
:param kernel:
:param descr: the convolution descriptor
"""
conv_inputs = 'kerns', 'output',
conv_output = 'input'
conv_types = 'filter', 'tensor4d', 'tensor4d'
@@ -496,6 +529,12 @@ class GpuDnnPoolDesc(GpuOp):
class GpuDnnPool(DnnBase):
"""
Pooling.
:param img: the image 4d tensor.
:param desc: the pooling descriptor.
"""
__props__ = ()

def make_node(self, img, desc):
@@ -622,6 +661,14 @@ if (err%(name)s != CUDNN_STATUS_SUCCESS) {
class GpuDnnPoolGrad(DnnBase):
"""
The pooling gradient.
:param inp: the input of the pooling.
:param inp_grad: the gradient with respect to the output; same shape as out.
:param out: the output of the pooling in the forward.
:param desc: The pooling descriptor.
"""
__props__ = ()

def make_node(self, inp, inp_grad, out, desc):
@@ -784,13 +831,12 @@ class GpuDnnSoftmax(DnnBase):
""" """
Op for the cuDNN Softmax. Op for the cuDNN Softmax.
Parameters'' :param tensor_format: Whether the data format is 'bc01' or 'b01c'
-tensor_format: Whether the data format is 'bc01' or 'b01c' :param algo: 'fast' or 'accurate' indicating whether computations should be
-algo: 'fast' or 'accurate' indicating whether computations should be optimized for speed or accuracy respectively.
optimized for speed or accuracy respectively. :param mode: 'instance' or 'channel' indicating whether the softmax should
-mode: 'instance' or 'channel' indicating whether the softmax should be be computed per image across 'c01' or per spationali location '01' per
computed per image across 'c01' or per spationali location '01' per image image across 'c'.
across 'c'.
""" """
__props__ = ('tensor_format', 'mode', 'algo') __props__ = ('tensor_format', 'mode', 'algo')
...