testgroup / pytensor / Commits / 36eefee4

Commit 36eefee4 authored Nov 17, 2014 by f0k

    Updated documentation to match current convolution code

Parent: 29024484

Showing 1 changed file with 48 additions and 64 deletions (+48 −64)

doc/library/tensor/nnet/conv.txt
@@ -25,64 +25,59 @@
 .. note::

     As of October 21st, 2014, the default GPU image convolution
-    changed. Here is the algo:
-
-    - If we can use `cuDNN <https://developer.nvidia.com/cuDNN>`_, use it.
-    - If not, use gemm version (slower then cuDNN, uses more memory).
-
-    If the users do not want the extra memory usage of the gemm
-    version, they can enable the legacy code that is even slower, but
-    does not use extra memory. For this, use the Theano flag
-    ``optimizer_excluding=conv_gemm``.
-
-    There is no reason to use the legacy code or the gemm version if
-    cuDNN is available.
-
-    2 other options:
-
-    - There is also the fft version that is the fastest in some cases,
-      but uses even more memory. It does not support striding to remove
-      computation and has some shapes restriction.
-    - There is also the cuda_convnet convolution in Pylearn2. It uses a
-      different memory layout, has shapes restrictions, but does not use
-      extra memory and is faster then the legacy convolution.
-
-    If you want to verify the usage of cuDNN, you can use the Theano
-    flag ``optimizer_including=cudnn``. This will raise an error if we
-    can't use cuDNN.
+    changed: By default, if `cuDNN <https://developer.nvidia.com/cuDNN>`_
+    is available, we will use it, otherwise we will fall back to using the
+    gemm version (slower then cuDNN in most cases, uses more memory, but
+    faster than the legacy version we used before).
+
+    Both cuDNN and the gemm version can be disabled using the Theano flags
+    ``optimizer_excluding=conv_dnn`` and ``optimizer_excluding=conv_gemm``,
+    respectively. In this case, we will fall back to using the legacy
+    convolution code, which is slower, but does not require extra memory.
+
+    To verify that cuDNN is used, you can supply the Theano flag
+    ``optimizer_including=cudnn``. This will raise an error if cuDNN is
+    unavailable.
+
+    It is not advised to ever disable cuDNN, as this is usually the fastest
+    option. Disabling the gemm version is only useful if cuDNN is unavailable
+    and you run out of GPU memory.
+
+    There are two other implementations: An FFT-based convolution integrated
+    into Theano, and an implementation by Alex Krizhevsky available via
+    Pylearn2. See the documentation below on how to use them.

 TODO: Give examples on how to use these things! They are pretty complicated.

-- Convolution operators implemented:
+- Implemented operators for neural network 2D / image convolution:

-    - :func:`signal.conv2d <theano.tensor.signal.conv.conv2d>`. See note above.
     - :func:`nnet.conv2d <theano.tensor.nnet.conv.conv2d>`.
       This is the standard operator for convolutional neural networks working
-      with batches of multi-channel 2D images, available for CPU and GPU.
-      Most of the more efficient GPU implementations listed below can be used
-      as an automatic replacement for nnet.conv2d by enabling specific graph
-      optimizations. It flip the kernel.
+      with batches of multi-channel 2D images, available for CPU and GPU. It
+      computes a convolution, i.e., it flips the kernel.
+      Most of the more efficient GPU implementations listed below can be
+      inserted automatically as a replacement for nnet.conv2d via graph
+      optimizations. Some of these graph optimizations are enabled by default,
+      others can be enabled via Theano flags.
     - :func:`conv2d_fft <theano.sandbox.cuda.fftconv.conv2d_fft>` This
       is a GPU-only version of nnet.conv2d that uses an FFT transform
-      to perform the work. It flip the kernel as ``conv2d``.
+      to perform the work. It flips the kernel just like ``conv2d``.
       conv2d_fft should not be used directly as
       it does not provide a gradient. Instead, use nnet.conv2d and
       allow Theano's graph optimizer to replace it by the FFT version
       by setting
-      'THEANO_FLAGS=optimizer_including=conv_fft_valid:conv_fft_full'
-      in your environement. This is not enabled by default because it
+      'THEANO_FLAGS=optimizer_including=conv_fft'
+      in your environment. If enabled, it will take precedence over cuDNN
+      and the gemm version. It is not enabled by default because it
       has some restrictions on input and uses a lot more memory. Also
       note that it requires CUDA >= 5.0, scikits.cuda >= 0.5.0 and
       PyCUDA to run. To deactivate the FFT optimization on a specific
-      nnet.conv2d while the optimization flags are active, you can set
+      nnet.conv2d while the optimization flag is active, you can set
       its ``version`` parameter to ``'no_fft'``. To enable it for just
       one Theano function:

       .. code-block:: python

           mode = theano.compile.get_default_mode()
-          mode = mode.including('conv_fft_valid', 'conv_fft_full')
+          mode = mode.including('conv_fft')

           f = theano.function(..., mode=mode)
@@ -90,17 +85,18 @@ TODO: Give examples on how to use these things! They are pretty complicated.
       Wrapper for an open-source GPU-only implementation of conv2d by Alex
       Krizhevsky, very fast, but with several restrictions on input and kernel
-      shapes, and with a different memory layout for the input.
+      shapes, and with a different memory layout for the input. It does not
+      flip the kernel.
       This is in Pylearn2, where it is normally called from the `linear transform
       <http://deeplearning.net/software/pylearn2/library/linear.html>`_
       implementation, but it can also be used `directly from within Theano
       <http://benanne.github.io/2014/04/03/faster-convolutions-in-theano.html>`_
       as a manual replacement for nnet.conv2d.
-      It does not flip the kernel.
     - :func:`GpuCorrMM <theano.sandbox.cuda.blas.GpuCorrMM>`
       This is a GPU-only 2d correlation implementation taken from
       `caffe <https://github.com/BVLC/caffe/blob/master/src/caffe/layers/conv_layer.cu>`_
       and also used by Torch.
+      It does not flip the kernel.
       For each element in a batch, it first creates a
       `Toeplitz <http://en.wikipedia.org/wiki/Toeplitz_matrix>`_ matrix in a CUDA kernel.
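The hunk above describes GpuCorrMM's strategy: unroll the image into a Toeplitz-style patch matrix, then reduce correlation to one matrix product (GEMM). A minimal pure-Python sketch of that idea follows; the helper names `im2col` and `corr2d_gemm` are illustrative only, not Theano's API, and the real implementation is CUDA code operating on batches.

```python
def im2col(img, kh, kw):
    """Unroll every kh x kw patch of a 2D image (list of lists) into one row
    of a Toeplitz-like matrix, in output-pixel order."""
    h, w = len(img), len(img[0])
    return [
        [img[i + di][j + dj] for di in range(kh) for dj in range(kw)]
        for i in range(h - kh + 1)
        for j in range(w - kw + 1)
    ]

def corr2d_gemm(img, kernel):
    """Valid-mode 2D correlation (no kernel flip) as a matrix-vector product:
    each output pixel is the dot product of one patch row with the kernel."""
    kh, kw = len(kernel), len(kernel[0])
    flat_k = [kernel[di][dj] for di in range(kh) for dj in range(kw)]
    flat = [sum(a * b for a, b in zip(row, flat_k)) for row in im2col(img, kh, kw)]
    out_w = len(img[0]) - kw + 1  # width of the valid output
    return [flat[i:i + out_w] for i in range(0, len(flat), out_w)]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
print(corr2d_gemm(img, kernel))  # [[-4, -4], [-4, -4]]
```

The memory overhead the diff mentions comes from materializing the patch matrix, which duplicates each pixel once per overlapping patch.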
@@ -110,36 +106,24 @@ TODO: Give examples on how to use these things! They are pretty complicated.
       ``(no of channels * filter width * filter height, output width * output height)``.

       As it provides a gradient, you can use it as a replacement for nnet.conv2d.
-      Alternatively, you can use nnet.conv2d and allow Theano's graph optimizer
-      to replace it by the GEMM version by setting
-      ``THEANO_FLAGS=optimizer_including=conv_gemm`` in your environment.
-      This is not enabled by default because it uses some extra memory, but the
-      overhead is small compared to conv2d_fft, there are no restrictions on
-      input or kernel shapes and it is sometimes still faster than cuda-convnet.
+      But usually, you will just use nnet.conv2d and allow Theano's graph
+      optimizer to automatically replace it by the GEMM version if cuDNN is not
+      available. To explicitly disable the graph optimizer, set
+      ``THEANO_FLAGS=optimizer_excluding=conv_gemm`` in your environment.
       If using it, please see the warning about a bug in CUDA 5.0 to 6.0 below.

-      To enable it for just one Theano function:
-
-      .. code-block:: python
-
-          mode = theano.compile.get_default_mode()
-          mode = mode.including('conv_gemm')
-
-          f = theano.function(..., mode=mode)
-
     - :func:`dnn_conv <theano.sandbox.cuda.dnn.dnn_conv>` GPU-only
-      convolution using NVIDIA's cuDNN library. To have conv2d()
-      automatically converted set
-      ``THEANO_FLAGS=optimizer_including=cudnn`` in your environment.
-      This will also replace other operations by their
-      cuDNN-accelerated equivalent. This requires that you have cuDNN
-      installed and available. It requires a GPU with compute
-      capability 3.0 or more.
-      Since it has a gradient defined it can also be used manually.
+      convolution using NVIDIA's cuDNN library. This requires that you have
+      cuDNN installed and available, which in turn requires CUDA 6.5 and a GPU
+      with compute capability 3.0 or more.
+
+      If cuDNN is available, by default, Theano will replace all nnet.conv2d
+      operations with dnn_conv. To explicitly disable it, set
+      ``THEANO_FLAGS=optimizer_excluding=conv_dnn`` in your environment.
+      As dnn_conv has a gradient defined, you can also use it manually.
+
+- Implemented operators for neural network 3D / video convolution:

     - :func:`conv3D <theano.tensor.nnet.Conv3D.conv3D>`
       3D Convolution applying multi-channel 3D filters to batches of
-      multi-channel 3D images. It do not flip the kernel.
+      multi-channel 3D images. It does not flip the kernel.
     - :func:`conv3d_fft <theano.sandbox.cuda.fftconv.conv3d_fft>`
       GPU-only version of conv3D using FFT transform. conv3d_fft should
       not be called directly as it does not provide a gradient.
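A recurring detail in this diff is whether an operator "flips the kernel": nnet.conv2d and conv2d_fft compute a true convolution (kernel mirrored in both dimensions before sliding), while GpuCorrMM, cuda-convnet, and conv3D compute cross-correlation (kernel slid as-is). A plain-Python sketch of the distinction, with illustrative function names that are not part of Theano's API:

```python
def corr2d(img, kernel):
    """Valid-mode 2D cross-correlation: slide the kernel as-is (no flip)."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(img[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(len(img[0]) - kw + 1)
        ]
        for i in range(len(img) - kh + 1)
    ]

def conv2d(img, kernel):
    """Valid-mode 2D convolution: cross-correlation with the kernel
    mirrored along both axes."""
    flipped = [row[::-1] for row in reversed(kernel)]
    return corr2d(img, flipped)

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
kernel = [[1, 2],
          [3, 4]]

print(corr2d(img, kernel))  # [[37, 47], [67, 77]]
print(conv2d(img, kernel))  # [[23, 33], [53, 63]]
```

The two results differ whenever the kernel is asymmetric, which is why the documentation flags the convention of each operator; for learned filters the distinction only amounts to a relabeling of the weights.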