testgroup / pytensor · Commits · c10aa585

Commit c10aa585, authored April 13, 2016 by Frédéric Bastien

Merge pull request #4366 from nouiz/cudnn_init

cudnn v5 cleanup init, small test fix.

Parents: 7444fdd6, 5f872965
Showing 23 changed files with 121 additions and 115 deletions (+121 / -115).
HISTORY.txt                                            +1   -1
NEWS.txt                                               +1   -1
doc/faq.txt                                            +1   -1
doc/index.txt                                          +1   -1
doc/library/sandbox/cuda/dnn.txt                       +9   -9
doc/library/sandbox/gpuarray/dnn.txt                   +8   -8
theano/configdefaults.py                               +6   -6
theano/sandbox/cuda/__init__.py                        +17  -17
theano/sandbox/cuda/dnn.py                             +31  -31
theano/sandbox/cuda/dnn_fwd.c                          +1   -1
theano/sandbox/cuda/dnn_gi.c                           +1   -1
theano/sandbox/cuda/dnn_gw.c                           +1   -1
theano/sandbox/cuda/tests/test_abstractconv.py         +1   -1
theano/sandbox/cuda/tests/test_conv_cuda_ndarray.py    +2   -2
theano/sandbox/cuda/tests/test_dnn.py                  +11  -11
theano/sandbox/cuda/tests/test_nnet.py                 +1   -1
theano/sandbox/gpuarray/__init__.py                    +7   -7
theano/sandbox/gpuarray/dnn.py                         +0   -0
theano/sandbox/gpuarray/tests/test_abstractconv.py     +1   -1
theano/sandbox/gpuarray/tests/test_dnn.py              +15  -9
theano/tensor/nnet/__init__.py                         +1   -1
theano/tensor/nnet/abstract_conv.py                    +2   -2
theano/tensor/signal/pool.py                           +2   -2
HISTORY.txt

@@ -10,7 +10,7 @@ Theano 0.7 (26th of March, 2015)
 We recommand to everyone to upgrade to this version.
 Highlights:
- * Integration of CuDNN for 2D convolutions and pooling on supported GPUs
+ * Integration of cuDNN for 2D convolutions and pooling on supported GPUs
  * Too many optimizations and new features to count
  * Various fixes and improvements to scan
  * Better support for GPU on Windows
NEWS.txt

@@ -10,7 +10,7 @@ We recommend that everybody update to this version.
 Highlights:
  - Python 2 and 3 support with the same code base
  - Faster optimization
- - Integration of CuDNN for better GPU performance
+ - Integration of cuDNN for better GPU performance
  - Many Scan improvements (execution speed up, ...)
  - optimizer=fast_compile moves computation to the GPU.
  - Better convolution on CPU and GPU. (CorrMM, cudnn, 3d conv, more parameter)
doc/faq.txt
浏览文件 @
c10aa585
...
...
@@ -235,7 +235,7 @@ CPU and GPU memory usage.
Could speed up and lower memory usage:
- :ref:`
CuDNN <libdoc_cuda_dnn>` default C
uDNN convolution use less
- :ref:`
cuDNN <libdoc_cuda_dnn>` default c
uDNN convolution use less
memory then Theano version. But some flags allow it to use more
memory. GPU only.
- Shortly avail, multi-GPU.
...
...
doc/index.txt
浏览文件 @
c10aa585
...
...
@@ -25,7 +25,7 @@ News
* Multi-GPU.
* We added support for :ref:`CuDNN v
4
<libdoc_cuda_dnn>`.
* We added support for :ref:`CuDNN v
5
<libdoc_cuda_dnn>`.
* We added support for :attr:`CNMeM <config.lib.cnmem>` to speed up
the GPU memory allocation.
...
...
doc/library/sandbox/cuda/dnn.txt
浏览文件 @
c10aa585
...
...
@@ -41,22 +41,22 @@ Theano will still work if the user did not introduce them manually.
The recently added Theano flag :attr:`dnn.enabled
<config.dnn.enabled>` allows to change the default behavior to force
it or disable it. Older Theano version do not support this flag. To
get an error when
C
uDNN can not be used with them, use this flag:
get an error when
c
uDNN can not be used with them, use this flag:
``optimizer_including=cudnn``.
.. note::
CuDNN v3 has now been released. CuDNN v2 remains supported but C
uDNN v3 is
cuDNN v3 has now been released. cuDNN v2 remains supported but c
uDNN v3 is
faster and offers many more options. We recommend that everybody update to
v3.
.. note::
Starting in
C
uDNN v3, multiple convolution implementations are offered and
Starting in
c
uDNN v3, multiple convolution implementations are offered and
it is possible to use heuristics to automatically choose a convolution
implementation well suited to the parameters of the convolution.
The Theano flag ``dnn.conv.algo_fwd`` allows to specify the
C
uDNN
The Theano flag ``dnn.conv.algo_fwd`` allows to specify the
c
uDNN
convolution implementation that Theano should use for forward convolutions.
Possible values include :
...
...
@@ -69,20 +69,20 @@ get an error when CuDNN can not be used with them, use this flag:
* ``fft_tiling`` : use the Fast Fourrier Transform implementation of convolution
with tiling (high memory usage, but less then fft)
* ``guess_once`` : the first time a convolution is executed, the
implementation to use is chosen according to
C
uDNN's heuristics and reused
implementation to use is chosen according to
c
uDNN's heuristics and reused
for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution
implementation offered by
C
uDNN is executed and timed. The fastest is
implementation offered by
c
uDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
The Theano flag ``dnn.conv.algo_bwd_filter`` and
``dnn.conv.algo_bwd_data`` allows to specify the
C
uDNN
``dnn.conv.algo_bwd_data`` allows to specify the
c
uDNN
convolution implementation that Theano should use for gradient
convolutions. Possible values include :
...
...
@@ -92,13 +92,13 @@ get an error when CuDNN can not be used with them, use this flag:
* ``fft`` : use the Fast Fourrier Transform implementation of convolution
(very high memory usage)
* ``guess_once`` : the first time a convolution is executed, the
implementation to use is chosen according to
C
uDNN's heuristics and reused
implementation to use is chosen according to
c
uDNN's heuristics and reused
for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution
implementation offered by
C
uDNN is executed and timed. The fastest is
implementation offered by
c
uDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels
...
...
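The documentation hunks above list the possible values of the ``dnn.conv.algo_*`` flags. As a hedged illustration only (not part of this commit), the same flags can be set from Python after importing Theano, assuming a Theano build where cuDNN is usable; the attribute names mirror the flags declared in theano/configdefaults.py further down in this diff.

    # Illustrative sketch, not from the diff: selecting cuDNN convolution
    # implementations through the config flags documented above.
    import theano

    theano.config.dnn.conv.algo_fwd = 'time_once'         # benchmark each algo once, reuse the fastest
    theano.config.dnn.conv.algo_bwd_data = 'guess_once'   # pick once via cuDNN heuristics
    theano.config.dnn.conv.algo_bwd_filter = 'guess_once'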
doc/library/sandbox/gpuarray/dnn.txt
浏览文件 @
c10aa585
...
...
@@ -43,17 +43,17 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
.. note::
CuDNN v3 has now been released. CuDNN v2 remains supported but C
uDNN v3 is
cuDNN v3 has now been released. cuDNN v2 remains supported but c
uDNN v3 is
faster and offers many more options. We recommend that everybody update to
v3.
.. note::
Starting in
C
uDNN v3, multiple convolution implementations are offered and
Starting in
c
uDNN v3, multiple convolution implementations are offered and
it is possible to use heuristics to automatically choose a convolution
implementation well suited to the parameters of the convolution.
The Theano flag ``dnn.conv.algo_fwd`` allows to specify the
C
uDNN
The Theano flag ``dnn.conv.algo_fwd`` allows to specify the
c
uDNN
convolution implementation that Theano should use for forward convolutions.
Possible values include :
...
...
@@ -64,19 +64,19 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
* ``fft`` : use the Fast Fourrier Transform implementation of convolution
(very high memory usage)
* ``guess_once`` : the first time a convolution is executed, the
implementation to use is chosen according to
C
uDNN's heuristics and reused
implementation to use is chosen according to
c
uDNN's heuristics and reused
for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution
implementation offered by
C
uDNN is executed and timed. The fastest is
implementation offered by
c
uDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
The Theano flag ``dnn.conv.algo_bwd`` allows to specify the
C
uDNN
The Theano flag ``dnn.conv.algo_bwd`` allows to specify the
c
uDNN
convolution implementation that Theano should use for gradient convolutions.
Possible values include :
...
...
@@ -86,13 +86,13 @@ To get an error if Theano can not use cuDNN, use this Theano flag:
* ``fft`` : use the Fast Fourrier Transform implementation of convolution
(very high memory usage)
* ``guess_once`` : the first time a convolution is executed, the
implementation to use is chosen according to
C
uDNN's heuristics and reused
implementation to use is chosen according to
c
uDNN's heuristics and reused
for every subsequent execution of the convolution.
* ``guess_on_shape_change`` : like ``guess_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels
don't match the shapes from the last execution.
* ``time_once`` : the first time a convolution is executed, every convolution
implementation offered by
C
uDNN is executed and timed. The fastest is
implementation offered by
c
uDNN is executed and timed. The fastest is
reused for every subsequent execution of the convolution.
* ``time_on_shape_change`` : like ``time_once`` but a new convolution
implementation selected every time the shapes of the inputs and kernels
...
...
theano/configdefaults.py

@@ -309,25 +309,25 @@ AddConfigVar('dnn.conv.algo_bwd',
              in_c_key=False)

 AddConfigVar('dnn.conv.algo_fwd',
-             "Default implementation to use for CuDNN forward convolution.",
+             "Default implementation to use for cuDNN forward convolution.",
              EnumStr(*SUPPORTED_DNN_CONV_ALGO_FWD),
              in_c_key=False)

 AddConfigVar('dnn.conv.algo_bwd_data',
-             "Default implementation to use for CuDNN backward convolution to "
+             "Default implementation to use for cuDNN backward convolution to "
              "get the gradients of the convolution with regard to the inputs.",
              EnumStr(*SUPPORTED_DNN_CONV_ALGO_BWD_DATA),
              in_c_key=False)

 AddConfigVar('dnn.conv.algo_bwd_filter',
-             "Default implementation to use for CuDNN backward convolution to "
+             "Default implementation to use for cuDNN backward convolution to "
              "get the gradients of the convolution with regard to the "
              "filters.",
              EnumStr(*SUPPORTED_DNN_CONV_ALGO_BWD_FILTER),
              in_c_key=False)

 AddConfigVar('dnn.conv.precision',
-             "Default data precision to use for the computation in CuDNN "
+             "Default data precision to use for the computation in cuDNN "
              "convolutions (defaults to the same dtype as the inputs of the "
              "convolutions).",
              EnumStr('as_input', 'float16', 'float32', 'float64'),

@@ -350,9 +350,9 @@ AddConfigVar('dnn.library_path',
              StrParam(default_dnn_path('lib' if sys.platform == 'darwin' else 'lib64')))

 AddConfigVar('dnn.enabled',
-             "'auto', use CuDNN if available, but silently fall back"
+             "'auto', use cuDNN if available, but silently fall back"
              " to not using it if not present."
-             " If True and CuDNN can not be used, raise an error."
+             " If True and cuDNN can not be used, raise an error."
              " If False, disable cudnn",
              StrParam("auto", "True", "False"),
              in_c_key=False)
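For context, here is a hedged sketch of how the tri-state ``dnn.enabled`` flag declared above is seen from user code. This is an illustration under the assumption of a standard Theano install, not code from the commit.

    import theano

    # theano.config.dnn.enabled is one of the three strings declared above:
    #   'auto'  -> use cuDNN when available, silently fall back otherwise
    #   'True'  -> raise an error if cuDNN can not be used
    #   'False' -> never use cuDNN
    print(theano.config.dnn.enabled)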
theano/sandbox/cuda/__init__.py
浏览文件 @
c10aa585
...
...
@@ -270,14 +270,14 @@ from theano.sandbox.cuda.type import CudaNdarrayType
def
dnn_available
():
if
config
.
dnn
.
enabled
==
"False"
:
dnn_available
.
avail
=
False
dnn_available
.
msg
=
"
d
isabled by dnn.enabled flag"
dnn_available
.
msg
=
"
D
isabled by dnn.enabled flag"
if
dnn_available
.
avail
is
None
and
not
cuda_available
:
dnn_available
.
msg
=
"CUDA not available"
dnn_available
.
avail
=
False
elif
dnn_available
.
avail
is
None
:
dev
=
active_device_number
()
if
device_properties
(
dev
)[
'major'
]
<
3
:
dnn_available
.
msg
=
"Device not supported
by cuDNN
"
dnn_available
.
msg
=
"Device not supported"
dnn_available
.
avail
=
False
else
:
preambule
=
"""
...
...
@@ -315,7 +315,7 @@ if ((err = cudnnCreate(&_handle)) != CUDNN_STATUS_SUCCESS) {
dnn_available
.
avail
=
comp
if
not
dnn_available
.
avail
:
dnn_available
.
msg
=
(
"
Theano c
an not compile with cuDNN. We got this error:
\n
"
+
"
C
an not compile with cuDNN. We got this error:
\n
"
+
str
(
err
))
else
:
# If we can compile, check that we can import and run.
...
...
@@ -326,18 +326,17 @@ if ((err = cudnnCreate(&_handle)) != CUDNN_STATUS_SUCCESS) {
" from one version, but we link with"
" a different version
%
s"
%
str
(
v
))
raise
RuntimeError
(
dnn_available
.
msg
)
if
v
==
-
1
or
v
[
0
]
<
3
007
:
#
3007 is the final release of cudnn v3
if
v
==
-
1
or
v
[
0
]
<
4
007
:
#
4007 is the final release of cudnn v4
dnn_available
.
avail
=
False
dnn_available
.
msg
=
(
"You have an old release of CuDNN (or a release "
"candidate) that isn't supported. Please update to "
"at least v3 final version."
)
dnn_available
.
msg
=
"Version is too old. Update to v5, was
%
d."
%
v
[
0
]
raise
RuntimeError
(
dnn_available
.
msg
)
else
:
dnn_available
.
avail
=
comp
if
config
.
dnn
.
enabled
==
"True"
:
if
not
dnn_available
.
avail
:
raise
RuntimeError
(
"You enabled
C
uDNN, but we aren't able to use it:
%
s"
%
"You enabled
c
uDNN, but we aren't able to use it:
%
s"
%
dnn_available
.
msg
)
return
dnn_available
.
avail
...
...
@@ -582,14 +581,15 @@ def use(device,
if
dnn_available
():
(
hdr_v
,
runtime_v
)
=
dnn_version
()
cudnn_version
=
runtime_v
# 4100 should not print warning with cudnn 4 final.
if
cudnn_version
>
4100
:
warn
=
(
"Your CuDNN version is more recent then Theano."
" If you see problems, try updating Theano or"
" downgrading CuDNN to version 4."
)
# 5100 should not print warning with cudnn 5 final.
if
cudnn_version
>
5100
:
warn
=
(
"Your cuDNN version is more recent than the one"
" Theano officially supports."
" If you see any problems, try updating Theano or"
" downgrading cuDNN to version 5."
)
except
Exception
:
pass
print
(
"Using gpu device
%
d:
%
s (CNMeM is
%
s,
C
uDNN
%
s)"
%
(
cudnn_version
=
dnn_available
.
msg
print
(
"Using gpu device
%
d:
%
s (CNMeM is
%
s,
c
uDNN
%
s)"
%
(
active_device_number
(),
active_device_name
(),
cnmem_enabled
,
...
...
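The hunks above tighten the messages set by dnn_available() and raise the minimum supported release to cuDNN v4 final (4007), warning past 5100. A minimal sketch of how this guard is typically consumed, assuming the legacy theano.sandbox.cuda backend is importable and a CUDA device is active (illustration only, not code from the commit):

    from theano.sandbox.cuda import dnn_available, dnn_version

    if dnn_available():
        # dnn_version() returns a (compile-time header version, runtime version) pair.
        hdr_v, runtime_v = dnn_version()
        print("cuDNN runtime version:", runtime_v)
    else:
        # dnn_available.msg carries the reason, e.g. "Disabled by dnn.enabled flag".
        print("cuDNN unusable:", dnn_available.msg)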
theano/sandbox/cuda/dnn.py
浏览文件 @
c10aa585
...
...
@@ -323,30 +323,30 @@ class GpuDnnConv(DnnBase, COp):
if
self
.
inplace
:
self
.
destroy_map
=
{
0
:
[
2
]}
# In
C
uDNN version older than V3, the FFT implementation and the
# In
c
uDNN version older than V3, the FFT implementation and the
# option to time the different implementations to get the fastest
# are both unavailable.
if
version
()
<
(
3000
,
3000
):
if
self
.
algo
==
'fft'
:
raise
RuntimeError
(
"
CuDNN FFT convolution requires C
uDNN v3"
)
raise
RuntimeError
(
"
cuDNN FFT convolution requires c
uDNN v3"
)
elif
self
.
algo
in
[
'guess_once'
,
'guess_on_shape_change'
]:
raise
RuntimeError
(
"
C
uDNN selection of convolution "
raise
RuntimeError
(
"
c
uDNN selection of convolution "
"implementation based on heuristics "
"requires
C
uDNN v3"
)
"requires
c
uDNN v3"
)
elif
self
.
algo
in
[
'time_once'
,
'time_on_shape_change'
]:
raise
RuntimeError
(
"
CuDNN convolution timing requires C
uDNN "
raise
RuntimeError
(
"
cuDNN convolution timing requires c
uDNN "
"v3"
)
# The fft_tiling implementation is only available from
C
uDNN V4 onward
# The fft_tiling implementation is only available from
c
uDNN V4 onward
if
version
()
<
(
4000
,
4000
):
if
self
.
algo
==
'fft_tiling'
:
raise
RuntimeError
(
"
C
uDNN tiled-FFT convolution requires "
"
C
uDNN v4 or more recent"
)
raise
RuntimeError
(
"
c
uDNN tiled-FFT convolution requires "
"
c
uDNN v4 or more recent"
)
if
version
()
<
(
5000
,
5000
):
if
self
.
algo
==
'winograd'
:
raise
RuntimeError
(
"
C
uDNN winograd convolution requires "
"
C
uDNN v5 or more recent"
)
raise
RuntimeError
(
"
c
uDNN winograd convolution requires "
"
c
uDNN v5 or more recent"
)
assert
self
.
algo
in
[
'none'
,
'small'
,
'large'
,
'fft'
,
'fft_tiling'
,
'winograd'
,
'guess_once'
,
'guess_on_shape_change'
,
...
...
@@ -517,11 +517,11 @@ class GpuDnnConv3d(GpuDnnConv):
if
version
()
<
(
5000
,
5000
):
if
self
.
algo
==
'fft_tiling'
:
raise
RuntimeError
(
"
C
uDNN 3d tiled-FFT convolution requires "
"
C
uDNN v5 or more recent"
)
raise
RuntimeError
(
"
c
uDNN 3d tiled-FFT convolution requires "
"
c
uDNN v5 or more recent"
)
elif
self
.
algo
==
'winograd'
:
raise
RuntimeError
(
"
C
uDNN 3d winograd convolution requires "
"
C
uDNN v5 or more recent"
)
raise
RuntimeError
(
"
c
uDNN 3d winograd convolution requires "
"
c
uDNN v5 or more recent"
)
def
make_node
(
self
,
img
,
kern
,
output
,
desc
,
alpha
=
None
,
beta
=
None
):
...
...
@@ -834,17 +834,17 @@ class GpuDnnConvGradI(DnnBase, COp):
if
self
.
inplace
:
self
.
destroy_map
=
{
0
:
[
2
]}
# The small-workspace implementation is only available from
C
uDNN V4
# The small-workspace implementation is only available from
c
uDNN V4
# onward.
if
version
()
<
(
4000
,
4000
):
if
self
.
algo
==
'fft_tiling'
:
raise
RuntimeError
(
"
C
uDNN's tiled-FFT convolution requires "
"
C
uDNN v4 or more recent"
)
raise
RuntimeError
(
"
c
uDNN's tiled-FFT convolution requires "
"
c
uDNN v4 or more recent"
)
if
version
()
<
(
5000
,
5000
):
if
self
.
algo
==
'winograd'
:
raise
RuntimeError
(
"
C
uDNN's winograd convolution requires "
"
C
uDNN v5 or more recent"
)
raise
RuntimeError
(
"
c
uDNN's winograd convolution requires "
"
c
uDNN v5 or more recent"
)
assert
self
.
algo
in
[
'none'
,
'deterministic'
,
'fft'
,
'fft_tiling'
,
'winograd'
,
'guess_once'
,
'guess_on_shape_change'
,
...
...
@@ -997,11 +997,11 @@ class GpuDnnConv3dGradI(GpuDnnConvGradI):
assert
self
.
algo
in
good_algo
if
version
()
<
(
5000
,
5000
):
if
self
.
algo
==
'fft_tiling'
:
raise
RuntimeError
(
"
C
uDNN 3d tiled-FFT convolution requires "
"
C
uDNN v5 or more recent"
)
raise
RuntimeError
(
"
c
uDNN 3d tiled-FFT convolution requires "
"
c
uDNN v5 or more recent"
)
elif
self
.
algo
==
'winograd'
:
raise
RuntimeError
(
"
C
uDNN 3d winograd convolution requires "
"
C
uDNN v5 or more recent"
)
raise
RuntimeError
(
"
c
uDNN 3d winograd convolution requires "
"
c
uDNN v5 or more recent"
)
def
grad
(
self
,
inp
,
grads
):
kerns
,
top
,
output
,
desc
,
alpha
,
beta
=
inp
...
...
@@ -1079,7 +1079,7 @@ def dnn_conv(img, kerns, border_mode='valid', subsample=(1, 1),
*deprecated*, use parameter algo instead.
algo : {'none', 'small', 'large', 'fft', 'guess_once', 'guess_on_shape_change', 'time_once', 'time_on_shape_change'}
Convolution implementation to use. Some of its values may require certain
versions of
C
uDNN to be installed. Default is the value of
versions of
c
uDNN to be installed. Default is the value of
:attr:`config.dnn.conv.algo_fwd`.
precision : {'as_input', 'float16', 'float32', 'float64'}
Description of the dtype in which the computation of the convolution
...
...
@@ -1365,13 +1365,13 @@ class GpuDnnPoolDesc(GpuOp):
self
.
pad
=
pad
if
(
pad
[
0
]
!=
0
or
pad
[
1
]
!=
0
)
and
version
()
==
-
1
:
raise
RuntimeError
(
"
CuDNN pooling with padding requires C
uDNN v2"
)
raise
RuntimeError
(
"
cuDNN pooling with padding requires c
uDNN v2"
)
if
self
.
get_ndim
()
==
3
and
version
()
<
(
3000
,
3000
):
raise
RuntimeError
(
"
CuDNN 3d pooling requires C
uDNN v3"
)
raise
RuntimeError
(
"
cuDNN 3d pooling requires c
uDNN v3"
)
if
(
mode
==
'average_exc_pad'
and
max
(
pad
)
>
0
and
version
()
<
(
4004
,
4004
)):
raise
RuntimeError
(
"
C
uDNN pooling mode 'average_exc_pad' requires at least v4"
)
"
c
uDNN pooling mode 'average_exc_pad' requires at least v4"
)
def
get_ndim
(
self
):
return
len
(
self
.
ws
)
...
...
@@ -1383,7 +1383,7 @@ class GpuDnnPoolDesc(GpuOp):
def
make_node
(
self
):
if
self
.
pad
!=
(
0
,
0
)
and
version
()
==
-
1
:
raise
RuntimeError
(
"
CuDNN pooling with padding requires C
uDNN v2"
)
raise
RuntimeError
(
"
cuDNN pooling with padding requires c
uDNN v2"
)
node
=
Apply
(
self
,
[],
[
CDataType
(
"cudnnPoolingDescriptor_t"
,
...
...
@@ -1983,7 +1983,7 @@ class GpuDnnSoftmaxBase(DnnBase):
Always set this to 'bc01'.
algo : {'fast', 'accurate', 'log'}
Indicating whether, respectively, computations should be optimized for
speed, for accuracy, or if
C
uDNN should rather compute the log-softmax instead.
speed, for accuracy, or if
c
uDNN should rather compute the log-softmax instead.
mode : {'instance', 'channel'}
Indicating whether the softmax should be computed per image across 'c01'
or per spatial location '01' per image across 'c'.
...
...
@@ -2004,7 +2004,7 @@ class GpuDnnSoftmaxBase(DnnBase):
self
.
tensor_format
=
tensor_format
if
algo
==
'log'
and
version
()
<
(
3000
,
3000
):
raise
RuntimeError
(
"
CuDNN log-softmax requires C
uDNN v3"
)
raise
RuntimeError
(
"
cuDNN log-softmax requires c
uDNN v3"
)
assert
(
algo
in
(
'fast'
,
'accurate'
,
'log'
))
self
.
algo
=
algo
...
...
@@ -2526,7 +2526,7 @@ if True:
@register_opt
(
'cudnn'
)
@local_optimizer
([
GpuElemwise
,
LogSoftmax
])
def
local_log_softmax_dnn
(
node
):
# The log-softmax implementation is only available starting at
C
uDNN V3
# The log-softmax implementation is only available starting at
c
uDNN V3
if
not
dnn_available
()
or
version
()
<
(
3000
,
3000
):
return
...
...
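Nearly all of the dnn.py hunks above apply the same version-gating pattern: an algorithm name is rejected when the detected cuDNN release predates it (v3 for fft/guess*/time*, v4 for fft_tiling, v5 for winograd). A standalone sketch of that pattern follows, written only as an illustration; check_algo is a hypothetical helper, not a function from the commit.

    def check_algo(algo, version_pair):
        # version_pair mimics theano.sandbox.cuda.dnn.version(), e.g. (4007, 4007).
        v = version_pair[0]
        if v < 3000 and algo in ('fft', 'guess_once', 'guess_on_shape_change',
                                 'time_once', 'time_on_shape_change'):
            raise RuntimeError("cuDNN algo %r requires cuDNN v3" % algo)
        if v < 4000 and algo == 'fft_tiling':
            raise RuntimeError("cuDNN tiled-FFT convolution requires cuDNN v4 or more recent")
        if v < 5000 and algo == 'winograd':
            raise RuntimeError("cuDNN winograd convolution requires cuDNN v5 or more recent")

    check_algo('winograd', (5000, 5000))  # passes only with a cuDNN v5 (or newer) build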
theano/sandbox/cuda/dnn_fwd.c

@@ -78,7 +78,7 @@ APPLY_SPECIFIC(conv_fwd)(CudaNdarray *input, CudaNdarray *kerns,
   // Obtain a convolution algorithm appropriate for the input and kernel
   // shapes. Either by choosing one according to heuristics or by making
-  // CuDNN time every implementation and choose the best one.
+  // cuDNN time every implementation and choose the best one.
   if (CHOOSE_ALGO_TIME)
   {
     // Time the different implementations to choose the best one
theano/sandbox/cuda/dnn_gi.c

@@ -76,7 +76,7 @@ APPLY_SPECIFIC(conv_gi)(CudaNdarray *kerns, CudaNdarray *output,
 {
   // Obtain a convolution algorithm appropriate for the kernel and output
   // shapes. Either by choosing one according to heuristics or by making
-  // CuDNN time every implementation and choose the best one.
+  // cuDNN time every implementation and choose the best one.
   if (CHOOSE_ALGO_TIME)
   {
     // Time the different implementations to choose the best one
theano/sandbox/cuda/dnn_gw.c

@@ -76,7 +76,7 @@ APPLY_SPECIFIC(conv_gw)(CudaNdarray *input, CudaNdarray *output,
 {
   // Obtain a convolution algorithm appropriate for the input and output
   // shapes. Either by choosing one according to heuristics or by making
-  // CuDNN time every implementation and choose the best one.
+  // cuDNN time every implementation and choose the best one.
   if (CHOOSE_ALGO_TIME)
   {
     // Time the different implementations to choose the best one
theano/sandbox/cuda/tests/test_abstractconv.py

@@ -25,7 +25,7 @@ else:
 class TestDnnConv2d(test_abstract_conv.BaseTestConv2d):
     def setUp(self):
         super(TestDnnConv2d, self).setUp()
-        # provide_shape is not used by the CuDNN impementation
+        # provide_shape is not used by the cuDNN impementation
         self.provide_shape = [False]
         self.shared = gpu_shared
theano/sandbox/cuda/tests/test_conv_cuda_ndarray.py

@@ -520,7 +520,7 @@ def _test_full(cls, mode=None, version=[-1], extra_shapes=[],
 def test_full():
-    # If using CuDNN version before v3, only run the tests where the
+    # If using cuDNN version before v3, only run the tests where the
     # kernels are not larger than the input in any spatial dimension.
     if cuda.dnn.dnn_available() and cuda.dnn.version() < (3000, 3000):
         test_bigger_kernels = False

@@ -542,7 +542,7 @@ def test_dnn_full():
     if not cuda.dnn.dnn_available():
         raise SkipTest(cuda.dnn.dnn_available.msg)
-    # If using CuDNN version before v3, only run the tests where the
+    # If using cuDNN version before v3, only run the tests where the
     # kernels are not larger than the input in any spatial dimension.
     if cuda.dnn.version() < (3000, 3000):
         test_bigger_kernels = False
theano/sandbox/cuda/tests/test_dnn.py

@@ -413,7 +413,7 @@ def test_old_pool_interface():
 def test_pooling3d():
-    # CuDNN 3d pooling requires CuDNN v3. Don't test if the CuDNN version is
+    # cuDNN 3d pooling requires cuDNN v3. Don't test if the cuDNN version is
     # too old.
     if not cuda.dnn.dnn_available() or cuda.dnn.version() < (3000, 3000):
         raise SkipTest(cuda.dnn.dnn_available.msg)

@@ -641,8 +641,8 @@ class test_DnnSoftMax(test_nnet.test_SoftMax):
             )]) == 0)

     def test_log_softmax(self):
-        # This is a test for an optimization that depends on CuDNN v3 or
-        # more recent. Don't test if the CuDNN version is too old.
+        # This is a test for an optimization that depends on cuDNN v3 or
+        # more recent. Don't test if the cuDNN version is too old.
         if cuda.dnn.version() < (3000, 3000):
             raise SkipTest("Log-softmax is only in cudnn v3+")

@@ -826,7 +826,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
     def test_conv3d(self):
         if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
-            raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
+            raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
         ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
         img = ftensor5('img')
         kerns = ftensor5('kerns')

@@ -914,7 +914,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
     def test_conv3d_gradw(self):
         if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
-            raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
+            raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
         ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
         img = ftensor5('img')
         kerns = ftensor5('kerns')

@@ -1004,7 +1004,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
     def test_conv3d_gradi(self):
         if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
-            raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
+            raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')
         ftensor5 = T.TensorType(dtype="float32", broadcastable=(False,) * 5)
         img = ftensor5('img')
         kerns = ftensor5('kerns')

@@ -1392,7 +1392,7 @@ def get_conv3d_test_cases():
         itt = chain(product(test_shapes, border_modes, conv_modes),
                     product(test_shapes_full, ['full'], conv_modes))
     else:
-        # CuDNN, before V3, did not support kernels larger than the inputs,
+        # cuDNN, before V3, did not support kernels larger than the inputs,
         # even if the original inputs were padded so they would be larger than
         # the kernels. If using a version older than V3 don't run the tests
        # with kernels larger than the unpadded inputs.

@@ -1404,7 +1404,7 @@
 def test_conv3d_fwd():
     if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
-        raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
+        raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')

     def run_conv3d_fwd(inputs_shape, filters_shape, subsample,
                        border_mode, conv_mode):

@@ -1421,7 +1421,7 @@ def test_conv3d_fwd():
         filters = shared(filters_val)
         bias = shared(numpy.zeros(filters_shape[0]).astype('float32'))
-        # Compile a theano function for the CuDNN implementation
+        # Compile a theano function for the cuDNN implementation
         conv = dnn.dnn_conv3d(img=inputs, kerns=filters,
                               border_mode=border_mode, subsample=subsample,
                               conv_mode=conv_mode)

@@ -1476,7 +1476,7 @@ def test_conv3d_fwd():
 def test_conv3d_bwd():
     if not (cuda.dnn.dnn_available() and dnn.version() >= (2000, 2000)):
-        raise SkipTest('"CuDNN 3D convolution requires CuDNN v2')
+        raise SkipTest('"cuDNN 3D convolution requires cuDNN v2')

     def run_conv3d_bwd(inputs_shape, filters_shape, subsample,
                        border_mode, conv_mode):

@@ -1488,7 +1488,7 @@ def test_conv3d_bwd():
         filters = shared(filters_val)
         bias = shared(numpy.zeros(filters_shape[0]).astype('float32'))
-        # Compile a theano function for the CuDNN implementation
+        # Compile a theano function for the cuDNN implementation
         conv = dnn.dnn_conv3d(img=inputs, kerns=filters,
                               border_mode=border_mode, subsample=subsample,
                               conv_mode=conv_mode)
theano/sandbox/cuda/tests/test_nnet.py

@@ -20,7 +20,7 @@ if theano.config.mode == 'FAST_COMPILE':
     mode_without_gpu = theano.compile.mode.get_mode('FAST_RUN')
 else:
     mode_with_gpu = theano.compile.mode.get_default_mode().including('gpu')
-    mode_without_gpu = theano.compile.mode.get_default_mode()
+    mode_without_gpu = theano.compile.mode.get_default_mode().excluding('gpu')


 def test_GpuCrossentropySoftmaxArgmax1HotWithBias():
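The one-line change above is presumably the "small test fix" from the commit message: the reference mode must explicitly exclude the gpu optimizations so the non-GPU function really stays on the CPU. A hedged restatement of that intent (my reading, not text from the commit):

    import theano

    # Before: the "without GPU" mode was just the default mode, which can still
    # pick up GPU optimizations when device=gpu. After: GPU optimizations are
    # excluded explicitly.
    mode_without_gpu = theano.compile.mode.get_default_mode().excluding('gpu')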
theano/sandbox/gpuarray/__init__.py

@@ -69,17 +69,17 @@ def init_dev(dev, name=None):
     warn = None
     cudnn_version = ""
     if dev.startswith('cuda'):
-        cudnn_version = " (CuDNN not available)"
+        cudnn_version = " (cuDNN not available)"
         try:
             cudnn_version = dnn.version()
-            # 4100 should not print warning with cudnn 4 final.
-            if cudnn_version > 4100:
-                warn = ("Your CuDNN version is more recent than Theano."
+            # 5100 should not print warning with cudnn 5 final.
+            if cudnn_version > 5100:
+                warn = ("Your cuDNN version is more recent than Theano."
                         " If you see problems, try updating Theano or"
-                        " downgrading CuDNN to version 4.")
-            cudnn_version = " (CuDNN version %s)" % cudnn_version
+                        " downgrading cuDNN to version 5.")
+            cudnn_version = " (cuDNN version %s)" % cudnn_version
         except Exception:
             pass
             cudnn_version = dnn.dnn_present.msg
     print("Mapped name %s to device %s: %s%s" % (name, dev, context.devname, cudnn_version),
           file=sys.stderr)
theano/sandbox/gpuarray/dnn.py

(Large diff collapsed in this view; contents not shown.)
theano/sandbox/gpuarray/tests/test_abstractconv.py

@@ -18,7 +18,7 @@ class TestDnnConv2d(test_abstract_conv.BaseTestConv2d):
     def setUp(self):
         super(TestDnnConv2d, self).setUp()
         self.shared = gpuarray_shared_constructor
-        # provide_shape is not used by the CuDNN impementation
+        # provide_shape is not used by the cuDNN impementation
         self.provide_shape = [False]

     def tcase(self, i, f, s, b, flip, provide_shape):
theano/sandbox/gpuarray/tests/test_dnn.py

@@ -171,7 +171,7 @@ def test_pooling():
         raise SkipTest(dnn.dnn_available.msg)

     # 'average_exc_pad' is disabled for versions < 4004
-    if dnn.version() < 4004:
+    if dnn.version(raises=False) < 4004:
         modes = ('max', 'average_inc_pad')
     else:
         modes = ('max', 'average_inc_pad', 'average_exc_pad')

@@ -464,7 +464,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
                       [conv_modes[0]])),
         testcase_func_name=utt.custom_name_func)
     def test_conv(self, algo, border_mode, conv_mode):
-        if algo == 'winograd' and dnn.version() < 5000:
+        if algo == 'winograd' and dnn.version(raises=False) < 5000:
             raise SkipTest(dnn.dnn_available.msg)

         self._test_conv(T.ftensor4('img'),

@@ -597,7 +597,7 @@ class TestDnnInferShapes(utt.InferShapeTester):
         )

         # 'average_exc_pad' is disabled for versions < 4004
-        if dnn.version() < 4004:
+        if dnn.version(raises=False) < 4004:
             modes = ['max', 'average_inc_pad']
         else:
             modes = ['max', 'average_inc_pad', 'average_exc_pad']

@@ -732,6 +732,8 @@ def test_dnn_conv_alpha_output_merge():
 def test_dnn_conv_grad():
+    if not dnn.dnn_available(test_ctx_name):
+        raise SkipTest(dnn.dnn_available.msg)
+
     b = 1
     c = 4
     f = 3

@@ -777,6 +779,10 @@ class test_SoftMax(test_nnet.test_SoftMax):
     gpu_grad_op = dnn.GpuDnnSoftmaxGrad
     mode = mode_with_gpu

+    def setUp(self):
+        if not dnn.dnn_available(test_ctx_name):
+            raise SkipTest(dnn.dnn_available.msg)
+
     def test_softmax_shape_0(self):
         raise SkipTest("Cudnn doesn't support 0 shapes")

@@ -887,9 +893,9 @@ class test_SoftMax(test_nnet.test_SoftMax):
             ]) == 0)

     def test_log_softmax(self):
-        # This is a test for an optimization that depends on CuDNN v3 or
-        # more recent. Don't test if the CuDNN version is too old.
-        if dnn.version() < 3000:
+        # This is a test for an optimization that depends on cuDNN v3 or
+        # more recent. Don't test if the cuDNN version is too old.
+        if dnn.version(raises=False) < 3000:
             raise SkipTest("Log-softmax is only in cudnn v3+")

         x = T.ftensor4()

@@ -928,9 +934,9 @@ class test_SoftMax(test_nnet.test_SoftMax):
         # Test that the op LogSoftmax is correctly replaced by the op
         # DnnSoftmax with the 'log' mode.

-        # This is a test for an optimization that depends on CuDNN v3 or
-        # more recent. Don't test if the CuDNN version is too old.
-        if dnn.version() < 3000:
+        # This is a test for an optimization that depends on cuDNN v3 or
+        # more recent. Don't test if the cuDNN version is too old.
+        if dnn.version(raises=False) < 3000:
             raise SkipTest("Log-softmax is only in cudnn v3+")

         # Compile a reference function, on the CPU, to be used to validate the
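Several of the test hunks above swap dnn.version() for dnn.version(raises=False) so that a version comparison no longer errors out when cuDNN is missing, and add dnn_available guards. A hedged sketch of that skip pattern follows; skip_unless_cudnn is a hypothetical helper, and ctx_name stands for whatever GPU context name the test module uses (such as test_ctx_name).

    from nose.plugins.skip import SkipTest
    from theano.sandbox.gpuarray import dnn

    def skip_unless_cudnn(ctx_name, minimum_version=3000):
        # raises=False makes version() return a comparable value instead of
        # raising when cuDNN is not present, so the comparison below is safe.
        if not dnn.dnn_available(ctx_name) or dnn.version(raises=False) < minimum_version:
            raise SkipTest(dnn.dnn_available.msg)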
theano/tensor/nnet/__init__.py

@@ -106,7 +106,7 @@ def conv2d(input, filters, input_shape=None, filter_shape=None,
     Notes
     -----
-    If CuDNN is available, it will be used on the
+    If cuDNN is available, it will be used on the
     GPU. Otherwise, it is the *CorrMM* convolution that will be used
     "caffe style convolution".
theano/tensor/nnet/abstract_conv.py

@@ -225,7 +225,7 @@ def conv2d_grad_wrt_inputs(output_grad,
     Notes
     -----
-    :note: If CuDNN is available, it will be used on the
+    :note: If cuDNN is available, it will be used on the
        GPU. Otherwise, it is the *CorrMM* convolution that will be used
        "caffe style convolution".

@@ -348,7 +348,7 @@ def conv2d_grad_wrt_weights(input,
     Notes
     -----
-    :note: If CuDNN is available, it will be used on the
+    :note: If cuDNN is available, it will be used on the
        GPU. Otherwise, it is the *CorrMM* convolution that will be used
        "caffe style convolution".
theano/tensor/signal/pool.py

@@ -78,8 +78,8 @@ def pool_2d(input, ds, ignore_border=None, st=None, padding=(0, 0),
             " default value changed to True (currently"
             " False). To have consistent behavior with all Theano"
             " version, explicitly add the parameter ignore_border=True."
-            " On the GPU, using ignore_border=True is needed to use CuDNN."
-            " When using ignore_border=False and not using CuDNN, the only"
+            " On the GPU, using ignore_border=True is needed to use cuDNN."
+            " When using ignore_border=False and not using cuDNN, the only"
             " GPU combination supported is when"
             " `ds == st and padding == (0, 0) and mode == 'max'`."
             " Otherwise, the convolution will be executed on CPU.",
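The pool.py warning above asks callers to pass ignore_border=True explicitly. A minimal usage sketch under that advice, illustrative only and assuming the pool_2d API shown in the hunk context:

    import theano.tensor as T
    from theano.tensor.signal.pool import pool_2d

    x = T.ftensor4('x')
    # Passing ignore_border=True silences the warning quoted above and is
    # required for the cuDNN pooling path on the GPU.
    y = pool_2d(x, ds=(2, 2), ignore_border=True, mode='max')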