Commit bd544674
Authored May 11, 2016 by slefrancois

Update doc with instructions for using new gpu backend

Parent: 319382b5

Showing 10 changed files with 36 additions and 72 deletions
.gitignore                              +2  -0
doc/extending/extending_theano.txt      +2  -2
doc/install.txt                         +8  -6
doc/install_ubuntu.txt                  +11 -11
doc/install_windows.txt                 +7  -5
doc/optimizations.txt                   +1  -1
doc/tutorial/aliasing.txt               +0  -47
doc/tutorial/using_gpu.txt              +0  -0
doc/tutorial/using_gpu_solution_1.py    +0  -0
theano/misc/check_blas.py               +5  -0
.gitignore
@@ -37,3 +37,4 @@ Theano.suo
 .ipynb_checkpoints
 .pydevproject
 .ropeproject
+core
\ No newline at end of file
doc/extending/extending_theano.txt
@@ -681,8 +681,8 @@ For instance, to verify the Rop method of the DoubleOp, you can use this:
 Testing GPU Ops
 ^^^^^^^^^^^^^^^
-Ops to be executed on the GPU should inherit from the
-``theano.sandbox.cuda.GpuOp`` and not ``theano.Op``. This allows
+When using the old GPU backend, Ops to be executed on the GPU should inherit
+from ``theano.sandbox.cuda.GpuOp`` and not ``theano.Op``. This allows
 Theano to distinguish them. Currently, we use this to test if the
 NVIDIA driver works correctly with our sum reduction code on the GPU.
doc/install.txt
@@ -375,7 +375,7 @@ If ``theano-nose`` is not found by your shell, you will need to add
 If you want GPU-related tests to run on a specific GPU device, and not
 the default one, you should use :attr:`~config.init_gpu_device`.
-For instance: ``THEANO_FLAGS=device=cpu,init_gpu_device=gpu1``.
+For instance: ``THEANO_FLAGS=device=cpu,init_gpu_device=cuda1``.
 See :ref:`libdoc_config` for more information on how to change these
 configuration options.
@@ -508,25 +508,25 @@ Any one of them is enough.
 :ref:`Ubuntu instructions <install_ubuntu_gpu>`.
+Next, install `libgpuarray <http://deeplearning.net/software/libgpuarray/installation.html>`_.
 Once that is done, the only thing left is to change the ``device`` option to name the GPU device in your
 computer, and set the default floating point computations to float32.
-For example: ``THEANO_FLAGS='cuda.root=/path/to/cuda/root,device=gpu,floatX=float32'``.
+For example: ``THEANO_FLAGS='cuda.root=/path/to/cuda/root,device=cuda,floatX=float32'``.
 You can also set these options in the .theanorc file's ``[global]`` section:
 .. code-block:: cfg
 [global]
-device = gpu
+device = cuda
 floatX = float32
 Note that:
-* If your computer has multiple GPUs and you use 'device=gpu', the driver
+* If your computer has multiple GPUs and you use 'device=cuda', the driver
 selects the one to use (usually gpu0).
 * You can use the program nvida-smi to change this policy.
-* You can choose one specific GPU by specifying 'device=gpuX', with X the
+* You can choose one specific GPU by specifying 'device=cudaX', with X the
 the corresponding GPU index (0, 1, 2, ...)
 * By default, when ``device`` indicates preference for GPU computations,
 Theano will fall back to the CPU if there is a problem with the GPU.
@@ -794,6 +794,8 @@ setup CUDA, but be aware of the following caveats:
 toggle your GPU on, which can be done with
 `gfxCardStatus <http://codykrieger.com/gfxCardStatus>`__.
+Next, install `libgpuarray <http://deeplearning.net/software/libgpuarray/installation.html>`_.
 Once your setup is complete, head to :ref:`using_gpu` to find how to verify
 everything is working properly.
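The hunks above rename the documented device values from ``gpu``/``gpuN`` to ``cuda``/``cudaN``. As an illustration only, the sketch below rewrites an existing ``THEANO_FLAGS``-style string the same way. ``migrate_flags`` is a hypothetical helper, not part of Theano, and it assumes that only the ``device`` and ``init_gpu_device`` keys carry device names.

```python
import re

# Hypothetical helper, not part of Theano: renames old-style device values
# (gpu, gpu0, gpu1, ...) to the names this commit documents (cuda, cuda0, ...).
# Assumption: only these two keys hold device names.
_DEVICE_KEYS = ("device", "init_gpu_device")
_OLD_NAME = re.compile(r"^gpu(\d*)$")


def migrate_flags(flags):
    """Return a THEANO_FLAGS-style string with gpuN devices renamed to cudaN."""
    migrated = []
    for item in flags.split(","):
        key, sep, value = item.partition("=")
        if sep and key.strip() in _DEVICE_KEYS:
            # Keep the optional device index, replace only the prefix.
            value = _OLD_NAME.sub(r"cuda\1", value.strip())
        migrated.append(key + sep + value)
    return ",".join(migrated)


if __name__ == "__main__":
    print(migrate_flags("device=cpu,init_gpu_device=gpu1"))
    print(migrate_flags("cuda.root=/path/to/cuda/root,device=gpu,floatX=float32"))
```

Other keys such as ``cuda.root`` and ``floatX`` pass through unchanged, since the helper matches on the key name and not on the substring ``gpu``.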
doc/install_ubuntu.txt
@@ -43,7 +43,7 @@ For Ubuntu 11.10 through 14.04:
 sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libopenblas-dev git
 sudo pip install Theano
 On 14.04, this will install Python 2 by default. If you want to use Python 3:
 .. code-block:: bash
@@ -104,30 +104,30 @@ For Ubuntu 11.04:
 The development version of Theano supports Python 3.3 and
 probably supports Python 3.2, but we do not test on it.
 Bleeding Edge Installs
 ----------------------
 If you would like, instead, to install the bleeding edge Theano (from github)
 such that you can edit and contribute to Theano, replace the `pip install Theano`
 command with:
 .. code-block:: bash
 git clone git://github.com/Theano/Theano.git
 cd Theano
 python setup.py develop --user
 cd ..
 VirtualEnv
 ----------
 If you would like to install Theano in a VirtualEnv, you will want to pass the
 `--system-site-packages` flag when creating the VirtualEnv so that it will pick up
 the system-provided `Numpy` and `SciPy`.
 .. code-block:: bash
 virtualenv --system-site-packages -p python2.7 theano-env
 source theano-env/bin/activate
 pip install Theano
@@ -208,7 +208,7 @@ Updating Bleeding Edge Installs
 Change to the Theano directory and run:
 .. code-block:: bash
 git pull
@@ -303,7 +303,7 @@ Test GPU configuration
 .. code-block:: bash
-THEANO_FLAGS=floatX=float32,device=gpu python /usr/lib/python2.*/site-packages/theano/misc/check_blas.py
+THEANO_FLAGS=floatX=float32,device=cuda python /usr/lib/python2.*/site-packages/theano/misc/check_blas.py
 .. note::
doc/install_windows.txt
@@ -423,16 +423,16 @@ Create a test file containing:
 print("NP time: %f[s], theano time: %f[s] (times should be close when run on CPU!)" %(
 np_end-np_start, t_end-t_start))
 print("Result difference: %f" % (np.abs(AB-tAB).max(), ))
 .. testoutput::
 :hide:
 :options: +ELLIPSIS
 NP time: ...[s], theano time: ...[s] (times should be close when run on CPU!)
 Result difference: ...
 .. code-block:: none
 NP time: 1.480863[s], theano time: 1.475381[s] (times should be close when run on CPU!)
 Result difference: 0.000000
@@ -445,6 +445,8 @@ routine for matrix multiplication)
 Configure Theano for GPU use
 ############################
+Install `libgpuarray <http://deeplearning.net/software/libgpuarray/installation.html>`_ if you have not already done so.
 Theano can be configured with a ``.theanorc`` text file (or
 ``.theanorc.txt``, whichever is easier for you to create under
 Windows). It should be placed in the directory pointed to by the
@@ -457,7 +459,7 @@ To use the GPU please write the following configuration file:
 .. code-block:: cfg
 [global]
-device = cuda
+device = cuda
 floatX = float32
 [nvcc]
@@ -498,7 +500,7 @@ within an MSYS shell if you installed Nose manually as described above.
 Compiling a faster BLAS
 ~~~~~~~~~~~~~~~~~~~~~~~
 If you installed Python through WinPython or EPD, Theano will automatically
 link with the MKL library, so you should not need to compile your own BLAS.
 .. note::
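The ``[global]`` section shown in the diff can also be produced programmatically. The following is a minimal sketch using only the Python 3 standard library; it renders the text to a string and does not write to a real ``.theanorc`` path. The ``[nvcc]`` section is left out because the diff shows only its header.

```python
import configparser
import io

# Settings taken from the [global] section shown in the diff.
config = configparser.ConfigParser()
config.optionxform = str  # preserve key case, so "floatX" is not lowercased
config["global"] = {"device": "cuda", "floatX": "float32"}

# Render to a string; saving it as .theanorc is left to the reader.
buffer = io.StringIO()
config.write(buffer)
theanorc_text = buffer.getvalue()
print(theanorc_text)

# Read the text back to confirm the section parses as intended.
check = configparser.ConfigParser()
check.optionxform = str
check.read_string(theanorc_text)
```

Setting ``optionxform`` matters here: by default ``configparser`` lowercases option names, which would turn ``floatX`` into ``floatx``.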
doc/optimizations.txt
@@ -32,6 +32,7 @@ Optimization FAST_RUN FAST_COMPILE
 ========================================================= ========= ============ =============
 :term:`merge` x x
 :term:`constant folding<constant folding>` x x
+:term:`GPU transfer` x x
 :term:`shape promotion<shape promotion>` x
 :term:`fill cut<fill cut>` x
 :term:`inc_subtensor srlz.<inc_subtensor serialization>` x
@@ -52,7 +53,6 @@ Optimization FAST_RUN FAST_COMPILE
 :term:`inplace_elemwise` x
 :term:`inplace_random` x
 :term:`elemwise fusion` x
-:term:`GPU transfer` x
 :term:`local_log_softmax` x x
 :term:`local_remove_all_assert`
 ========================================================= ========= ============ =============
doc/tutorial/aliasing.txt
@@ -261,52 +261,6 @@ combination of ``return_internal_type=True`` and ``borrow=True`` arguments to
 hints that give more flexibility to the compilation and optimization of the
 graph.
-For GPU graphs, this borrowing can have a major speed impact. See the following code:
-.. code-block:: python
-from theano import function, config, shared, sandbox, tensor, Out
-import numpy
-import time
-vlen = 10 * 30 * 768 # 10 x # cores x # threads per core
-iters = 1000
-rng = numpy.random.RandomState(22)
-x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
-f1 = function([], sandbox.cuda.basic_ops.gpu_from_host(tensor.exp(x)))
-f2 = function([],
-              Out(sandbox.cuda.basic_ops.gpu_from_host(tensor.exp(x)),
-                  borrow=True))
-t0 = time.time()
-for i in range(iters):
-    r = f1()
-t1 = time.time()
-no_borrow = t1 - t0
-t0 = time.time()
-for i in range(iters):
-    r = f2()
-t1 = time.time()
-print(
-    "Looping %s times took %s seconds without borrow "
-    "and %s seconds with borrow" % (iters, no_borrow, (t1 - t0))
-)
-if numpy.any([isinstance(x.op, tensor.Elemwise) and
-              ('Gpu' not in type(x.op).__name__)
-              for x in f1.maker.fgraph.toposort()]):
-    print('Used the cpu')
-else:
-    print('Used the gpu')
-Which produces this output:
-.. code-block:: none
-$ THEANO_FLAGS=device=gpu0,floatX=float32 python test1.py
-Using gpu device 0: GeForce GTX 275
-Looping 1000 times took 0.368273973465 seconds without borrow and 0.0240728855133 seconds with borrow.
-Used the gpu
 *Take home message:*
 When an input *x* to a function is not needed after the function
@@ -317,4 +271,3 @@ requirement. When a return value *y* is large (in terms of memory
 footprint), and you only need to read from it once, right away when
 it's returned, then consider marking it with an ``Out(y,
 borrow=True)``.
doc/tutorial/using_gpu.txt
Diff collapsed.
doc/tutorial/using_gpu_solution_1.py
Diff collapsed.
theano/misc/check_blas.py
100755 → 100644
@@ -86,15 +86,20 @@ def execute(execute=True, verbose=True, M=2000, N=2000, K=2000,
     t0 = 0
     t1 = -1
+    f()  # Ignore first function call to get representative time.
     if execute:
         sync = (hasattr(theano, "sandbox") and
                 hasattr(theano.sandbox, "cuda") and
                 theano.sandbox.cuda.cuda_available)
+        sync2 = (hasattr(theano, "gpuarray") and
+                 theano.gpuarray.pygpu_activated)
         t0 = time.time()
         for i in range(iters):
             f()
         if sync:
             theano.sandbox.cuda.synchronize()
+        if sync2:
+            c.get_value(borrow=True, return_internal_type=True).sync()
         t1 = time.time()
     return t1 - t0, impl
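The change to ``check_blas.py`` follows a common benchmarking pattern: discard one warm-up call, time a loop, and wait for asynchronous device work before reading the clock. The sketch below shows that pattern in isolation and is not the Theano code. ``time_calls`` and its ``sync`` callback are hypothetical names, and ``time.perf_counter`` is swapped in for the ``time.time`` used in the diff.

```python
import time


def time_calls(f, iters, sync=None):
    """Time `iters` calls to `f`, excluding one warm-up call.

    `sync` is a hypothetical zero-argument callback that blocks until any
    asynchronous work started by `f` has finished (for example a device
    synchronize). Pass None when `f` is fully synchronous.
    """
    f()  # warm-up call, excluded from the measurement
    t0 = time.perf_counter()  # monotonic clock; the diff itself uses time.time()
    for _ in range(iters):
        f()
    if sync is not None:
        sync()  # wait for pending work before reading the clock
    return time.perf_counter() - t0


if __name__ == "__main__":
    calls = []
    elapsed = time_calls(lambda: calls.append(1), iters=10)
    print("calls made:", len(calls), "elapsed seconds:", elapsed)
```

Without the final synchronization, a loop that only enqueues GPU work could return before the work completes, and the measured time would understate the real cost.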