Commit 919bcd62, authored December 02, 2011 by nouiz
Merge pull request #254 from delallea/win_gpu_doc
Updated instructions for GPU on Windows
Parents: b921a352, 6157ab63
Showing 4 changed files with 38 additions and 38 deletions.
doc/cifarSC2011/theano.txt (+1 -1)
doc/hpcs2011_tutorial/presentation.tex (+1 -1)
doc/install.txt (+35 -35)
doc/introduction.txt (+1 -1)
doc/cifarSC2011/theano.txt
@@ -32,7 +32,7 @@ Description
 * Transparent use of a GPU
 * float32 only for now (working on other data types)
-* Doesn't work on Windows for now
+* Still in experimental state on Windows
 * On GPU data-intensive calculations are typically between 6.5x and 44x faster. We've seen speedups up to 140x
 * Extensive unit-testing and self-verification
doc/hpcs2011_tutorial/presentation.tex
@@ -415,7 +415,7 @@ HPCS 2011, Montr\'eal
 \item Transparent use of a GPU
 \begin{itemize}
 \item float32 only for now (working on other data types)
-\item Doesn't work on Windows for now
+\item Still in experimental state on Windows
 \item On GPU data-intensive calculations are typically between 6.5x and 44x faster. We've seen speedups up to 140x
 \end{itemize}
 \end{itemize}
 }
doc/install.txt
@@ -19,8 +19,7 @@ instructions below for detailed installation steps):
 Linux, Mac OS X or Windows operating system
 We develop mainly on 64-bit Linux machines. 32-bit architectures are
-not well-tested. Note that GPU computing does not work yet under
-Windows.
+not well-tested.
 Python_ >= 2.4
 The development package (``python-dev`` or ``python-devel``
...
@@ -773,22 +772,16 @@ follows:
 Using the GPU
 ~~~~~~~~~~~~~
-At this point, GPU computing does not work under Windows. The current main
-issue is that the compilation commands used under Linux / MacOS to create
-and use a CUDA-based shared library with the nvcc compiler do not work with
-Windows DLLs. If anyone can figure out the proper compilation steps for
-Windows, please let us know on the `theano-dev`_ mailing list.
-Instructions below should at least get you started so you can reproduce the
-above-mentioned issue.
+Currently, GPU support under Windows is still in an experimental state.
+The following instructions should allow you to run GPU-enabled Theano code
+only within a Visual Studio command prompt.
 Those are instructions for the 32-bit version of Python (the one that comes
 with Python(x,y) is 32-bit).
 Blanks or non ASCII characters are not always supported in paths. Python supports
-them, but nvcc (at least version 3.1) does not.
-If your ``USERPROFILE`` directory (the one you get into when you run ``cmd``)
-contains such characters, you must edit your Theano configuration file to
-use a compilation directory located somewhere else:
+them, but nvcc may not (for instance version 3.1 does not).
+It is thus suggested to manually define a compilation directory without such
+characters, by adding to your Theano configuration file:
 .. code-block:: cfg
...
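Note on the hunk above: the new wording asks for a compilation directory whose path has no blanks or non-ASCII characters. A minimal sketch of such a check follows. The function name and the example paths are hypothetical and are not part of this commit or of Theano; the rule it applies (reject any whitespace or any code point above 127) is an assumption drawn from the sentence in the documentation, not from nvcc's actual behaviour.

```python
def has_unsafe_path_chars(path):
    """Return True if `path` contains whitespace or a non-ASCII character.

    Hypothetical helper: it only mirrors the rule stated in the
    documentation text (no blanks, no non-ASCII characters).
    """
    return any(ch.isspace() or ord(ch) > 127 for ch in path)


# Example paths are made up for illustration.
for candidate in (r"C:\Users\Jean Dupont\theano", r"C:\theano_compile"):
    status = "avoid" if has_unsafe_path_chars(candidate) else "ok"
    print(candidate, "->", status)
```

A path flagged as "avoid" would be a candidate for replacement by a manually defined compilation directory, as the updated text suggests.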
@@ -797,43 +790,50 @@ use a compilation directory located somewhere else:
 Then
-1) Install CUDA driver (32-bit on 32-bit Windows, idem for 64-bit).
+1) From the CUDA downloads page, download and install:
+a. The Developer Drivers (32-bit on 32-bit Windows, 64-bit on 64-bit
+Windows).
-2) Install CUDA toolkit 32-bit (even if you computer is 64-bit,
-must match the Python installation version).
+b. The CUDA Toolkit (32-bit even if your Windows is 64-bit, as it must
+match your Python installation).
-3) Install CUDA SDK 32-bit.
+c. The GPU Computing SDK (32-bit as well).
-4) Test some pre-compiled example of the sdk.
+2) Test some pre-compiled examples of the SDK.
-5) Download Visual Studio 2008 Express (free, VS2010 not supported by nvcc 3.1,
-VS2005 is not available for download but supported by nvcc, the non
-free version should work too).
+3) Install Visual C++ (you can find free versions by looking for "Visual
+Studio Express").
-6) Follow the instruction in the GettingStartedWindows.pdf file from the CUDA web
-site to compile CUDA code with VS2008. If that does not work, you will
-not be able to compile GPU code with Theano.
+4) Follow instructions from the "CUDA Getting Started Guide" available on
+the NVidia website to compile CUDA code with Visual C++. If that does not
+work, you will probably not be able to compile GPU code with Theano.
-7) Edit your Theano configuration file to add lines like the following
-(make sure these paths match your own specific installation):
+5) Edit your Theano configuration file to add lines like the following
+(make sure these paths match your own specific versions of Python and
+Visual Studio):
 .. code-block:: cfg
 [nvcc]
 flags=-LC:\Python26\libs
-compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin
+compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
-8) In Python do: ``import theano.sandbox.cuda``. This will compile the
+6) Start a Visual Studio command prompt (found under the "Visual Studio
+Tools" programs folder).
+In Python do: ``import theano.sandbox.cuda``. This will compile the
 first CUDA file, and no error should occur.
-9) Then run the Theano CUDA test files with nosetests from the
-``theano/sandbox/cuda/tests`` subdirectory. In the current version of
-Theano, this should fail with an error like:
-.. code-block:: bash
-NVCC: nvcc fatal: Don't know what to do with
-'C:/CUDA/compile/tmpmkgqx6/../cuda_ndarray/cuda_ndarray.pyd'
+7) To test a simple GPU computation, first set up Theano to use the GPU
+by editing your configuration file:
+.. code-block:: cfg
+[global]
+device = gpu
+floatX = float32
+Then run the ``theano/misc/check_blas.py`` test file.
 Generating the documentation
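Note on the doc/install.txt changes: steps 5 and 7 of the new instructions add an ``[nvcc]`` and a ``[global]`` section to the Theano configuration file. The sketch below assembles the same fragment with Python's standard ``configparser`` module, which may help when adapting the paths to a different Python or Visual Studio version. The function name is hypothetical, the default paths are simply the ones shown in the diff, and the script neither imports Theano nor writes to any real configuration file. ``configparser`` emits ``key = value`` with spaces, which is assumed here to be equivalent to the ``key=value`` form used in the documentation.

```python
import configparser
import io


def build_theano_gpu_config(python_libs=r"C:\Python26\libs",
                            compiler_bindir=(r"C:\Program Files (x86)"
                                             r"\Microsoft Visual Studio 10.0\VC\bin")):
    """Return the [global] and [nvcc] sections as configuration text.

    Hypothetical helper: the defaults are the example paths from the
    documentation and must be adjusted to the local installation.
    """
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str  # preserve the case of option names such as floatX
    cfg["global"] = {"device": "gpu", "floatX": "float32"}
    cfg["nvcc"] = {"flags": "-L" + python_libs,
                   "compiler_bindir": compiler_bindir}
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


print(build_theano_gpu_config())
```

The printed text can be compared against an existing configuration file before anything is edited by hand.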
doc/introduction.txt
@@ -184,7 +184,7 @@ Here is the state of that vision as of 24 October 2011 (after Theano release
 * Efforts have begun towards a generic GPU ndarray (GPU tensor) (started in the
 `compyte <https://github.com/inducer/compyte/wiki>`_ project)
 * Move GPU backend outside of Theano (on top of PyCUDA/PyOpenCL)
-* Will allow GPU to work on Windows and use an OpenCL backend on CPU.
+* Will provide better support for GPU on Windows and use an OpenCL backend on CPU.
 * Loops work, but not all related optimizations are currently done.
 * The cvm linker allows lazy evaluation. It works, but some work is still
 needed before enabling it by default.