Commit 919bcd62 authored by nouiz

Merge pull request #254 from delallea/win_gpu_doc

Updated instructions for GPU on Windows
@@ -32,7 +32,7 @@ Description
 * Transparent use of a GPU
 * float32 only for now (working on other data types)
-* Doesn't work on Windows for now
+* Still in experimental state on Windows
 * On GPU data-intensive calculations are typically between 6.5x and 44x faster. We've seen speedups up to 140x
 * Extensive unit-testing and self-verification
......
@@ -415,7 +415,7 @@ HPCS 2011, Montr\'eal
 \item Transparent use of a GPU
 \begin{itemize}
 \item float32 only for now (working on other data types)
-\item Doesn't work on Windows for now
+\item Still in experimental state on Windows
 \item On GPU data-intensive calculations are typically between 6.5x and 44x faster. We've seen speedups up to 140x
 \end{itemize} \end{itemize}
 }
......
@@ -19,8 +19,7 @@ instructions below for detailed installation steps):
 Linux, Mac OS X or Windows operating system
 We develop mainly on 64-bit Linux machines. 32-bit architectures are
-not well-tested. Note that GPU computing does not work yet under
-Windows.
+not well-tested.
 Python_ >= 2.4
 The development package (``python-dev`` or ``python-devel``
@@ -773,22 +772,16 @@ follows:
 Using the GPU
 ~~~~~~~~~~~~~
-At this point, GPU computing does not work under Windows. The current main
-issue is that the compilation commands used under Linux / MacOS to create
-and use a CUDA-based shared library with the nvcc compiler do not work with
-Windows DLLs. If anyone can figure out the proper compilation steps for
-Windows, please let us know on the `theano-dev`_ mailing list.
-Instructions below should at least get you started so you can reproduce the
-above-mentioned issue.
+Currently, GPU support under Windows is still in an experimental state.
+The following instructions should allow you to run GPU-enabled Theano code
+only within a Visual Studio command prompt.
 Those are instructions for the 32-bit version of Python (the one that comes
 with Python(x,y) is 32-bit).
 Blanks or non ASCII characters are not always supported in paths. Python supports
-them, but nvcc (at least version 3.1) does not.
-If your ``USERPROFILE`` directory (the one you get into when you run ``cmd``)
-contains such characters, you must edit your Theano configuration file to
-use a compilation directory located somewhere else:
+them, but nvcc may not (for instance version 3.1 does not).
+It is thus suggested to manually define a compilation directory without such
+characters, by adding to your Theano configuration file:
 .. code-block:: cfg
@@ -797,43 +790,50 @@ use a compilation directory located somewhere else:
 Then
-1) Install CUDA driver (32-bit on 32-bit Windows, idem for 64-bit).
-2) Install CUDA toolkit 32-bit (even if you computer is 64-bit,
-must match the Python installation version).
-3) Install CUDA SDK 32-bit.
-4) Test some pre-compiled example of the sdk.
-5) Download Visual Studio 2008 Express (free, VS2010 not supported by nvcc 3.1,
-VS2005 is not available for download but supported by nvcc, the non
-free version should work too).
-6) Follow the instruction in the GettingStartedWindows.pdf file from the CUDA web
-site to compile CUDA code with VS2008. If that does not work, you will
-not be able to compile GPU code with Theano.
-7) Edit your Theano configuration file to add lines like the following
-(make sure these paths match your own specific installation):
+1) From the CUDA downloads page, download and install:
+a. The Developer Drivers (32-bit on 32-bit Windows, 64-bit on 64-bit
+Windows).
+b. The CUDA Toolkit (32-bit even if your Windows is 64-bit, as it must
+match your Python installation).
+c. The GPU Computing SDK (32-bit as well).
+2) Test some pre-compiled examples of the SDK.
+3) Install Visual C++ (you can find free versions by looking for "Visual
+Studio Express").
+4) Follow instructions from the "CUDA Getting Started Guide" available on
+the NVidia website to compile CUDA code with Visual C++. If that does not
+work, you will probably not be able to compile GPU code with Theano.
+5) Edit your Theano configuration file to add lines like the following
+(make sure these paths match your own specific versions of Python and
+Visual Studio):
 .. code-block:: cfg
 [nvcc]
 flags=-LC:\Python26\libs
-compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin
+compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
-8) In Python do: ``import theano.sandbox.cuda``. This will compile the
+6) Start a Visual Studio command prompt (found under the "Visual Studio
+Tools" programs folder).
+In Python do: ``import theano.sandbox.cuda``. This will compile the
 first CUDA file, and no error should occur.
-9) Then run the Theano CUDA test files with nosetests from the
-``theano/sandbox/cuda/tests`` subdirectory. In the current version of
-Theano, this should fail with an error like:
-.. code-block:: bash
-NVCC: nvcc fatal: Don't know what to do with
-'C:/CUDA/compile/tmpmkgqx6/../cuda_ndarray/cuda_ndarray.pyd'
+7) To test a simple GPU computation, first set up Theano to use the GPU
+by editing your configuration file:
+.. code-block:: cfg
+[global]
+device = gpu
+floatX = float32
+Then run the ``theano/misc/check_blas.py`` test file.
 Generating the documentation
......
@@ -184,7 +184,7 @@ Here is the state of that vision as of 24 October 2011 (after Theano release
 * Efforts have begun towards a generic GPU ndarray (GPU tensor) (started in the
 `compyte <https://github.com/inducer/compyte/wiki>`_ project)
 * Move GPU backend outside of Theano (on top of PyCUDA/PyOpenCL)
-* Will allow GPU to work on Windows and use an OpenCL backend on CPU.
+* Will provide better support for GPU on Windows and use an OpenCL backend on CPU.
 * Loops work, but not all related optimizations are currently done.
 * The cvm linker allows lazy evaluation. It works, but some work is still
 needed before enabling it by default.
......
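The updated Windows instructions come down to two practical checks: the compilation directory should contain no blanks or non-ASCII characters (nvcc may reject them), and the configuration file needs ``[nvcc]`` and ``[global]`` sections. The sketch below illustrates both using only the Python 3 standard library. It is not part of Theano or of this commit: the helper names ``is_nvcc_safe_path`` and ``build_theanorc`` are hypothetical, and the paths in the usage note are the example values from the diff, which must be adapted to your own installation.

```python
import configparser
import io


def is_nvcc_safe_path(path):
    """Return True if `path` is ASCII-only and contains no whitespace.

    Hypothetical helper: it only mirrors the advice in the instructions
    that blanks and non-ASCII characters in paths may not work with nvcc.
    """
    return path.isascii() and not any(ch.isspace() for ch in path)


def build_theanorc(python_libs, compiler_bindir):
    """Return configuration text with the [nvcc] and [global] sections.

    Hypothetical helper: the option names and values are the ones shown
    in the instructions; the two arguments depend on the local machine.
    """
    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str  # keep option names such as floatX case-sensitive
    config["nvcc"] = {
        "flags": "-L" + python_libs,
        "compiler_bindir": compiler_bindir,
    }
    config["global"] = {"device": "gpu", "floatX": "float32"}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

For example, ``is_nvcc_safe_path(r"C:\Documents and Settings\user")`` returns ``False`` because of the blanks, which is the situation in which the instructions suggest defining a separate compilation directory. ``build_theanorc(r"C:\Python26\libs", r"C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin")`` returns text that can be reviewed before being saved as the Theano configuration file.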