@@ -479,12 +480,12 @@ Windows V1.5 (optional follow-up to V1 instructions)
/postinstall/pi.sh
/postinstall/pi.sh
It will ask for your MinGW installation directory (e.g.
It will ask for your MinGW installation directory (e.g.
``c:\pythonxy\mingw``).
``c:/pythonxy/mingw``).
e) Download `ActivePerl <http://www.activestate.com/activeperl>`_ and
e) Download `ActivePerl <http://www.activestate.com/activeperl/downloads>`_ and
install it.
install it (other Perl interpreters should also work).
f) Unpack GotoBLAS2 (e.g. using `7-zip <http://www.7-zip.org/>`_ or in
f) Unpack GotoBLAS2, either using `7-zip <http://www.7-zip.org/>`_ or in
MSYS with:
MSYS with:
.. code-block:: bash
.. code-block:: bash
...
@@ -500,47 +501,61 @@ Windows V1.5 (optional follow-up to V1 instructions)
quickbuild.win32 1>log.txt 2>err.txt
quickbuild.win32 1>log.txt 2>err.txt
Compilation should take a few minutes. Afterwards, you will probably
Compilation should take a few minutes. Afterwards, you will probably
find many error messages in err.txt, but also a libgoto2.dll
find many error messages in err.txt, but there should be an ``exports``
file in the exports folder. [NOTE: INSTRUCTIONS TO BE CONTINUED]
folder containing in particular ``libgoto2.dll``.
i) Copy libgoto2.dll from the exports folder to ``pythonxy\mingw\bin``
i) Copy ``libgoto2.dll`` from the ``exports`` folder to ``pythonxy\mingw\bin``
and ``pythonxy\mingw\lib``.
and ``pythonxy\mingw\lib``.
j) Modify your .theanorc (or .theanorc.txt) with "ldflags = -lgoto2".
j) Modify your .theanorc (or .theanorc.txt) with "ldflags = -lgoto2".
This setting can also be changed in Python for testing purposes:
This setting can also be changed in Python for testing purposes (in which
case it will remain only for the duration of your Python session):
.. code-block:: python
.. code-block:: python
theano.config.blas.ldflags = "-lgoto2"
theano.config.blas.ldflags = "-lgoto2"
- (Optional). To test the BLAS performance, you can run the script ``check_blas.py``.
k) To test the BLAS performance, you can run the script
For comparison I also downloaded and compiled the unoptimized standard
``theano/misc/check_blas.py``.
BLAS. The results were the following (Intel Core2 Duo 1.86 GHz):
Note that you may control the number of threads used by GotoBLAS2 with
the ``GOTO_NUM_THREADS`` environment variable (default behavior is to use
all available cores).
Here are some performance results on an Intel Core2 Duo 1.86 GHz,
compared to using NumPy's BLAS or the unoptimized standard BLAS
(compiled manually from its source code):
Standard BLAS: 166 sec (unoptimized, 1 thread)
* GotoBLAS2 (2 threads): 16s
NumPy: 48 sec (1 thread)
* NumPy (1 thread): 48s
Goto2: 16 sec (2 threads)
* Standard BLAS (unoptimized, 1 thread): 166s
Conclusions:
Conclusions:
a) The unoptimized standard BLAS is very slow. Don't use it.
* The unoptimized standard BLAS is very slow and should not be used.
b) The Windows binaries of NumPy were compiled with ATLAS and are surprisingly fast.
* The Windows binaries of NumPy were compiled with ATLAS and are surprisingly fast.
c) GotoBLAS is even faster, in particular if you have several kernels.
* GotoBLAS2 is even faster, in particular if you can use multiple cores.
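The thread-count note in step k) can be sketched in Python as follows. This is a minimal illustration, not part of the original instructions: the value ``"1"`` is a hypothetical choice, and it assumes that GotoBLAS2 reads ``GOTO_NUM_THREADS`` when the library is loaded, so the variable has to be set before Theano (or anything else linking against ``libgoto2.dll``) is imported.

```python
import os

# Hypothetical value for this example: restrict GotoBLAS2 to one thread.
num_threads = "1"

# Assumption: GotoBLAS2 reads this variable when the library is loaded,
# so set it before importing theano in the same Python session.
os.environ["GOTO_NUM_THREADS"] = num_threads

# Import theano, or launch theano/misc/check_blas.py, only after this point.
```

Timing ``check_blas.py`` with different values is one way to check whether the setting is actually taken into account on your machine.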
- (Optional) GPU on Windows. Not sure it works! Can you report success/errors on the `theano-users <http://groups.google.com/group/theano-users>`_ mailing list?
Windows: Using the GPU
----------------------
These are indications for the 32-bit version of Python; the one that comes with Python(x,y) is 32-bit.
Please note that these are tentative instructions (we have not yet been able to
get the GPU to work under Windows with Theano).
Please report your own successes / failures on the