- If you don't have Python yet, I would recommend the Python(x,y)
distribution. It is only one installation and contains the most
...
@@ -278,10 +278,14 @@ Windows V1(bigger install, but simpler instruction + try instruction for gpu)
In the USERPROFILE directory
you should create a configuration file .theanorc with the following
two lines:

.. code-block:: bash

    [blas]
    ldflags =

(Create a text file in Windows Explorer and rename it from the DOS
box: "ren xyz.txt .theanorc".)
Spaces and non-ASCII characters are not always supported. If that is your case,
set the environment variable 'THEANO_FLAGS' to the value 'blas.ldflags='
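The two options described here (a ``.theanorc`` file with a ``[blas]`` section, or the ``THEANO_FLAGS`` environment variable) can be sketched in Python. This is only a minimal sketch under stated assumptions: the helper names ``write_theanorc`` and ``set_blas_flag_env`` are hypothetical and not part of Theano, and the environment variable is assumed to be set before ``import theano`` runs in the same process.

```python
import os


def write_theanorc(directory, ldflags=""):
    """Write a two-line .theanorc ([blas] section, ldflags key) into directory.

    Hypothetical helper for illustration; it overwrites an existing file.
    """
    path = os.path.join(directory, ".theanorc")
    # rstrip() avoids a trailing space when ldflags is empty
    lines = ["[blas]", ("ldflags = " + ldflags).rstrip()]
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return path


def set_blas_flag_env(ldflags=""):
    """Alternative: set THEANO_FLAGS for the current process instead of a file."""
    os.environ["THEANO_FLAGS"] = "blas.ldflags=" + ldflags
    return os.environ["THEANO_FLAGS"]


# Example call, commented out so that no existing file is overwritten by accident:
# write_theanorc(os.environ["USERPROFILE"])
```

Writing the file from Python also sidesteps the renaming step in Windows Explorer, since the file is created directly under its final name.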
...
@@ -313,14 +317,16 @@ Windows V1(bigger install, but simpler instruction + try instruction for gpu)
e) Modify the .theanorc file: ldflags = -lgoto2
This setting can also be changed in Python for testing purposes:
theano.config.blas.ldflags = ...
- (Optional). To test the BLAS performance, you can run the script check_blas.py.
For comparison I also downloaded and compiled the unoptimized standard
BLAS. The results were the following (Intel Core2 Duo 1.86 GHz):
Standard BLAS: 166 sec (unoptimized, 1 thread)
NumPy: 48 sec (1 thread)
Goto2: 16 sec (2 threads)
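To make the differences between these timings easier to compare, the speedups relative to the standard BLAS can be worked out with a few lines of Python. This is just arithmetic on the three numbers reported above; the function name ``speedups`` is made up for this sketch, and the Goto2 figure is not a per-thread comparison because it used 2 threads.

```python
# Timings reported above, in seconds (Intel Core2 Duo 1.86 GHz).
timings_sec = {
    "Standard BLAS": 166.0,  # unoptimized, 1 thread
    "NumPy": 48.0,           # 1 thread
    "Goto2": 16.0,           # 2 threads
}


def speedups(timings, baseline):
    """Return how many times faster each entry is than the baseline entry."""
    base = timings[baseline]
    return {name: base / seconds for name, seconds in timings.items()}


ratios = speedups(timings_sec, "Standard BLAS")
for name in sorted(ratios, key=ratios.get):
    print("%-14s %5.1fx faster than Standard BLAS" % (name, ratios[name]))
```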
Conclusions:
a) The unoptimized standard BLAS is very slow. Don't use it.
b) The Windows binaries of NumPy were compiled with ATLAS and are surprisingly fast.
...
@@ -348,7 +354,7 @@ Windows V1(bigger install, but simpler instruction + try instruction for gpu)
Then run the Theano CUDA test file. In the Windows command line (cmd.exe), run the program nosetests inside the Theano repository. nosetests is installed by Python(x,y).
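The same test run can also be started from Python rather than typed into cmd.exe. The following is a rough sketch, assuming ``nosetests`` is on the ``PATH`` (as it should be after installing Python(x,y)); the helper names ``nosetests_command`` and ``run_nosetests`` and the ``repo_dir`` argument are hypothetical, and no particular test path is assumed.

```python
import subprocess


def nosetests_command(extra_args=()):
    """Build the command line; extra_args are passed through unchanged."""
    return ["nosetests"] + list(extra_args)


def run_nosetests(repo_dir, extra_args=()):
    """Run nosetests with repo_dir as the working directory.

    Returns the exit code of the nosetests process (0 means all tests passed).
    Raises OSError if the nosetests program cannot be found.
    """
    return subprocess.call(nosetests_command(extra_args), cwd=repo_dir)


# Example call, with the path left for the reader to fill in:
# exit_code = run_nosetests(r"path\to\theano")
```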
Windows V2(smaller install, but longer instruction)